Remove all punctuation AND the values after it at end of string in R
I have an ID variable that comes from 35 different hospitals, so its format varies, and sometimes the same root ID number carries a secondary line number, e.g. -1, /a, _1 etc.
I want to remove the punctuation, and whatever comes after that punctuation, leaving just the root ID number.
I have currently managed to write out individual lines of code for each different iteration, but I was wondering if there was a more elegant way so that next year when the data comes in I don't need to check for different arrangements?
On someone else's question I found a way to remove brackets and all the text within them, but I can't figure out how to adapt it for my purposes:
df$patid <- gsub("\\s*\\([^\\)]+\\)","",df$patid)
I tried these two lines without success:
df$patid <- gsub("\\[:punct:]s*$","", df$patid)
df$patid <- gsub("\\[:alnum:]s*$","", df$patid)
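A likely reason the two attempts above match nothing: in "\\[:punct:]s*$" the backslash escapes the opening bracket, so the pattern looks for the literal text [:punct:] rather than for a punctuation character. POSIX classes are written inside a bracket expression, as [[:punct:]]. A minimal sketch using three IDs from the example data, assuming the secondary suffix is a single character:

```r
# Three IDs taken from the example data in the question
ids <- c("570548260-2", "8253015N/3", "G140381883_1")

# [[:punct:]] is a bracket expression containing the POSIX punctuation class
has_punct <- grepl("[[:punct:]]", ids)

# Remove one punctuation character plus one trailing alphanumeric character
# (assumption: the suffix is exactly one character long)
root <- sub("[[:punct:]][[:alnum:]]$", "", ids)
print(root)
```

sub() is enough here because the pattern is anchored with $ and can match at most once per string.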
I also tried the clean function, which removed all the punctuation but kept the numbers/characters after it, so that wasn't it.
Example of my current code (not all possible iterations). These lines do work:
df$patid <- gsub("\\-1$", "", df$patid)
df$patid <- gsub("\\-2$", "", df$patid)
df$patid <- gsub("\\-3$", "", df$patid)
df$patid <- gsub("\\-a$", "", df$patid)
df$patid <- gsub("\\-A$", "", df$patid)
df$patid <- gsub("\\-b$", "", df$patid)
df$patid <- gsub("\\-B$", "", df$patid)
df$patid <- gsub("\\b", "", df$patid)
df$patid <- gsub("\\/dd", "", df$patid)
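As a conservative intermediate step, the seven hyphen-suffix lines above can be folded into one call by listing the known suffix characters in a single character class. This is only a sketch: it covers the -1, -2, -3, -a, -A, -b, -B cases and nothing else, and it still has to be extended by hand if a new suffix appears.

```r
# IDs taken from the example data in the question
ids <- c("570548260-2", "570548260-3", "1458629P-2", "MB-13-169454")

# A hyphen followed by exactly one of the listed characters, at end of string
root <- sub("-[123aAbB]$", "", ids)
print(root)
```

The last ID is left unchanged because its final hyphen is followed by six digits, not by a single listed character.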
I am not tied to gsub and am open to other methods.
Example of ID numbers:
patid <- c("MB-13-169454", "MB-13-179455", "MB-13-212235.1", "MB-13-212235.2", "MB-13-224683", "570548260-2", "570548260-3", "1458629P-2", "1139093D-2", "8253015N/2", "8253015N/3", "M255858/1", "M255858/2", "8494392Q/2", "9296741B/2", "04152341421/A", "04152341421/B", "04152640475/B", "04152821164/A", "G140381883_1", "G140381883_2", "G140880774_1", "G140880774_2")
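One possible general approach, sketched against the example vector: strip a final punctuation character only when it is followed by a short alphanumeric suffix. The limit of one or two characters is an assumption made for this sketch, not something stated in the data. It is what keeps roots such as "MB-13-169454" intact, since their last hyphen is followed by six digits. A root that itself ends in punctuation plus one or two characters would be truncated, so the limit should be checked against real data.

```r
patid <- c("MB-13-169454", "MB-13-179455", "MB-13-212235.1", "MB-13-212235.2",
           "MB-13-224683", "570548260-2", "570548260-3", "1458629P-2",
           "1139093D-2", "8253015N/2", "8253015N/3", "M255858/1", "M255858/2",
           "8494392Q/2", "9296741B/2", "04152341421/A", "04152341421/B",
           "04152640475/B", "04152821164/A", "G140381883_1", "G140381883_2",
           "G140880774_1", "G140880774_2")

# Punctuation, then 1 or 2 alphanumeric characters, then end of string.
# The {1,2} bound is an assumed maximum suffix length.
root <- sub("[[:punct:]][[:alnum:]]{1,2}$", "", patid)

# Compare original and stripped IDs side by side
print(data.frame(patid, root))
```

With a data frame, the same call would be applied as df$patid <- sub("[[:punct:]][[:alnum:]]{1,2}$", "", df$patid).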
Apologies if this has been answered somewhere already
Tags: r, regex, string-substitution
Source of the question:
https://stackoverflow.com/questions/7...
Question and source license information:
https://meta.stackexchange.com/help/l...
https://stackoverflow.com/