I have a list of comments imported into R. Here are some examples of how the comments come in:
9. This is some string number 1
9This is some string number 2
9 This is some string number 3
9-This is some string number 4
67-68 This is some string number 5
Note that I have saved the comments in a variable called some_str.
My goal is to print each line without the number at the start of the line, like this:
This is some string number 1
This is some string number 2
This is some string number 3
This is some string number 4
This is some string number 5
I have used the code below to handle the first line above (9. This is some string number 1):
pattern = "([0-9][.][ ])"
str_replace(some_str, pattern, "")
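Output: This is some string number 1
However, I am having trouble matching and removing the prefix on the other lines. For example, if I create the pattern "9T" for the second line, how do I remove only the digit 9?
One last thing to note: I am trying to remove only the numbers at the beginning of a comment. For example, if line 3 had the following comment: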
"9 This is some string number 2. 2 dogs came to town"
I only want to remove the 9 at the beginning of the comment. I do not want to remove the 2 after the period.
Another solution:
library(tidyverse)
dat <- data.frame(x = c("67,68 This is my test",
                        "67-68 This is my test",
                        "8 This is my test"))
dat %>%
  mutate(x2 = str_replace(x, pattern = "^[^A-Z]*", ""))
It gives:
                      x              x2
1 67,68 This is my test This is my test
2 67-68 This is my test This is my test
3     8 This is my test This is my test
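One caveat, as far as I can tell: `^[^A-Z]*` deletes everything up to the first uppercase letter, so it relies on every comment starting with a capital. Below is a minimal sketch of that behaviour. The sample strings and variable names are made up for illustration, and base R `sub()` with `perl = TRUE` is swapped in for `str_replace()` only to avoid the package dependency.

```r
# Hypothetical comments; the second one starts with a lowercase letter.
comments <- c("67-68 This is my test", "8 this is my Test")

# Drops every leading character that is not an uppercase A-Z,
# so the lowercase comment loses most of its text.
by_uppercase <- sub("^[^A-Z]*", "", comments, perl = TRUE)

# Narrower sketch (an assumption about what counts as a prefix):
# a leading digit, then any run of digits, commas, periods, spaces or hyphens.
by_prefix <- sub("^[0-9][0-9,. -]*", "", comments, perl = TRUE)

print(by_uppercase)
print(by_prefix)
```

If all comments are known to start with a capital letter, the original pattern is the shorter choice; the narrower class is only worth it when that is not guaranteed.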
Here is a base R solution.
The pattern used is
pattern <- "^[-[:digit:][:punct:][:space:]]*"
It works for all the posted test cases.
sub(pattern, "", x)
#[1] "This is some string number 1" "This is some string number 2"
#[3] "This is some string number 3" "This is some string number 4"
#[5] "This is some string number 5"
The same regular expression works for the last string:
sub(pattern, "", y)
#[1] "This is some string number 2. 2 dogs came to town"
A solution with the stringr package could be
library(stringr)
str_remove(x, pattern)
str_remove(y, pattern)
Data
x <- scan(what = character(), text = "
9. This is some string number 1
9This is some string number 2
9 This is some string number 3
9-This is some string number 4
67-68 This is some string number 5
", sep = "\n")
y <- "9 This is some string number 2. 2 dogs came to town"
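The `2` after the period in `y` survives because `sub()` replaces only the first match and `^` pins that match to the start of the string. A small sketch contrasting this with an unanchored `gsub()`; the names `anchored` and `unanchored` are just for illustration:

```r
pattern <- "^[-[:digit:][:punct:][:space:]]*"
y <- "9 This is some string number 2. 2 dogs came to town"

# Anchored: only the leading "9 " is removed.
anchored <- sub(pattern, "", y)

# Unanchored gsub, for comparison only: strips digits, punctuation
# and whitespace everywhere in the string.
unanchored <- gsub("[-[:digit:][:punct:][:space:]]+", "", y)

print(anchored)
print(unanchored)
```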
We can use sub
sub("^[-0-9. ]+", "", v1)
#[1] "This is some string number 1" "This is some string number 2" "This is some string number 3" "This is some string number 4"
#[5] "This is some string number 5"
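This character class lists the allowed prefix characters explicitly, so a separator that is not in the class stops the match. A brief sketch with a comma-separated range like the one in the other answer; the vector `v2` and the extended pattern are illustrative assumptions, not part of this answer:

```r
# The first string is from the question; the second uses a comma-separated range.
v2 <- c("67-68 This is some string number 5", "67,68 This is my test")

# The class has no comma, so for the second string the match stops after "67".
original <- sub("^[-0-9. ]+", "", v2)

# Adding a comma to the class (assuming commas should also be stripped).
extended <- sub("^[-0-9., ]+", "", v2)

print(original)
print(extended)
```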
Data
v1 <- c("9. This is some string number 1", "9This is some string number 2",
"9 This is some string number 3", "9-This is some string number 4",
"67-68 This is some string number 5")
stringr::str_extract("9. This is some string number 1 2. 2 dogs came to town", "^([0-9][.][ ])")
This should work.
Just change your pattern to:
^([0-9][.][ ])
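As far as I can tell, anchoring with `^` keeps the later "2. " untouched, but a pattern of this shape still only covers the "9. " form from the first line. A quick check on the question's strings, using base R `sub()` in place of `str_replace()` only to avoid the package dependency:

```r
some_str <- c("9. This is some string number 1",
              "9This is some string number 2",
              "9 This is some string number 3",
              "9-This is some string number 4",
              "67-68 This is some string number 5")

# One digit, a literal period, one space, all at the start of the string.
pattern <- "^([0-9][.][ ])"

# Strings that do not start with that exact shape are returned unchanged.
result <- sub(pattern, "", some_str)
print(result)
```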