Select rows based on a string pattern in R



Suppose I have the following data:

df <- data.frame(name = c("TO for", "Turnover for people", "HC people", 
"Hello world", "beenie man", 
"apple", "pears", "TO is"),
number = c(1, 2, 3, 4, 5, 6, 7, 8))

I want to filter df by a string pattern: keep a row if its name column starts with one of c("TO", "Turnover", "HC"), otherwise remove it.

The following code gives me a warning:

library(data.table)
test <- df[df$name %like% c("TO", "Turnover", "HC"), ]

Console output:

Warning message:
In grepl(pattern, vector, ignore.case = ignore.case, fixed = fixed) :
argument 'pattern' has length > 1 and only the first element will be used

The expected output should look like this:

# name                   number
# TO for                   1
# Turnover for people      2
# HC people                3
# TO is                    8   

Is there another way to do this?

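%like% is not vectorized over the pattern: it passes the pattern to grepl, which uses only the first element when given more than one. One option is to loop over the pattern vector and Reduce the results to a single logical vector: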

i1 <- Reduce(`|`, lapply(c("TO", "Turnover", "HC"), `%like%`, vector = df$name))
df[i1,]
#                 name number
#1              TO for      1
#2 Turnover for people      2
#3           HC people      3
#8               TO is      8
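
To make the Reduce step easier to follow, here is a minimal base R sketch of the same idea, with grepl standing in for %like% so that data.table is not required. The names patterns, hits and keep are illustrative only and do not come from the code above.

```r
# Base R only; grepl is used here in place of %like%.
df <- data.frame(name = c("TO for", "Turnover for people", "HC people",
                          "Hello world", "beenie man",
                          "apple", "pears", "TO is"),
                 number = c(1, 2, 3, 4, 5, 6, 7, 8))

patterns <- c("TO", "Turnover", "HC")  # illustrative name

# One logical vector per pattern, each as long as df$name.
hits <- lapply(patterns, grepl, x = df$name)

# Combine the per-pattern vectors element-wise with OR.
keep <- Reduce(`|`, hits)

df[keep, ]
```

Each element of hits answers "does this pattern occur in each name?", and Reduce collapses them so that a row is kept if any pattern matched.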

Alternatively, this can be done with grepl by collapsing the vector into a single string separated by |:

pat <- paste(c("TO", "Turnover", "HC"), collapse= "|")
df[grepl(pat, df$name),]
#                 name number
#1              TO for      1
#2 Turnover for people      2
#3           HC people      3
#8               TO is      8

The same collapsed pattern also works with %like%:

df[df$name %like% pat,]
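
Note that the patterns above match anywhere in the string, while the question asks for names that start with one of the prefixes. The sample data gives the same result either way, but if that distinction matters, a sketch along these lines anchors the alternation with ^. The names prefixes and pat_start, and the test string "PHC report", are made up for illustration.

```r
df <- data.frame(name = c("TO for", "Turnover for people", "HC people",
                          "Hello world", "beenie man",
                          "apple", "pears", "TO is"),
                 number = c(1, 2, 3, 4, 5, 6, 7, 8))

prefixes <- c("TO", "Turnover", "HC")  # illustrative name

# Builds "^(TO|Turnover|HC)": ^ ties the whole alternation to the string start.
pat_start <- paste0("^(", paste(prefixes, collapse = "|"), ")")

df[grepl(pat_start, df$name), ]

# The unanchored pattern also matches "HC" in the middle of a string.
grepl("TO|Turnover|HC", "PHC report")  # TRUE
grepl(pat_start, "PHC report")         # FALSE
```

The parentheses matter: without them, ^ would apply only to the first alternative.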
