R - Count the number of keywords from a list



I have a dataset named colors, and I would like to find certain keywords in it based on a list I created (coloryellow, colorblue, colorwhite). Here is an example of the dataset:

USER   MESSAGE
23456  The coloryellow is very bright!
31245  Most girls like colorpink
99999  I am having a break
9877   The coloryellow is like the sun

Here are a few things you can do.

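# Keywords to look for
some_colours <- c("colouryellow", "colourblue", "colourwhite")
# Build one regex with word boundaries; in an R string the backslash must be doubled
some_col_regex <- paste0("\\b(", paste(some_colours, collapse = "|"), ")\\b")
# Does each message contain at least one keyword?
grepl(some_col_regex, colour$MESSAGE)
# [1]  TRUE FALSE FALSE  TRUE
# How many keyword matches are there in each message?
lengths(regmatches(colour$MESSAGE, gregexpr(some_col_regex, colour$MESSAGE)))
# [1] 1 0 0 1
# Total occurrences of each keyword that was actually found
table(unlist(regmatches(colour$MESSAGE, gregexpr(some_col_regex, colour$MESSAGE))))
# colouryellow 
#            2 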
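
The table() call only reports keywords that occur at least once. If you also want zero counts for keywords that never appear, one possible sketch is to convert the matches to a factor whose levels are the full keyword list. The name keyword_counts is an assumption made for illustration, and the data frame is rebuilt here only so the snippet runs on its own:

```r
# Example data with the same structure as the data used in this answer
colour <- data.frame(
  USER = c(23456L, 31245L, 99999L, 9877L),
  MESSAGE = c("The colouryellow is very bright!",
              "Most girls like colourpink",
              "I am having a break",
              "The colouryellow is like the sun"),
  stringsAsFactors = FALSE
)

some_colours <- c("colouryellow", "colourblue", "colourwhite")
some_col_regex <- paste0("\\b(", paste(some_colours, collapse = "|"), ")\\b")

# Collect every match across all messages into one character vector
all_matches <- unlist(regmatches(colour$MESSAGE,
                                 gregexpr(some_col_regex, colour$MESSAGE)))

# keyword_counts is a hypothetical name: tabulating a factor with fixed levels
# keeps keywords that were never matched, with a count of 0
keyword_counts <- table(factor(all_matches, levels = some_colours))
print(keyword_counts)
```

Fixing the factor levels keeps the result the same length and order as some_colours, which makes it easier to compare counts across different datasets.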

Data
colour <- structure(list(USER = c(23456L, 31245L, 99999L, 9877L), MESSAGE = c("The colouryellow is very bright!", "Most girls like colourpink", "I am having a break", "The colouryellow is like the sun")), class = "data.frame", row.names = c(NA, -4L))
