R - How to return ID of student who used ban word in text message (Updated)

I have a data frame:

ID_Student                    Text_Message
1   John Doe Hell like I want to fxxk around
2 Peter Gynn                 You such an ass
3 Jolie Hope                      Go to hell

And I have a vector:

> Ban_words
[1] "fxxk" "ass"  "hell"

How can I return the ID of each student who used a banned word, together with the word they used? Any good ideas?

My solution so far.

Data

ID_Student <- c("John Doe", "Peter Gynn", "Jolie Hope", "Mike Tyson")
Text_Message <- c("hell I want to fxxk around", "You such an ass", "Go to hell", "I love you")
Ban_words <- c("fxxk", "ass", "hell")
Student_Message <-data.frame(ID_Student,Text_Message)

The data frame should look like this:

ID_Student               Text_Message
1   John Doe hell I want to fxxk around
2 Peter Gynn            You such an ass
3 Jolie Hope                 Go to hell
4 Mike Tyson                 I love you

Code

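library(dplyr)  # provides %>%, filter() and pull()

# For each banned word, print the IDs of the students whose message contains it
for (i in Ban_words) {
  Detention_List <- Student_Message %>%
    filter(grepl(i, Text_Message)) %>%
    pull(ID_Student)
  print(Detention_List)
}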

[1] "John Doe"
[1] "Peter Gynn"
[1] "John Doe"   "Jolie Hope"

So, for the banned word 'fxxk', only John used it. But for the word "hell", both John and Jolie used it.
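The loop above only prints each result, so the link between a word and its offenders is lost once the loop finishes. A minimal sketch of one way to keep that link, assuming base R only: collect the IDs in a list named by banned word. The name `offenders_by_word` is hypothetical, and `fixed = TRUE` is an added assumption that each banned word should be matched as a literal string rather than as a regular expression.

```r
# Data as defined in the question
ID_Student <- c("John Doe", "Peter Gynn", "Jolie Hope", "Mike Tyson")
Text_Message <- c("hell I want to fxxk around", "You such an ass", "Go to hell", "I love you")
Ban_words <- c("fxxk", "ass", "hell")
Student_Message <- data.frame(ID_Student, Text_Message, stringsAsFactors = FALSE)

# For each banned word, keep the IDs whose message contains it.
# fixed = TRUE treats the word as a literal string, not a regex (assumption).
offenders_by_word <- lapply(Ban_words, function(word) {
  Student_Message$ID_Student[grepl(word, Student_Message$Text_Message, fixed = TRUE)]
})

# Name each list element after the word it belongs to (hypothetical naming choice).
names(offenders_by_word) <- Ban_words

print(offenders_by_word)
```

Each element can then be looked up by word, for example `offenders_by_word[["hell"]]`.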

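We can paste all the banned words into a single regular expression with paste(collapse = "|"), then filter with grepl on that regular expression. Then pull the vector of names. As you can see, this returns the names of all students except "Mike", because he did not use a banned word (see my edited data).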

library(dplyr)
df %>% filter (grepl(paste(Ban_words, collapse = '|'), Text_Message)) %>%
pull(student)
[1] "John Doe"   "Peter Gyn"  "Jolie Hope"
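One caveat with a plain alternation such as `fxxk|ass|hell` is that it also matches inside longer words. A hedged sketch of a whole-word variant follows; the `messages` vector is made up for illustration and is not part of the question's data, and `perl = TRUE` is used here so that `\b` is interpreted as a word boundary.

```r
Ban_words <- c("fxxk", "ass", "hell")

# Hypothetical messages, chosen to show substring matches ("class", "hello")
messages <- c("You such an ass", "See you in class", "Say hello", "Go to hell")

# Plain alternation: matches the banned words anywhere, even inside other words
plain_pattern <- paste(Ban_words, collapse = "|")
plain_hits <- grepl(plain_pattern, messages)

# Whole-word variant: wrap the alternation in \b word boundaries
word_pattern <- paste0("\\b(", paste(Ban_words, collapse = "|"), ")\\b")
word_hits <- grepl(word_pattern, messages, perl = TRUE)

print(plain_hits)
print(word_hits)
```

Whether substring matches should count is a policy decision for the data at hand; the whole-word pattern is only one option.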

df<-data.frame(student=c('John Doe', 'Peter Gyn', 'Jolie Hope', 'Mike'), Text_Message=c('I want to fxxk around', 'You such an ass', 'Go to hell', "I love you"))
> df
student          Text_Message
1   John Doe I want to fxxk around
2  Peter Gyn       You such an ass
3 Jolie Hope            Go to hell
4       Mike            I love you
Ban_words<-c("fxxk", "ass",  "hell")
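The combined regular expression tells us who used a banned word but not which one, and the question asks for both. A minimal base R sketch of one possible approach, not the answer's own method: build a logical matrix with one column per banned word, then collapse the matched words per student. The names `hits`, `words_used` and `result` are hypothetical, and `fixed = TRUE` assumes literal matching.

```r
# Data as in the answer above
df <- data.frame(
  student = c("John Doe", "Peter Gyn", "Jolie Hope", "Mike"),
  Text_Message = c("I want to fxxk around", "You such an ass", "Go to hell", "I love you"),
  stringsAsFactors = FALSE
)
Ban_words <- c("fxxk", "ass", "hell")

# Logical matrix: one row per message, one column per banned word
hits <- sapply(Ban_words, function(word) grepl(word, df$Text_Message, fixed = TRUE))

# Collapse the words matched in each row into one comma-separated string
df$words_used <- apply(hits, 1, function(row) paste(Ban_words[row], collapse = ", "))

# Keep only the students with at least one match
result <- df[rowSums(hits) > 0, c("student", "words_used")]
print(result)
```

A student who used several banned words would get them all in `words_used`, separated by commas.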
