I am trying to extract data for each product by detecting a specific word in the Comments column, using string detection.
Product <- c("a","a","a","a","a","a","b","b","b","b","c","c","c")
Comments <- c("The product enrolled", " product created", "The product reviewed", " probable sale", "probable sale", "failed",
              "The product enrolled", " probable", "The product failed", " product failed",
              "The product enrolled", " probable", "The product failed")
sales <- data.frame(Product, Comments)
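Using str_detect, I am trying to extract, for every product, the rows that come before the word "probable" appears in Comments as one data frame, the rows containing "probable" as a second data frame, and the rows that come after "probable" as a third data frame.
Expected output
Data frame 1: before "probable"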
Product Comments
a The product enrolled
a product created
a The product reviewed
b The product enrolled
c The product enrolled
Data frame 2: "probable"
a probable sale
a probable sale
b probable
c probable
Data frame 3: after "probable"
b The product failed
c The product failed
Using dplyr (and grepl, since it works well here):
sales$isprobable <- grepl("probable", sales$Comments)
library(dplyr)
# Data frame 1: rows before the first "probable" within each product
sales %>%
  group_by(Product) %>%
  filter(!cumany(isprobable)) %>%
  ungroup()
# # A tibble: 5 x 3
# Product Comments isprobable
# <chr> <chr> <lgl>
# 1 a "The product enrolled" FALSE
# 2 a " product created" FALSE
# 3 a "The product reviewed" FALSE
# 4 b "The product enrolled" FALSE
# 5 c "The product enrolled" FALSE
# Data frame 2: the "probable" rows themselves
sales %>%
  filter(isprobable)
# Product Comments isprobable
# 1 a probable sale TRUE
# 2 a probable sale TRUE
# 3 b probable TRUE
# 4 c probable TRUE
# Data frame 3: the row immediately after a "probable" row within each product
sales %>%
  group_by(Product) %>%
  filter(!isprobable & lag(isprobable)) %>%
  ungroup()
# # A tibble: 3 x 3
# Product Comments isprobable
# <chr> <chr> <lgl>
# 1 a failed FALSE
# 2 b The product failed FALSE
# 3 c The product failed FALSE
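Note that the lag() filter keeps only the single row directly after a "probable" row. If the goal were instead every row after "probable" within a product (an assumption; the expected output in the question lists only one such row per product), a minimal sketch could reuse cumany(). The name after_probable is hypothetical, and stringr::str_detect(Comments, "probable") could be swapped in for grepl("probable", Comments) to the same effect.

```r
library(dplyr)

# Same sample data as in the question
Product <- c("a","a","a","a","a","a","b","b","b","b","c","c","c")
Comments <- c("The product enrolled", " product created", "The product reviewed",
              " probable sale", "probable sale", "failed",
              "The product enrolled", " probable", "The product failed", " product failed",
              "The product enrolled", " probable", "The product failed")
sales <- data.frame(Product, Comments)

# after_probable is a hypothetical name used only for this sketch
after_probable <- sales %>%
  mutate(isprobable = grepl("probable", Comments)) %>%
  group_by(Product) %>%
  # keep rows once "probable" has been seen in the group,
  # excluding the "probable" rows themselves
  filter(cumany(isprobable) & !isprobable) %>%
  ungroup()

after_probable
```

With the sample data this also keeps the " product failed" row for product b, which the lag() version drops, so which variant is appropriate depends on what "after" is meant to cover.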