I would like to mutate a new column named SF_COUNT that holds, for each group (ID), the number of rows where the column type contains "SF".
A reproducible example is as follows:
df <- data.frame(ID = c(1234,1234,1234,4567,4567,4567,4567,8900,8900,8900),type = c('RF','SF','SF','RF','SF','SF','SF','RF','SF','SF'))
My final data frame would look like:
final_df <- data.frame(ID = c(1234,1234,1234,4567,4567,4567,4567,8900,8900,8900),type = c('RF','SF','SF','RF','SF','SF','SF','RF','SF','SF'), SF_COUNT = c(2,2,2,3,3,3,3,2,2,2))
How can I achieve this in dplyr?
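After grouping by 'ID', take the sum of the logical vector (type == 'SF') inside mutate to create the new column. The single value returned by sum is recycled to every row of the group.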
library(dplyr)
df <- df %>%
  group_by(ID) %>%
  mutate(SF_COUNT = sum(type == 'SF', na.rm = TRUE))
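To see why summing works as a count: the comparison type == 'SF' yields a logical vector, and sum treats TRUE as 1 and FALSE as 0. A minimal base R sketch, using the type values of the ID 1234 group from the example data (the variable names is_sf, sf_count and sf_count_na are only for illustration):

```r
# Values of `type` for the group ID == 1234 in the example data
type <- c('RF', 'SF', 'SF')

# Element-wise comparison gives a logical vector: FALSE TRUE TRUE
is_sf <- type == 'SF'

# sum() counts the TRUE values, so this is the number of 'SF' rows
sf_count <- sum(is_sf, na.rm = TRUE)

# With a missing value the comparison gives NA for that element;
# na.rm = TRUE drops it instead of turning the whole sum into NA
sf_count_na <- sum(c('RF', 'SF', NA) == 'SF', na.rm = TRUE)
```

Inside a grouped mutate the same expression is evaluated once per ID, which is what produces a per-group count.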
If 'SF' can occur as a substring of the value rather than as the whole value, use str_detect instead:
library(stringr)
df <- df %>%
  group_by(ID) %>%
  mutate(SF_COUNT = sum(str_detect(type, 'SF'), na.rm = TRUE))
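The two versions only differ when a value contains 'SF' without being exactly 'SF'. A small sketch of that difference, assuming hypothetical values such as 'SF_A' and 'XSF' that do not appear in the question's data:

```r
library(stringr)

# Hypothetical values (not from the question): some contain 'SF' without equalling it
type <- c('RF', 'SF', 'SF_A', 'XSF', NA)

# Exact comparison: only the value that is exactly 'SF' is counted
exact_count <- sum(type == 'SF', na.rm = TRUE)

# Pattern match: 'SF', 'SF_A' and 'XSF' are all counted
pattern_count <- sum(str_detect(type, 'SF'), na.rm = TRUE)

# str_detect() interprets the pattern as a regular expression by default;
# wrapping it in fixed() matches it literally, which matters for patterns
# that contain regex metacharacters such as '.'
literal_count <- sum(str_detect(type, fixed('SF')), na.rm = TRUE)
```

For the data in the question every type value is exactly 'RF' or 'SF', so both approaches should give the same SF_COUNT.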