r - Find the frequency of events within dplyr groups

I have a grouped df whose groups have different lengths. I want to count the yes/no events within each group. So, if I have the following:

df <- data.frame(group = rep(1:4,times=c(20,10,17,8)),
outcome = rep(c("yes","yes","no","yes","no"),times = 11))

I would like to summarise this so that I can see the frequency of "yes" versus "no" in each group. Something like:

df %>% group_by(group) %>%
summarise(freqyes = (. %>% filter(outcome=="yes") %>% n()) / n(),
freqyes = (. %>% filter(outcome=="no") %>% n()) / n())

However, this doesn't work.

The "yes" and "no" percentages in each group should add up to 100.

Thanks.

We can count the rows for each group/outcome combination, then compute the proportion within each group.

library(dplyr)
df %>% count(group, outcome) %>% group_by(group) %>% mutate(n = n/sum(n) * 100)
#  group outcome   n
#  <int> <fct>   <dbl>
#1     1 no       40  
#2     1 yes      60  
#3     2 no       40  
#4     2 yes      60  
#5     3 no       35.3
#6     3 yes      64.7
#7     4 no       50  
#8     4 yes      50
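
If one column per outcome is preferred, as in the attempt in the question, a minimal sketch is shown below. It relies on the fact that mean() of a logical vector gives the share of TRUE values. The attempt in the question most likely fails because `. %>% ...` builds a function rather than a value, so it cannot be divided by n(). The column name freqno and the helper object res are chosen here for illustration only.

```r
library(dplyr)

# Same example data as in the question
df <- data.frame(group = rep(1:4, times = c(20, 10, 17, 8)),
                 outcome = rep(c("yes", "yes", "no", "yes", "no"), times = 11))

# mean() of a logical vector is the proportion of TRUE values,
# so multiplying by 100 gives the percentage within each group
res <- df %>%
  group_by(group) %>%
  summarise(freqyes = mean(outcome == "yes") * 100,
            freqno  = mean(outcome == "no") * 100)

res
```

This gives one row per group, and freqyes and freqno add up to 100 in every row.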

In base R, we can use table and prop.table.

prop.table(table(df), 1) * 100
#    outcome
#group       no      yes
#    1 40.00000 60.00000
#    2 40.00000 60.00000
#    3 35.29412 64.70588
#    4 50.00000 50.00000
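
The second argument of prop.table is the margin: 1 normalises within rows, which here means within each group. A small sketch to check that each group sums to 100 follows; the object names tab and pct are only illustrative.

```r
# Same example data as in the question
df <- data.frame(group = rep(1:4, times = c(20, 10, 17, 8)),
                 outcome = rep(c("yes", "yes", "no", "yes", "no"), times = 11))

# Cross-tabulate group (rows) against outcome (columns)
tab <- table(df$group, df$outcome)

# margin = 1 divides each cell by its row total
pct <- prop.table(tab, margin = 1) * 100

# Each row (group) should add up to 100
rowSums(pct)
```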
