r - How do I group these variables with dplyr to produce a grouped summary?



Here is the dput output of my data frame, school:

structure(list(Students = c(300L, 1600L, 100L, 90L, 2000L, 200L, 
300L, 340L, 1500L, 500L, 360L, 820L, 150L, 1380L, NA, 360L, 400L, 
1000L, 1600L, 142L, 250L, 2000L), Students_Primary = c(150L, 
NA, 100L, 90L, 800L, NA, NA, 150L, NA, 250L, 220L, 400L, NA, 
750L, NA, NA, NA, 600L, NA, 142L, NA, 500L), Chinese_Spoken = c("Mandarin", 
"Mandarin", "Mandarin", "Mandarin", "Mandarin", "Mandarin", "Mandarin", 
"Mandarin", "Mandarin", "Mandarin", "Cantonese", "Mandarin", 
"Mandarin", "Mandarin", "Mandarin", "Mandarin", "Mandarin", "Mandarin", 
"Mandarin", "Both", "Mandarin", "Both"), Chinese_Written = c("Simplified", 
"Traditional", "Simplified", "Traditional", "Both", "Traditional", 
"Traditional", "Simplified", "Simplified", NA, "Traditional", 
"Both", NA, "Both", "Both", "Simplified", "Both", "Traditional", 
"Traditional", "Traditional", "Simplified", "Both")), class = "data.frame", row.names = c(NA, 
-22L))

I want to summarize how many students use each form of written Chinese, so I tried the following code:

school %>%
  select(Chinese_Written, Students) %>%
  group_by(Chinese_Written) %>%
  arrange(Chinese_Written) %>%
  na.omit()

It returns this:

   Chinese_Written Students
   <chr>              <int>
 1 Both                2000
 2 Both                 820
 3 Both                1380
 4 Both                 400
 5 Both                2000
 6 Simplified           300
 7 Simplified           100
 8 Simplified           340
 9 Simplified          1500
10 Simplified           360
11 Simplified           250
12 Traditional         1600
13 Traditional           90
14 Traditional          200
15 Traditional          300
16 Traditional          360
17 Traditional         1000
18 Traditional         1600
19 Traditional          142

Is there a reason the rows are not being combined? I want all the "Both", "Simplified", and "Traditional" rows grouped together.

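group_by on its own does not change the rows; it only makes the commands that follow operate per group. So after grouping by Chinese_Written, use summarise to sum the Students variable: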
library(dplyr)
school %>%
  group_by(Chinese_Written) %>%
  summarise(Students = sum(Students, na.rm = TRUE))
# A tibble: 4 x 2
  Chinese_Written Students
  <chr>              <int>
1 Both                6600
2 Simplified          2850
3 Traditional         5292
4 NA                   650
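
To make the difference between grouping and summarising concrete, here is a minimal sketch on a small stand-in data frame built from six rows of the question's data. The object names grouped, totals, and totals_short and the n_rows column are illustrative only, and dropping the NA group with filter is an optional choice, not something the question asked for.

```r
library(dplyr)

# Stand-in data: six rows copied from the question's data frame
school <- data.frame(
  Students = c(300L, 1600L, 2000L, 500L, NA, 820L),
  Chinese_Written = c("Simplified", "Traditional", "Both", NA, "Both", "Both")
)

# group_by() only records the grouping; no rows are collapsed yet
grouped <- school %>% group_by(Chinese_Written)
nrow(grouped)      # same number of rows as the input
n_groups(grouped)  # distinct groups, with NA counted as its own group

# summarise() is the step that collapses each group to a single row
totals <- school %>%
  filter(!is.na(Chinese_Written)) %>%   # optional: drop the NA group
  group_by(Chinese_Written) %>%
  summarise(
    Students = sum(Students, na.rm = TRUE),  # ignore missing student counts
    n_rows   = n()                           # rows contributing to each group
  )

# count() with wt = is a shorthand for a grouped sum (missing weights are ignored)
totals_short <- school %>%
  count(Chinese_Written, wt = Students, name = "Students")
```

If the NA group should stay in the result, as in the output above, leave out the filter step. The count shorthand keeps the NA group as well.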