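I've seen several ways to pull data and create counts by group, but what I want to do is a bit more complicated. I have a dataset similar to the one below: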
d <- data.frame(ID=c("1ef","3ic","9sd"),
CI_Region=c("Bay Area","North Sierra","Central Valley"),
Q18_429=c("Not a threat","Slightly serious","Very Serious"),
Q18_430=c("Extremely serious","Somewhat serious","Slightly serious"),
Q18_431=c("Slightly serious","Unknown","No Answer"))
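I want to group by CI_Region and then, for each question, count how many times each response occurs (e.g. "Not a threat", "Slightly serious", etc.).
The end result should be a table showing the response-category counts by question and CI region, so I would be able to see, for example, Bay Area - Q18_429 - Not a threat = 1.
Thanks in advance!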
Reshaping the data into a tidier (long) format, with one row per respondent and question, makes this easier.
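library(tidyr)
library(dplyr)  # provides %>%, group_by() and tally()
# Stack the question columns into question/response pairs, then count each combination
gather(d, question, response, -ID, -CI_Region) %>%
  group_by(CI_Region, question, response) %>%
  tally()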
CI_Region        question   response            n
(fctr)           (fctr)     (chr)               (int)
Bay Area         Q18_429    Not a threat        1
Bay Area         Q18_430    Extremely serious   1
Bay Area         Q18_431    Slightly serious    1
Central Valley   Q18_429    Very Serious        1
Central Valley   Q18_430    Slightly serious    1
Central Valley   Q18_431    No Answer           1
North Sierra     Q18_429    Slightly serious    1
North Sierra     Q18_430    Somewhat serious    1
North Sierra     Q18_431    Unknown             1
Is this what you're after?
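In case it helps, here is a minimal sketch of the same reshape-then-count idea using pivot_longer() and count(). It assumes tidyr 1.0.0 or later (where pivot_longer() is available) and that all of your question columns start with "Q18_". The names `long` and `counts` are just illustrative, not anything from your data.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question
d <- data.frame(ID = c("1ef", "3ic", "9sd"),
                CI_Region = c("Bay Area", "North Sierra", "Central Valley"),
                Q18_429 = c("Not a threat", "Slightly serious", "Very Serious"),
                Q18_430 = c("Extremely serious", "Somewhat serious", "Slightly serious"),
                Q18_431 = c("Slightly serious", "Unknown", "No Answer"))

# Long format: one row per respondent per question
# (assumes every question column is prefixed with "Q18_")
long <- pivot_longer(d, cols = starts_with("Q18_"),
                     names_to = "question", values_to = "response")

# count() is shorthand for group_by() followed by tally()
counts <- count(long, CI_Region, question, response)
counts
```

With real data, where several respondents share a region, the `n` column would go above 1 for any response that appears more than once in a given region and question.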