R数据帧:按组聚合并转换为频率计数



我是R的新手,想知道是否有人能帮我解决一个问题:

我有一个这样的数据帧:

refer_sm<1><1>
panelst_id 类型refer-se
1 HP
1 HP 0 1
1 PB 0
2 PB 0
3 TN 1 0
HP

这里有一个tidyverse选项。创建一列1(n(,然后使用pivot_widernames_from指定为"type",将values_from指定为"n"one_answers"refer"列,将values_fn指定为sumvalues_fill指定为0(在没有组合的情况下返回0-默认情况下返回NA(

library(dplyr)
library(tidyr)
df1 %>%
mutate(n = 1) %>% 
pivot_wider(names_from = type, values_from = c(n, refer_sm, refer_se),
values_fn = sum, values_fill = 0)

-输出

# A tibble: 3 x 10
#  panelist_id  n_HP  n_PB  n_TN refer_sm_HP refer_sm_PB refer_sm_TN refer_se_HP refer_se_PB refer_se_TN
#        <int> <dbl> <dbl> <dbl>       <int>       <int>       <int>       <int>       <int>       <int>
#1           1     2     1     0           1           0           0           1           1           0
#2           2     0     1     0           0           0           0           0           1           0
#3           3     1     0     1           1           0           1           0           0           0

数据

df1 <- structure(list(panelist_id = c(1L, 1L, 1L, 2L, 3L, 3L), type = c("HP", 
"HP", "PB", "PB", "TN", "HP"), refer_sm = c(1L, 0L, 0L, 0L, 1L, 
1L), refer_se = c(0L, 1L, 1L, 1L, 0L, 0L)), 
class = "data.frame", row.names = c(NA, 
-6L))

使用reshape+aggregate的基本R选项

reshape(
aggregate(. ~ panelist_id + type, cbind(df, n = 1), sum),
idvar = "panelist_id",
timevar = "type",
direction = "wide"
)

给出

panelist_id refer_sm.HP refer_se.HP n.HP refer_sm.PB refer_se.PB n.PB
1           1           1           1    2           0           1    1
2           3           1           0    1          NA          NA   NA
4           2          NA          NA   NA           0           1    1
refer_sm.TN refer_se.TN n.TN
1          NA          NA   NA
2           1           0    1
4          NA          NA   NA

或者,我们可以使用xtabs

cbind(
xtabs(~ panelist_id + type, df),
xtabs(refer_sm ~ ., aggregate(refer_sm ~ panelist_id + type, df, sum)),
xtabs(refer_se ~ ., aggregate(refer_se ~ panelist_id + type, df, sum))
)

给出

HP PB TN HP PB TN HP PB TN
1  2  1  0  1  0  0  1  1  0
2  0  1  0  0  0  0  0  1  0
3  1  0  1  1  0  1  0  0  0

最新更新