我是R的新手,想知道是否有人能帮我解决一个问题:
我有一个这样的数据帧:
panelst_id | 类型 | refer_smrefer-se | ||
---|---|---|---|---|
1 | HP | |||
1 | HP | 0 | 1 | |
1 | PB | 0 | <1>||
2 | PB | 0 | <1>||
3 | TN | 1 | 0 | |
HP |
这里有一个tidyverse
选项。创建一列1(n
(,然后使用pivot_wider
将names_from
指定为"type",将values_from
指定为"n"one_answers"refer"列,将values_fn
指定为sum
,values_fill
指定为0(在没有组合的情况下返回0-默认情况下返回NA
(
library(dplyr)
library(tidyr)
df1 %>%
mutate(n = 1) %>%
pivot_wider(names_from = type, values_from = c(n, refer_sm, refer_se),
values_fn = sum, values_fill = 0)
-输出
# A tibble: 3 x 10
# panelist_id n_HP n_PB n_TN refer_sm_HP refer_sm_PB refer_sm_TN refer_se_HP refer_se_PB refer_se_TN
# <int> <dbl> <dbl> <dbl> <int> <int> <int> <int> <int> <int>
#1 1 2 1 0 1 0 0 1 1 0
#2 2 0 1 0 0 0 0 0 1 0
#3 3 1 0 1 1 0 1 0 0 0
数据
df1 <- structure(list(panelist_id = c(1L, 1L, 1L, 2L, 3L, 3L), type = c("HP",
"HP", "PB", "PB", "TN", "HP"), refer_sm = c(1L, 0L, 0L, 0L, 1L,
1L), refer_se = c(0L, 1L, 1L, 1L, 0L, 0L)),
class = "data.frame", row.names = c(NA,
-6L))
使用reshape
+aggregate
的基本R选项
reshape(
aggregate(. ~ panelist_id + type, cbind(df, n = 1), sum),
idvar = "panelist_id",
timevar = "type",
direction = "wide"
)
给出
panelist_id refer_sm.HP refer_se.HP n.HP refer_sm.PB refer_se.PB n.PB
1 1 1 1 2 0 1 1
2 3 1 0 1 NA NA NA
4 2 NA NA NA 0 1 1
refer_sm.TN refer_se.TN n.TN
1 NA NA NA
2 1 0 1
4 NA NA NA
或者,我们可以使用xtabs
cbind(
xtabs(~ panelist_id + type, df),
xtabs(refer_sm ~ ., aggregate(refer_sm ~ panelist_id + type, df, sum)),
xtabs(refer_se ~ ., aggregate(refer_se ~ panelist_id + type, df, sum))
)
给出
HP PB TN HP PB TN HP PB TN
1 2 1 0 1 0 0 1 1 0
2 0 1 0 0 0 0 0 1 0
3 1 0 1 1 0 1 0 0 0