r - How to convert a dataset in which some subjects selected multiple answers to dummy-variable format



I have this sample dataset:

df <- data.frame(subjects = 1:12,
                 Why_are_you_not_happy = 
                   c(1,2,"1,2,5",5,1,2,"3,4",3,2,"1,5",3,4),
                 why_are_you_sad = 
                   c("1,2,3",1,2,3,"4,5,3",2,1,4,3,1,1,1) )

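I would like to convert it to dummy-variable format (based on the 5 possible answers to each question). Can anyone point me to an efficient way to do this? Thanks.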

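You can use separate_rows to split the multiple-choice answers into one row per choice, convert them to dummy variables, and then group by subjects and summarise (to get one row per subject containing all of their choices).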

library(fastDummies)
library(tidyr)
library(dplyr)

df %>% 
  # one row per selected answer
  separate_rows(Why_are_you_not_happy, why_are_you_sad) %>% 
  # one 0/1 column per observed answer; drop the original columns
  dummy_cols(c("Why_are_you_not_happy", "why_are_you_sad"),
             remove_selected_columns = TRUE) %>% 
  # collapse back to one row per subject
  group_by(subjects) %>% 
  summarise(across(everything(), max))

Output

# A tibble: 12 × 11
   subjects Why_are_you…¹ Why_a…² Why_a…³ Why_a…⁴ Why_a…⁵ why_a…⁶ why_a…⁷ why_a…⁸ why_a…⁹ why_a…˟
      <int>         <int>   <int>   <int>   <int>   <int>   <int>   <int>   <int>   <int>   <int>
 1        1             1       0       0       0       0       1       1       1       0       0
 2        2             0       1       0       0       0       1       0       0       0       0
 3        3             1       1       0       0       1       0       1       0       0       0
 4        4             0       0       0       0       1       0       0       1       0       0
 5        5             1       0       0       0       0       0       0       1       1       1
 6        6             0       1       0       0       0       0       1       0       0       0
 7        7             0       0       1       1       0       1       0       0       0       0
 8        8             0       0       1       0       0       0       0       0       1       0
 9        9             0       1       0       0       0       0       0       1       0       0
10       10             1       0       0       0       1       1       0       0       0       0
11       11             0       0       1       0       0       1       0       0       0       0
12       12             0       0       0       1       0       1       0       0       0       0
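
dummy_cols only creates columns for answer codes that actually occur in the data, so an answer nobody picked would get no column. If you need all 5 columns per question regardless, here is a minimal base R sketch. The helper name to_dummies and its levels argument are my own for illustration, not from any package, and it assumes answers are comma-separated integer codes from 1 to 5.

```r
# Same sample data as in the question (mixed values are coerced to character)
df <- data.frame(
  subjects = 1:12,
  Why_are_you_not_happy = c(1, 2, "1,2,5", 5, 1, 2, "3,4", 3, 2, "1,5", 3, 4),
  why_are_you_sad = c("1,2,3", 1, 2, 3, "4,5,3", 2, 1, 4, 3, 1, 1, 1)
)

# Hypothetical helper: builds one 0/1 column per answer code in `levels`,
# even for codes that no subject selected
to_dummies <- function(x, prefix, levels = 1:5) {
  # split "1,2,5" into c("1", "2", "5") for each subject
  choices <- strsplit(as.character(x), ",", fixed = TRUE)
  # for each subject, flag which of the possible codes were chosen
  out <- t(vapply(
    choices,
    function(ch) as.integer(levels %in% as.integer(trimws(ch))),
    integer(length(levels))
  ))
  colnames(out) <- paste(prefix, levels, sep = "_")
  as.data.frame(out)
}

questions <- c("Why_are_you_not_happy", "why_are_you_sad")
dummies <- lapply(questions, function(q) to_dummies(df[[q]], q))

# one row per subject, with 5 indicator columns per question
result <- cbind(df["subjects"], do.call(cbind, dummies))
print(result)
```

For this sample the result should match the tidyverse output above, since all 5 codes occur in both questions. The difference only shows up when some code is never selected.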
