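How can I split a string in a df column into multiple columns and then assign the values to the correct variables?
In my example, Q1 to Q4 are the variable names, and "Neutral" etc. are the possible answers.
My problem is mainly with the possible NAs: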
A <- 'Q1:rnNeutralrnQ2:rnTotally DisagreernQ3:rnTotally Agree'
B <- 'Q1:rnNeutralrnQ2:rnNeutralrnQ3:rnNeutral'
C <- 'Q1:rnNeutralrnQ3:rnNeutral'
D <- ''
df <- as.data.frame(cbind(c(A,B,C,D)))
AllAnswers <- c('Neutral','Totally Disagree', 'Totally Agree', 'Neutral', 'Neutral', 'Neutral', 'Neutral', 'Neutral', '', '', '', '')
DesiredDf <- data.frame(matrix(AllAnswers, nrow = 4, ncol = 3, byrow = TRUE))
I would suggest:
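library(dplyr)
library(tidyr)

separate(df, V1, c("a", "b", "c"), sep = 'rn(?=Q)') %>%  # one column per "Qn:rnAnswer" piece
  mutate(id = row_number()) %>%                           # remember the original row
  gather(k, v, -id) %>%                                   # long format: one piece per row
  separate(v, c("v1", "v2"), ":rn") %>%                   # v1 = question, v2 = answer
  select(-k) %>%
  filter(!is.na(v2)) %>%                                  # drop pieces that had no answer
  spread(v1, v2)                                          # back to wide: one column per question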
However, there seems to be a problem with your "DesiredDf"; this is my output:
#  id      Q1               Q2            Q3
#1  1 Neutral Totally Disagree Totally Agree
#2  2 Neutral          Neutral       Neutral
#3  3 Neutral             <NA>       Neutral
In 'df', the third row has no Q2:
# V1
#1 Q1:rnNeutralrnQ2:rnTotally DisagreernQ3:rnTotally Agree
#2 Q1:rnNeutralrnQ2:rnNeutralrnQ3:rnNeutral
#3 Q1:rnNeutralrnQ3:rnNeutral
#4
Or, to keep the rows with an empty V1:
df1 <- df %>% mutate(id = row_number())        # keep every row id, including empty responses
df1 %>%
  separate(V1, c("a", "b", "c"), sep = 'rn(?=Q)') %>%
  gather(k, v, -id) %>%
  separate(v, c("v1", "v2"), ":rn") %>%
  select(-k) %>%
  filter(!is.na(v2)) %>%
  spread(v1, v2) %>%
  right_join(df1 %>% select(id), by = "id")    # re-attach the ids that had no answers
#  id      Q1               Q2            Q3
#1  1 Neutral Totally Disagree Totally Agree
#2  2 Neutral          Neutral       Neutral
#3  3 Neutral             <NA>       Neutral
#4  4    <NA>             <NA>          <NA>
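For what it's worth, gather() and spread() are superseded in current tidyr, so here is a rough sketch of the same idea with pivot_longer() and pivot_wider(). It assumes tidyr >= 1.0 and the sample strings exactly as posted in the question. The names `ids`, `wide`, `piece`, `question` and `answer` are just illustrative choices of mine, not something from the original code.

```r
library(dplyr)
library(tidyr)

# Sample data copied from the question (separators kept exactly as posted).
A <- 'Q1:rnNeutralrnQ2:rnTotally DisagreernQ3:rnTotally Agree'
B <- 'Q1:rnNeutralrnQ2:rnNeutralrnQ3:rnNeutral'
C <- 'Q1:rnNeutralrnQ3:rnNeutral'
D <- ''
df <- data.frame(V1 = c(A, B, C, D), stringsAsFactors = FALSE)

# Every original row id, so that empty responses can be restored at the end.
ids <- data.frame(id = seq_len(nrow(df)))

wide <- df %>%
  mutate(id = row_number()) %>%
  # split into at most three "Qn:rnAnswer" pieces; pad short rows with NA
  separate(V1, c("a", "b", "c"), sep = "rn(?=Q)", fill = "right") %>%
  pivot_longer(c(a, b, c), names_to = "piece", values_to = "v") %>%
  # question label on the left of ":rn", answer on the right
  separate(v, c("question", "answer"), sep = ":rn", fill = "right") %>%
  filter(!is.na(answer)) %>%
  select(id, question, answer) %>%
  # one column per question label, so a missing Q2 becomes NA
  pivot_wider(names_from = question, values_from = answer) %>%
  right_join(ids, by = "id") %>%
  arrange(id)

print(wide)
```

Because the columns are built from the question labels rather than from the position of each piece, the third row should still get its Q3 answer in the Q3 column.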
Based on Nicolas2's answer, the following solution needs less code:
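library(tidyverse)
df %>%
  # split each response into at most three "Qn:rnAnswer" pieces
  separate(V1, c("X1", "X2", "X3"), sep = 'rn(?=Q)') %>%
  # strip the leading "Qn:rn" label; across() replaces the deprecated mutate_at()/funs()
  mutate(across(X1:X3, ~ str_replace(.x, "^Q[[:digit:]]+:rn", "")))
# Note: pieces are assigned by position, so the Q3 answer of row 3 ends up in X2.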
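If the positional assignment is a concern, one possible variation is to pull each answer out by its question label with stringr::str_match(). This is only a sketch under the assumption that every answer is preceded by "Qn:rn" and ends either at the next "rnQ<digit>" or at the end of the string. `extract_answer`, `questions` and `by_name` are hypothetical names introduced here for illustration.

```r
library(stringr)

# Sample data copied from the question (separators kept exactly as posted).
A <- 'Q1:rnNeutralrnQ2:rnTotally DisagreernQ3:rnTotally Agree'
B <- 'Q1:rnNeutralrnQ2:rnNeutralrnQ3:rnNeutral'
C <- 'Q1:rnNeutralrnQ3:rnNeutral'
D <- ''
df <- data.frame(V1 = c(A, B, C, D), stringsAsFactors = FALSE)

# Hypothetical helper: return the answer that follows "<question>:rn",
# or NA when the question does not occur in the string.
extract_answer <- function(x, question) {
  # lazily capture everything up to the next "rnQ<digit>" or the end of the string
  pattern <- paste0(question, ":rn(.*?)(?=rnQ[0-9]|$)")
  str_match(x, pattern)[, 2]
}

questions <- c("Q1", "Q2", "Q3")

# One column per question, filled by label rather than by position.
by_name <- data.frame(
  lapply(setNames(questions, questions), function(q) extract_answer(df$V1, q)),
  stringsAsFactors = FALSE
)

print(by_name)
```

With this approach a missing question simply yields NA in its own column, and the empty fourth row is kept as a row of NAs without needing a join.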