Splitting data lumped together in one variable into multiple variables in R



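I'm fairly new to R, so forgive me if I don't use the right terminology. I'm trying to tidy a data frame in which a single variable lumps together data that I want to split into separate variables. Essentially, it looks like this: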

ID    Scores
01    Math: 5, Physics: 4, English: 3  
02    English: 5, Math: 3, Physics: 6.9  
03    Math: 3.75, Chemistry: 4, English: 3  
04    History: 8, Math: 2, Physics: 3

I would like it to look like this:

ID    Math     Chemistry    English     History     Physics         
01    5        NaN          3           NaN         4
02    3        NaN          5           NaN         6.9
03    3.75     4            3           NaN         NaN
04    2        NaN          NaN         8           3

Thanks a lot!

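I would suggest a tidyverse approach using a few tidyr functions. First split Scores at the row level with separate_rows(), so that each subject gets its own row, then split it at the column level with separate() into Subject and Grade. Finally, reshape to wide format with pivot_wider() to get the desired output. Here is the data: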

library(tidyverse)
#Data
df <- structure(list(ID = c(1, 2, 3, 4), Scores = c("Math: 5, Physics: 4, English: 3", 
"English: 5, Math: 3, Physics: 6.9", "Math: 3.75, Chemistry: 4, English: 3", 
"History: 8, Math: 2, Physics: 3")), class = "data.frame", row.names = c(NA, 
-4L))

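Code:

#Code
df %>%
  #One row per "Subject: Grade" pair
  separate_rows(Scores, sep = ',') %>%
  #Remove the whitespace left around each pair
  mutate(Scores = trimws(Scores)) %>%
  #Separate again by : into subject and grade
  separate(Scores, sep = ':', into = c('Subject', 'Grade')) %>%
  #Trim both parts and convert the grade to numeric
  mutate(Subject = trimws(Subject), Grade = as.numeric(trimws(Grade))) %>%
  #Reshape to wide: one column per subject, NA where a subject is missing
  pivot_wider(names_from = Subject, values_from = Grade)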

Output:

# A tibble: 4 x 6
     ID  Math Physics English Chemistry History
  <dbl> <dbl>   <dbl>   <dbl>     <dbl>   <dbl>
1     1  5        4         3        NA      NA
2     2  3        6.9       5        NA      NA
3     3  3.75    NA         3         4      NA
4     4  2        3        NA        NA       8
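
As a variation on the pipeline above, the trimming steps can probably be folded into the separators themselves, since the `sep` argument of both separate_rows() and separate() is treated as a regular expression. The sketch below assumes every entry has exactly the form "Subject: Grade" with entries separated by commas; the name `wide` and the final select() call, which reorders the columns to match the layout requested in the question, are additions for illustration and not part of the code above.

```r
library(tidyr)
library(dplyr)

# Same sample data as above
df <- data.frame(
  ID = c(1, 2, 3, 4),
  Scores = c("Math: 5, Physics: 4, English: 3",
             "English: 5, Math: 3, Physics: 6.9",
             "Math: 3.75, Chemistry: 4, English: 3",
             "History: 8, Math: 2, Physics: 3"),
  stringsAsFactors = FALSE
)

wide <- df %>%
  # One row per "Subject: Grade" pair; the regex also consumes spaces after the comma
  separate_rows(Scores, sep = ",\\s*") %>%
  # Split each pair into two columns; convert = TRUE turns Grade into a number
  separate(Scores, into = c("Subject", "Grade"), sep = ":\\s*", convert = TRUE) %>%
  # One column per subject; missing subjects become NA
  pivot_wider(names_from = Subject, values_from = Grade) %>%
  # Reorder the columns to match the layout asked for in the question
  select(ID, Math, Chemistry, English, History, Physics)

print(wide)
```

If the real data contains stray whitespace or entries without a colon, the explicit trimws() and as.numeric() steps from the pipeline above are likely the safer choice.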
