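I have a data frame with 5 columns, but I am only interested in one of them, "Conditions". I need a way to count how many times each specific entry occurs in that column. Each cell can hold one or more entries, separated by commas (,). My data frame looks like this: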
S.NO Conditions
11 Eye Color
12 Sound of your voice
13 Certain disease,Size of a palm,Eye Color
16 Eye Color,Hair color
17 Hair color,Height
18 Sound of your voice,Height
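I want to count all the distinct entries/strings in one go. In total there are 35 distinct strings in the Conditions column, and I would like my output to look like this: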
Eye color Sound of your voice Certain disease Size of a palm Hair color Height
3 2 1 1 2 2
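Since I don't know the exact structure of your data, I am assuming it looks like this.
Data:
library(tibble)  # provides tribble()
data <- tribble(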
~Conditions, ~value,
'Eye color', '3',
'Sound of your voice', '2',
'Certain disease, Size of a palm, Eye color', '1,1,2',
'Eye color, Hair color', '2,2',
'Hair color, Height', '3,1',
'Sound of your voice, Height', '1,4'
)
For the data above, the following code produces a wide table with one column per condition:
library(tidyverse)
# Split the comma-separated cells into one element per entry
Conditions <- unlist(strsplit(data$Conditions, ','))
value <- unlist(strsplit(data$value, ','))
df <- tibble(Conditions = Conditions, value = value) %>%
  # Strip the spaces left after each comma, then convert value to numeric
  mutate(across(everything(), ~trimws(.x)), value = as.numeric(value)) %>%
  # Keep the first row for each condition
  arrange(Conditions) %>% group_by(Conditions) %>% slice_head(n = 1) %>%
  mutate(row = row_number()) %>%
  # Spread the conditions into one column each
  pivot_wider(names_from = Conditions, values_from = value)
Output:
# A tibble: 1 × 7
    row `Certain disease` `Eye color` `Hair color` Height `Size of a palm` `Sound of your voice`
  <int>             <dbl>       <dbl>        <dbl>  <dbl>            <dbl>                 <dbl>
1     1                 1           3            2      1                1                     2
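The code above keeps the first value recorded for each condition, so it depends on the assumed value column. If the goal is only to count how often each entry appears in Conditions, as the question describes, a minimal sketch could look like the following. It assumes the data frame has just the S.NO and Conditions columns shown in the question; the names df, counts and counts_wide are hypothetical and chosen for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical reconstruction of the data frame shown in the question
df <- tibble(
  S.NO = c(11, 12, 13, 16, 17, 18),
  Conditions = c(
    "Eye Color",
    "Sound of your voice",
    "Certain disease,Size of a palm,Eye Color",
    "Eye Color,Hair color",
    "Hair color,Height",
    "Sound of your voice,Height"
  )
)

counts <- df %>%
  # One row per entry; the regex also absorbs spaces around the commas
  separate_rows(Conditions, sep = "\\s*,\\s*") %>%
  # Remove any leading or trailing whitespace that remains
  mutate(Conditions = trimws(Conditions)) %>%
  # Number of occurrences of each distinct entry
  count(Conditions, name = "n")

# Wide layout with one column per condition, as in the requested output
counts_wide <- pivot_wider(counts, names_from = Conditions, values_from = n)
```

The matching is case-sensitive, so "Eye Color" and "Eye color" would be counted separately unless the column is normalised first (for example with tolower()). A base R alternative that should give the same counts is table(trimws(unlist(strsplit(df$Conditions, ",")))).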