R - Replace multiple different strings in different columns with 1 and mutate a new count column with dplyr



I have a data frame with 100 columns that contain different strings.

Example:

id  Disease1 Disease2 Disease3
01  disease1  NA      disease3
02  NA       disease2 NA
03  disease1 disease2 NA

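How can I convert the different values in multiple columns to 1, and then mutate a new column that counts the total number of 1s in specific columns (perhaps columns 22:65), or even better, in columns selected with starts_with()?

Desired output: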

id  Disease1 Disease2 Disease3 Total_diseases
01  1        NA       1        2
02  NA       1        NA       1
03  1        1        NA       2
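
# Recreate the example data; the bare NA tokens are read as missing values
df <- read.table(textConnection('id  Disease1 Disease2 Disease3
01  disease1  NA      disease3
02  NA       disease2 NA
03  disease1 disease2 NA'), header = TRUE)

library(dplyr)

df %>%
  rowwise() %>%
  # Count the non-missing entries in each row
  mutate(Total_diseases = sum(!is.na(across(Disease1:Disease3)))) %>%
  ungroup()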

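across() selects the columns from Disease1 through Disease3; because of rowwise(), the non-NA values are counted separately for each row.

Output: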

     id Disease1 Disease2 Disease3 Total_diseases
  <int> <fct>    <fct>    <fct>             <int>
1     1 disease1 NA       disease3              2
2     2 NA       disease2 NA                    1
3     3 disease1 disease2 NA                    2
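
The question also asks about starts_with(), and rowwise() may be slow on a data frame with many rows. A minimal sketch of a vectorized variant follows, assuming every relevant column name begins with "Disease" as in the example; the name `res` is only illustrative.

```r
library(dplyr)

# Same toy data as in the question; keep the columns as character
df <- read.table(text = "id Disease1 Disease2 Disease3
01 disease1 NA disease3
02 NA disease2 NA
03 disease1 disease2 NA", header = TRUE, stringsAsFactors = FALSE)

res <- df %>%
  # is.na() on the selected columns gives a logical matrix;
  # rowSums() then counts the non-missing cells in each row
  mutate(Total_diseases = rowSums(!is.na(select(., starts_with("Disease")))))

res$Total_diseases
# 2 1 2
```

This only adds the count and leaves the original strings in place.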

Use across() to change the non-empty values to 1, then use rowSums() to sum them row-wise.

library(dplyr)

df %>%
  # Non-empty strings become 1; NA stays NA
  mutate(across(starts_with('Disease'), ~ +(. != ''))) %>%
  # Sum the 1s in each row, ignoring NA
  mutate(Total_disease = rowSums(select(., starts_with('Disease')), na.rm = TRUE))
#  id Disease1 Disease2 Disease3 Total_disease
#1  1        1       NA        1             2
#2  2       NA        1       NA             1
#3  3        1        1       NA             2
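
If the columns have to be chosen by position, as with the 22:65 range mentioned in the question, integer positions can be used in place of starts_with(). The sketch below is an illustration under assumptions: it uses positions 2:4 because the toy data has only four columns, and it treats NA as the only marker of "no disease". The name `res` is hypothetical.

```r
library(dplyr)

# Same toy data as in the question; keep the columns as character
df <- read.table(text = "id Disease1 Disease2 Disease3
01 disease1 NA disease3
02 NA disease2 NA
03 disease1 disease2 NA", header = TRUE, stringsAsFactors = FALSE)

res <- df %>%
  mutate(
    # Count the non-missing cells in columns 2 to 4 of the original data
    Total_diseases = rowSums(!is.na(select(., 2:4))),
    # Replace every non-missing value in columns 2 to 4 with 1; NA stays NA
    across(2:4, ~ ifelse(is.na(.x), NA_integer_, 1L))
  )
```

On the real data, 2:4 would be replaced with 22:65. Positions are fragile if columns are later reordered, so a name-based selector is usually the safer choice when the names allow it.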
