I have a data frame with 100 columns containing different strings.
Example:
id Disease1 Disease2 Disease3
01 disease1 NA disease3
02 NA disease2 NA
03 disease1 disease2 NA
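How can I convert the different values in multiple columns to 1, and then mutate a new column that counts the total number of 1s across specific columns (possibly columns 22:65), or, even better, columns selected with starts_with()?
Desired output: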
id Disease1 Disease2 Disease3 Total_diseases
01 1 NA 1 2
02 NA 1 NA 1
03 1 1 NA 2
df <- read.table(textConnection('id Disease1 Disease2 Disease3
01 disease1 NA disease3
02 NA disease2 NA
03 disease1 disease2 NA'),header=T)
library(dplyr)
df %>%
  rowwise() %>%
  mutate(Total_diseases = sum(!is.na(across(Disease1:Disease3)))) %>%
  ungroup()
This uses the across function to check Disease1 through Disease3 for non-missing values.
Output:
id Disease1 Disease2 Disease3 Total_diseases
<int> <fct> <fct> <fct> <int>
1 1 disease1 NA disease3 2
2 2 NA disease2 NA 1
3 3 disease1 disease2 NA 2
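The question asks for starts_with() rather than an explicit column range. A minimal sketch of that variant, assuming the same example data and a dplyr version that provides c_across() (1.0.0 or later); the name counted is only illustrative:

```r
library(dplyr)

# Same example data as in the question
df <- read.table(text = 'id Disease1 Disease2 Disease3
01 disease1 NA disease3
02 NA disease2 NA
03 disease1 disease2 NA', header = TRUE)

counted <- df %>%
  rowwise() %>%
  # c_across() gathers the selected columns of the current row into one vector
  mutate(Total_diseases = sum(!is.na(c_across(starts_with("Disease"))))) %>%
  ungroup()
```

Like the code above, this only counts the non-missing entries per row and leaves the Disease columns unchanged.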
Use across to change the non-empty values to 1, then use rowSums to sum them row-wise.
library(dplyr)
df %>%
  mutate(across(starts_with('Disease'), ~+(. != ''))) %>%
  mutate(Total_disease = rowSums(select(., starts_with('Disease')), na.rm = TRUE))
# id Disease1 Disease2 Disease3 Total_disease
#1 1 1 NA 1 2
#2 2 NA 1 NA 1
#3 3 1 1 NA 2
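The question also mentions selecting columns by position (such as 22:65). A rough sketch of that variant, assuming the same example data; positions 2:4 stand in for 22:65 because the example has only four columns, and the name res is only illustrative:

```r
library(dplyr)

# Same example data as in the question
df <- read.table(text = 'id Disease1 Disease2 Disease3
01 disease1 NA disease3
02 NA disease2 NA
03 disease1 disease2 NA', header = TRUE)

res <- df %>%
  # Positions 2:4 are the Disease columns here; use 22:65 on the real data
  mutate(across(2:4, ~ ifelse(is.na(.x), NA_integer_, 1L))) %>%
  # Sum the 1s row by row, ignoring the NA entries
  mutate(Total_diseases = rowSums(select(., 2:4), na.rm = TRUE))
```

Selecting by position is fragile if the column order changes, so starts_with() is probably the safer choice when the column names share a prefix.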