I have a dataset like this:
ID  Winter  Spring  Summer  Fall
1   high    NA      high    low
2   low     high    NA      low
3   low     NA      NA      low
4   low     high    NA      low
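We can use if_any from dplyr, which returns TRUE for a row when the condition holds in any of the selected columns (here, every column except ID). The unary + converts the logical result to 1/0: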
library(dplyr)
df1 <- df1 %>%
mutate(calculated_column = +(if_any(-ID, ~ . %in% 'high')))
Output:
df1
ID Winter Spring Summer Fall calculated_column
1 1 high <NA> high low 1
2 2 low high <NA> low 1
3 3 low <NA> <NA> low 0
4 4 low high <NA> low 1
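The %in% in the call above is presumably there because of the NA values: comparing NA with == gives NA, whereas %in% gives FALSE. A minimal sketch on a made-up vector x (not part of the original data):

```r
# x is a hypothetical vector used only to illustrate NA handling
x <- c("high", NA, "low")

eq_result <- x == "high"    # the NA element stays NA under ==
in_result <- x %in% "high"  # %in% never returns NA

print(eq_result)
print(in_result)
```

With ==, a row that holds only "low" and NA values could end up as NA rather than 0, which is why %in% is the safer choice here.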
Or, if we want to use base R, apply rowSums to the logical matrix produced by the comparison:
df1$calculated_column <- +(rowSums(df1[-1] == "high", na.rm = TRUE) > 0)
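To make the rowSums approach easier to follow, it can be split into its intermediate steps. This is only a sketch: the names is_high, n_high and flag are introduced here for illustration, and the data frame is rebuilt with data.frame() so the block runs on its own.

```r
# Same values as the example data, rebuilt here so the sketch is self-contained
df1 <- data.frame(
  ID     = 1:4,
  Winter = c("high", "low", "low", "low"),
  Spring = c(NA, "high", NA, "high"),
  Summer = c("high", NA, NA, NA),
  Fall   = c("low", "low", "low", "low")
)

is_high <- df1[-1] == "high"               # logical matrix; NA where the cell is NA
n_high  <- rowSums(is_high, na.rm = TRUE)  # count of "high" cells per row, NAs skipped
flag    <- +(n_high > 0)                   # unary + turns TRUE/FALSE into 1/0

print(n_high)
print(flag)
```

Without na.rm = TRUE, any row containing an NA would get an NA sum, so the flag for that row would also be NA.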
Data:
df1 <- structure(list(ID = 1:4, Winter = c("high", "low", "low", "low"
), Spring = c(NA, "high", NA, "high"), Summer = c("high", NA,
NA, NA), Fall = c("low", "low", "low", "low")),
class = "data.frame", row.names = c(NA,
-4L))
You can also do this:
df1$calculated_column = +grepl('high', do.call(paste, df1))
df1
ID Winter Spring Summer Fall calculated_column
1 1 high <NA> high low 1
2 2 low high <NA> low 1
3 3 low <NA> <NA> low 0
4 4 low high <NA> low 1
Here is a base R solution:
calculated_column = (apply(df1,1,function(x) sum(grepl("high",x)))>0)*1
cbind(df1, calculated_column)
ID Winter Spring Summer Fall calculated_column
1 1 high <NA> high low 1
2 2 low high <NA> low 1
3 3 low <NA> <NA> low 0
4 4 low high <NA> low 1
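One caveat for the grepl-based answers: grepl("high", ...) matches "high" anywhere inside a string. That makes no difference for this data, but it would if a cell could hold a longer value containing "high". A small sketch, assuming a hypothetical value "higher" that does not occur in the original data:

```r
# "higher" is a hypothetical value, not present in the original data
vals <- c("high", "higher", "low", NA)

substring_match <- grepl("high", vals)    # matches anywhere in the string
exact_match     <- grepl("^high$", vals)  # anchored: the whole string must be "high"

print(substring_match)
print(exact_match)
```

The anchored pattern applies when each cell is tested on its own, as in the apply version. The pasted-row version would need a different pattern (for example a word-boundary one), since each row becomes a single string.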