Create a new column based on row-wise data in R



This is a minimal reproducible example, but my actual data is very large, so I can't do this by hand.

id<-1:4
mathpass_1<-c("pass","fail","pass","fail")
mathpass_2<-c("fail","fail","fail","fail")
mathpass_3<-c("fail","fail","pass","pass")
mathpass_4<-c("fail","fail","pass","fail")
math<-data.frame(id,mathpass_1,mathpass_2,mathpass_3,mathpass_4)

So the data looks like this:

> math
id mathpass_1 mathpass_2 mathpass_3 mathpass_4
1  1       pass       fail       fail       fail
2  2       fail       fail       fail       fail
3  3       pass       fail       pass       pass
4  4       fail       fail       pass       fail

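id is the student ID. I want to add one more column (a binary variable): if a student passed at least one exam, they get "pass"; if a student passed none of the exams, they get "fail".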

So I want to create a column "pass" like the one below, which I built by hand:

id<-1:4
mathpass_1<-c("pass","fail","pass","fail")
mathpass_2<-c("fail","fail","fail","fail")
mathpass_3<-c("fail","fail","pass","pass")
mathpass_4<-c("fail","fail","pass","fail")
pass<-c("pass","fail","pass","pass")
math<-data.frame(id,mathpass_1,mathpass_2,mathpass_3,mathpass_4,pass)

> math
id mathpass_1 mathpass_2 mathpass_3 mathpass_4 pass
1  1       pass       fail       fail       fail pass
2  2       fail       fail       fail       fail fail
3  3       pass       fail       pass       pass pass
4  4       fail       fail       pass       fail pass

However, my actual data is very large, so I can't do this manually. How can I do it with code?

Try ifelse(rowSums(math[,-1]=="pass")>0,"pass","fail").
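A minimal sketch of how that one-liner might be applied to the example data. Selecting the exam columns by the `mathpass_` prefix instead of `math[,-1]` is an assumption added here, so that any other non-exam columns in the real data would not be counted; `exam_cols` and `n_passed` are illustrative names.

```r
# Example data from the question
math <- data.frame(
  id = 1:4,
  mathpass_1 = c("pass", "fail", "pass", "fail"),
  mathpass_2 = c("fail", "fail", "fail", "fail"),
  mathpass_3 = c("fail", "fail", "pass", "pass"),
  mathpass_4 = c("fail", "fail", "pass", "fail")
)

# Pick the exam columns by name prefix (assumed naming convention)
exam_cols <- grep("^mathpass_", names(math), value = TRUE)

# Count the "pass" entries in each row; one or more means the student passes
n_passed <- rowSums(math[, exam_cols] == "pass")
math$pass <- unname(ifelse(n_passed > 0, "pass", "fail"))

math
```

Because `rowSums()` works on the whole logical matrix at once, this avoids an explicit loop over rows, which should matter for a large data frame.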

You can try

library(dplyr)
# where() is needed to select columns with a predicate such as is.character
math %>%
  mutate(pass = ifelse(rowSums(across(where(is.character), ~ .x == "pass")) > 0, "pass", "fail"))

  id mathpass_1 mathpass_2 mathpass_3 mathpass_4 pass
1  1       pass       fail       fail       fail pass
2  2       fail       fail       fail       fail fail
3  3       pass       fail       pass       pass pass
4  4       fail       fail       pass       fail pass

id<-1:4
mathpass_1<-c("pass","fail","pass","fail")
mathpass_2<-c("fail","fail","fail","fail")
mathpass_3<-c("fail","fail","pass","pass")
mathpass_4<-c("fail","fail","pass","fail")
pass<-c("pass","fail","pass","pass")
df<-data.frame(id,mathpass_1,mathpass_2,mathpass_3,mathpass_4,pass)
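library(tidyverse)
# max() on strings compares alphabetically, and "pass" sorts after "fail",
# so the row-wise maximum is "pass" whenever at least one exam was passed
df %>%
  rowwise() %>%
  mutate(res = max(c_across(starts_with("mathpass_"))))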
#> # A tibble: 4 × 7
#> # Rowwise: 
#>      id mathpass_1 mathpass_2 mathpass_3 mathpass_4 pass  res  
#>   <int> <chr>      <chr>      <chr>      <chr>      <chr> <chr>
#> 1     1 pass       fail       fail       fail       pass  pass 
#> 2     2 fail       fail       fail       fail       fail  fail 
#> 3     3 pass       fail       pass       pass       pass  pass 
#> 4     4 fail       fail       pass       fail       pass  pass

Created on 2022-06-09 by the reprex package (v2.0.1)
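The `max()` answer relies on "pass" sorting after "fail" alphabetically, and `rowwise()` can be slow on large data. As a hedged alternative sketch, assuming dplyr 1.0.4 or later (where `if_any()` is available), the check can be written so that it tests for "pass" directly and stays vectorized:

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  id = 1:4,
  mathpass_1 = c("pass", "fail", "pass", "fail"),
  mathpass_2 = c("fail", "fail", "fail", "fail"),
  mathpass_3 = c("fail", "fail", "pass", "pass"),
  mathpass_4 = c("fail", "fail", "pass", "fail")
)

# if_any() gives one logical per row: TRUE when any selected column equals "pass"
df <- df %>%
  mutate(res = if_else(if_any(starts_with("mathpass_"), ~ .x == "pass"), "pass", "fail"))

df
```

This version does not depend on how the two labels sort, so it would keep working if the labels were renamed, as long as the comparison string is updated to match.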

Another possible solution, in base R:

math$pass <- apply(math, 1, function(x) if (any(x[-1] == "pass")) "pass" else "fail")
math
#>   id mathpass_1 mathpass_2 mathpass_3 mathpass_4 pass
#> 1  1       pass       fail       fail       fail pass
#> 2  2       fail       fail       fail       fail fail
#> 3  3       pass       fail       pass       pass pass
#> 4  4       fail       fail       pass       fail pass
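Note that `apply()` first converts the whole data frame to a character matrix and then calls the function once per row. A sketch of a column-wise base R variant that skips that conversion is shown below; the `mathpass_` prefix selection and the names `exam_cols` and `any_pass` are assumptions made for illustration.

```r
# Example data from the question
math <- data.frame(
  id = 1:4,
  mathpass_1 = c("pass", "fail", "pass", "fail"),
  mathpass_2 = c("fail", "fail", "fail", "fail"),
  mathpass_3 = c("fail", "fail", "pass", "pass"),
  mathpass_4 = c("fail", "fail", "pass", "fail")
)

# Pick the exam columns by name prefix (assumed naming convention)
exam_cols <- grep("^mathpass_", names(math), value = TRUE)

# Compare each column to "pass", then OR the logical vectors together
any_pass <- Reduce(`|`, lapply(math[exam_cols], function(col) col == "pass"))

math$pass <- ifelse(any_pass, "pass", "fail")
math
```

Since the work is done one column at a time, the number of function calls grows with the number of exam columns and not with the number of students.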
