Given the following data frame df:
df <- structure(list(
  Name = c("Gregory", "Jane", "Joey", "Mark", "Rachel", "Phoebe", "Liza"),
  code = c("xx11-9090", "1367-88uu", "117y-xxxh", "cf56-gh67", "1888-ddf5", "rf52-628u", "hj69-5kk5"),
  `CLASS IF5` = c("E", "C", "C", "D", "D", "A", "A"),
  `CLASS AIS` = c("E", "C", "C", "D", "D", "A", "A"),
  `CLASS IPP` = c("C", "C", "C", "E", "E", "B", "A"),
  `CLASS SJR` = c("D", "C", "C", "D", "D", "B", "A")),
  row.names = c(1682L, 1683L, 1768L, 333L, 443L, 510L, 897L),
  class = "data.frame")
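The letters denote rankings: A is first place, B is second place, and so on. The letters range from A to E. I want to collapse the columns whose names start with CLASS (that is, the last four columns of the data frame) into a single column, keeping every row of the data frame and retaining only the letter that corresponds to the highest position in the ranking.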
The desired result is:
        Name      code new column
1682 Gregory xx11-9090          C
1683    Jane 1367-88uu          C
1768    Joey 117y-xxxh          C
333     Mark cf56-gh67          D
443   Rachel 1888-ddf5          D
510   Phoebe rf52-628u          A
897     Liza hj69-5kk5          A
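You can use apply to run the min function over each row and then assign its output to a new column. Because character values compare alphabetically, min returns the letter closest to A, which is the highest rank: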
df$new_column <- apply(df[, grep("^CLASS", names(df))], 1, min, na.rm = TRUE)
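As a possible alternative to the row-wise apply call, the same minimum can be taken column-wise with pmin, which compares the CLASS columns element by element. This is only a sketch: it rebuilds the example data with data.frame (so the row names here are 1 to 7 rather than those in the question), and class_cols is a helper name introduced purely for illustration.

```r
# Rebuild the example data (same values as in the question, default row names).
df <- data.frame(
  Name = c("Gregory", "Jane", "Joey", "Mark", "Rachel", "Phoebe", "Liza"),
  code = c("xx11-9090", "1367-88uu", "117y-xxxh", "cf56-gh67",
           "1888-ddf5", "rf52-628u", "hj69-5kk5"),
  `CLASS IF5` = c("E", "C", "C", "D", "D", "A", "A"),
  `CLASS AIS` = c("E", "C", "C", "D", "D", "A", "A"),
  `CLASS IPP` = c("C", "C", "C", "E", "E", "B", "A"),
  `CLASS SJR` = c("D", "C", "C", "D", "D", "B", "A"),
  check.names = FALSE,      # keep the spaces in the column names
  stringsAsFactors = FALSE  # keep the letters as character, not factor
)

# Names of the columns that start with "CLASS" (hypothetical helper variable).
class_cols <- grep("^CLASS", names(df), value = TRUE)

# pmin() takes the element-wise minimum across its arguments; do.call()
# passes each CLASS column as one argument, plus na.rm = TRUE.
df$new_column <- do.call(pmin, c(df[class_cols], na.rm = TRUE))

df[c("Name", "code", "new_column")]
```

This avoids converting the data frame to a matrix row by row, which may matter on larger data; for seven rows either form should be fine.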
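A possible solution in base R:
# Drop the first two columns (Name, code), sort the remaining letters
# and keep the first one, i.e. the highest rank in each row.
df$new_column <- apply(df, 1, function(x) sort(x[-(1:2)])[1])
df[, c(1, 2, 7)]
#>         Name      code new_column
#> 1682 Gregory xx11-9090          C
#> 1683    Jane 1367-88uu          C
#> 1768    Joey 117y-xxxh          C
#> 333     Mark cf56-gh67          D
#> 443   Rachel 1888-ddf5          D
#> 510   Phoebe rf52-628u          A
#> 897     Liza hj69-5kk5          A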
Using dplyr:
library(dplyr)
df %>%
  rowwise() %>%
  mutate(new_column = c_across(starts_with("CLASS")) %>% sort %>% .[1]) %>%
  select(Name, code, new_column) %>%
  ungroup()
#> # A tibble: 7 × 3
#>   Name    code      new_column
#>   <chr>   <chr>     <chr>
#> 1 Gregory xx11-9090 C
#> 2 Jane    1367-88uu C
#> 3 Joey    117y-xxxh C
#> 4 Mark    cf56-gh67 D
#> 5 Rachel  1888-ddf5 D
#> 6 Phoebe  rf52-628u A
#> 7 Liza    hj69-5kk5 A
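The approaches above all rely on the letters A to E sorting alphabetically in rank order. If you would rather state the ranking explicitly, one option is to map each letter to its position in a vector of levels and take the row-wise minimum of those positions. The following is a sketch under that assumption; rank_levels, classes, rank_idx and best are names made up for this example, and only the four CLASS columns are rebuilt to keep it short.

```r
# Assumed ranking, best to worst (taken from the question: A first, E last).
rank_levels <- c("A", "B", "C", "D", "E")

# Only the CLASS columns of the example data.
classes <- data.frame(
  `CLASS IF5` = c("E", "C", "C", "D", "D", "A", "A"),
  `CLASS AIS` = c("E", "C", "C", "D", "D", "A", "A"),
  `CLASS IPP` = c("C", "C", "C", "E", "E", "B", "A"),
  `CLASS SJR` = c("D", "C", "C", "D", "D", "B", "A"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Convert every letter to its rank position (1 = best); the result is a
# matrix with one row per observation and one column per CLASS column.
rank_idx <- sapply(classes, match, table = rank_levels)

# Smallest position per row, mapped back to its letter.
best <- rank_levels[apply(rank_idx, 1, min, na.rm = TRUE)]
best
```

With this layout, a different rank order would only require changing rank_levels, not the rest of the code.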