Take the alphabetically first letter across columns (R)



In the following data frame df,

structure(list(Name = c("Gregory", "Jane", "Joey", "Mark", "Rachel", "Phoebe", "Liza"), code = c("xx11-9090", "1367-88uu", "117y-xxxh", "cf56-gh67", "1888-ddf5", "rf52-628u", "hj69-5kk5"), `CLASS IF5` = c("E", "C", "C", "D", "D", "A", "A"), `CLASS AIS` = c("E", 
"C", "C", "D", "D", "A", "A"), `CLASS IPP` = c("C", "C", "C", 
"E", "E", "B", "A"), `CLASS SJR` = c("D", "C", "C", "D", "D", 
"B", "A")), row.names = c(1682L, 1683L, 1768L, 333L, 443L, 510L, 
897L), class = "data.frame")

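The letters represent rankings: A is first place, B is second place, and so on. The letters range from A to E. I want to collapse the columns that start with CLASS (that is, the last four columns of the data frame) into a single column, keeping every row of the data frame and keeping only the letter that corresponds to the highest position in the ranking.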

The desired result is:

        Name      code new column
1682 Gregory xx11-9090         C
1683    Jane 1367-88uu         C
1768    Joey 117y-xxxh         C
333     Mark cf56-gh67         D
443   Rachel 1888-ddf5         D
510   Phoebe rf52-628u         A
897     Liza hj69-5kk5         A

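You can use apply to run min over each row of the CLASS columns and assign the result to a new column. This works because min compares character strings alphabetically, so the earliest letter is also the highest rank: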

df$new_column <- apply(df[, grep("^CLASS", names(df))], 1, min, na.rm = TRUE)
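As a possible vectorized variant of the same idea, the sketch below passes the CLASS columns to pmin() through do.call(), which takes the element-wise minimum across columns without looping over rows. It is only a sketch: it rebuilds the question's df so that it runs on its own, the names class_cols and best are hypothetical, and it assumes the letters are plain uppercase A to E, so that alphabetical order matches the ranking.

```r
# Rebuild the data frame from the question so the example is self-contained
df <- structure(list(
  Name = c("Gregory", "Jane", "Joey", "Mark", "Rachel", "Phoebe", "Liza"),
  code = c("xx11-9090", "1367-88uu", "117y-xxxh", "cf56-gh67",
           "1888-ddf5", "rf52-628u", "hj69-5kk5"),
  `CLASS IF5` = c("E", "C", "C", "D", "D", "A", "A"),
  `CLASS AIS` = c("E", "C", "C", "D", "D", "A", "A"),
  `CLASS IPP` = c("C", "C", "C", "E", "E", "B", "A"),
  `CLASS SJR` = c("D", "C", "C", "D", "D", "B", "A")),
  row.names = c(1682L, 1683L, 1768L, 333L, 443L, 510L, 897L),
  class = "data.frame")

# Positions of the columns whose names start with "CLASS" (hypothetical helper name)
class_cols <- grep("^CLASS", names(df))

# pmin() compares the columns element by element; for character vectors the
# comparison is alphabetical, so the result is the best rank in each row
best <- do.call(pmin, unname(as.list(df[class_cols])))

df$new_column <- best
df[, c("Name", "code", "new_column")]
```

Compared with apply, this avoids converting the data frame to a matrix row by row, which may matter on larger data; for seven rows the two approaches should behave the same.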

A possible solution in base R:

# \(x) is the anonymous-function shorthand (R >= 4.1); x[-(1:2)] drops Name and code
df$new_column <- apply(df, 1, \(x) sort(x[-(1:2)])[1])
df[,c(1,2,7)]
#>         Name      code new_column
#> 1682 Gregory xx11-9090          C
#> 1683    Jane 1367-88uu          C
#> 1768    Joey 117y-xxxh          C
#> 333     Mark cf56-gh67          D
#> 443   Rachel 1888-ddf5          D
#> 510   Phoebe rf52-628u          A
#> 897     Liza hj69-5kk5          A

Using dplyr:

library(dplyr)

df %>%
  rowwise() %>%
  mutate(new_column = c_across(starts_with("CLASS")) %>% sort() %>% .[1]) %>%
  select(Name, code, new_column) %>%
  ungroup()
#> # A tibble: 7 × 3
#>   Name    code      new_column
#>   <chr>   <chr>     <chr>     
#> 1 Gregory xx11-9090 C         
#> 2 Jane    1367-88uu C         
#> 3 Joey    117y-xxxh C         
#> 4 Mark    cf56-gh67 D         
#> 5 Rachel  1888-ddf5 D         
#> 6 Phoebe  rf52-628u A         
#> 7 Liza    hj69-5kk5 A