Suppose I have the following vector:
test <- c("x1" = 0.1, "x2" = 0.3, "x3" = 0.4,
"y1" = 0.1, "y2" = 0.5, "y3" = 0.4,
"z1" = 0.5, "z2" = 0.3, "z3" = 0.4)
test
# x1 x2 x3 y1 y2 y3 z1 z2 z3
# 0.1 0.3 0.4 0.1 0.5 0.4 0.5 0.3 0.4
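I want to find the element with the highest value within each letter group. In this case, the output should be "x3", "y2", "z1". The tricky part is that I do not know in advance how many distinct letter groups there are, or how many numbers each letter has. So I need simple, flexible code that does not require the groups to be specified beforehand.
Any suggestions on which functions to use?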
My solution is explained step by step below.
## can also use `grp <- stringr::str_remove(names(test), "[0-9]+")`
grp <- stringr::str_extract(names(test), "[A-Za-z]+")
#[1] "x" "x" "x" "y" "y" "y" "z" "z" "z"
## split vector by group
lst <- unname(split(test, grp))
#[[1]]
# x1 x2 x3
#0.1 0.3 0.4
#
#[[2]]
# y1 y2 y3
#0.1 0.5 0.4
#
#[[3]]
# z1 z2 z3
#0.5 0.3 0.4
## since you want to keep the names "x3", "y2", "z1"
## it is not satisfactory to simply do `sapply(lst, max)`
sapply(lst, function (x) x[which.max(x)])
# x3 y2 z1
#0.4 0.5 0.5
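If you would rather avoid the stringr dependency, the same idea can be sketched in base R. The helper name `max_by_prefix` is hypothetical (it is not part of the answer above), and it assumes each group label is the element name with its trailing digits removed, which differs slightly from the `"[A-Za-z]+"` pattern when names contain digits in the middle.

```r
# Hypothetical helper, base R only: a sketch, not the original answer's code.
max_by_prefix <- function(x) {
  # Assumption: group label = element name with trailing digits stripped
  grp <- sub("[0-9]+$", "", names(x))
  # For each group, find the position (in the full vector) of its first maximum
  idx <- vapply(split(seq_along(x), grp),
                function(i) i[which.max(x[i])],
                integer(1))
  # Indexing the original vector keeps the original element names
  x[idx]
}

test <- c(x1 = 0.1, x2 = 0.3, x3 = 0.4,
          y1 = 0.1, y2 = 0.5, y3 = 0.4,
          z1 = 0.5, z2 = 0.3, z3 = 0.4)
res <- max_by_prefix(test)
names(res)
# [1] "x3" "y2" "z1"
```

Working with positions rather than values means the names come straight from the original vector, so no `unname()` step is needed.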
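The `str_extract()` approach is robust enough to handle the following more complicated case. Note that `split()` orders the groups by sorted group label rather than by first appearance, and the sort order of mixed-case labels such as "Yy" depends on the locale.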
hard <- c("x3" = 0.1, "x2" = 0.3, "x1" = 0.4,
"Yy1" = 0.1, "Yy2" = 0.5, "Yy3" = 0.4,
"z0" = 0.5, "z1" = 0.3, "z2" = 0.4)
# x3 x2 x1 Yy1 Yy2 Yy3 z0 z1 z2
#0.1 0.3 0.4 0.1 0.5 0.4 0.5 0.3 0.4
grp <- stringr::str_extract(names(hard), "[A-Za-z]+")
lst <- unname(split(hard, grp))
sapply(lst, function (x) x[which.max(x)])
# x1 Yy2 z0
#0.4 0.5 0.5
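One thing to be aware of: `which.max()` returns only the first maximum, so if a group has ties, only one element per group is reported. If you need every tied element, a possible variant is sketched below. The vector `ties` is made up for illustration, and the comparison uses exact equality, which may not be appropriate for values produced by floating-point arithmetic.

```r
# Illustrative data with a tie in group "a" (not from the original question)
ties <- c(a1 = 0.2, a2 = 0.7, a3 = 0.7, b1 = 0.9, b2 = 0.1)
grp <- sub("[0-9]+$", "", names(ties))
lst <- unname(split(ties, grp))

# First maximum per group, as in the answer above
first_max <- unlist(lapply(lst, function(x) x[which.max(x)]))
names(first_max)
# [1] "a2" "b1"

# All elements equal to the group maximum (exact comparison)
all_max <- unlist(lapply(lst, function(x) x[x == max(x)]))
names(all_max)
# [1] "a2" "a3" "b1"
```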