R - How to convert this nested for loop into an lapply function that can modify a list



My data looks like this:

aList <- list(a1 = c("apple", "banana", "orange", "strawberry", "cherry"),
a2 = c("banana", "cherry", "apple"),
a3 = c("apple", "strawberry", "pineapple"),
a4 = c("raspberry", "strawberry", "apple"),
a5 = c("pineapple", "lemon", "orange", "banana", "apple"),
a6 = c("lemon", "apple", "blueberry"),
a7 = c("watermelon", "apple", "banana", "mango"),
a8 = c("mango", "cherry", "apple", "lemon"),
a9 = c("orange", "banana", "strawberry"),
a10 = c("mango", "strawberry"))

I want to turn it into a vertical format, like what you get when you run this code:

vertical_data <- list()
for (x in names(aList)) {
  for (y in aList[[x]]) {
    if (is.null(vertical_data[[y]])) {
      vertical_data[[y]] <- x
    } else {
      vertical_data[[y]] <- c(x, vertical_data[[y]])
    }
  }
}
vertical_data

I want each entry to tell me which list elements (a1, a2, ...) a particular fruit comes from.

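This is easy enough with a double for loop. But when I do the same thing with nested lapply calls, the list (i.e. vertical_data) does not appear to be modified at all. Why? The reason I want to use the apply functions is that they are faster. My actual dataset will have thousands of items and "fruits", and the for loop will take too long.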

I would really appreciate your help.

Thanks

We can use split on the unlisted data

split(rep(names(aList), lengths(aList)), unlist(aList))
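To see why that one-liner works, here is a minimal sketch that breaks it into its two parallel vectors. The list `small` and the variable names `fruits`, `sources` and `result` are made up for illustration and are not part of the original data.

```r
# Hypothetical two-element list, used only for illustration
small <- list(a1 = c("apple", "banana"), a2 = "banana")

# One entry per fruit occurrence, in list order
fruits <- unlist(small, use.names = FALSE)

# The name of the list element each occurrence came from,
# repeated once per fruit in that element
sources <- rep(names(small), lengths(small))

# Group the source names by fruit
result <- split(sources, fruits)
result
```

Note that split keeps the source names in their original order within each fruit, whereas the loop in the question prepends each new name, so its entries come out reversed.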

Or another option is to stack it into a two-column data.frame and then do the split

with(stack(aList), split(as.character(ind), values))
#$apple
#[1] "a1" "a2" "a3" "a4" "a5" "a6" "a7" "a8"
#$banana
#[1] "a1" "a2" "a5" "a7" "a9"
#$blueberry
#[1] "a6"
#$cherry
#[1] "a1" "a2" "a8"
#$lemon
#[1] "a5" "a6" "a8"
#$mango
#[1] "a7"  "a8"  "a10"
#$orange
#[1] "a1" "a5" "a9"
#$pineapple
#[1] "a3" "a5"
#$raspberry
#[1] "a4"
#$strawberry
#[1] "a1"  "a3"  "a4"  "a9"  "a10"
#$watermelon
#[1] "a7"

Or, as @rawr mentioned

unstack(stack(aList)[2:1])

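As for assignment inside lapply versus a for loop, the difference comes down to environments. A for loop runs in the calling environment, so at top level its assignments modify the object in the global env. The function passed to lapply has its own self-contained env, so a plain <- only creates a local copy; to change the outer object you have to use <<- (not recommended) or explicitly specify the env as the global env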

vertical_data <- list()
lapply(names(aList), function(x) lapply(aList[[x]],
  function(y) if (is.null(vertical_data[[y]])) {
    vertical_data[[y]] <<- x
  } else {
    vertical_data[[y]] <<- c(x, vertical_data[[y]])
  }))
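The scoping difference can be seen in isolation with a small sketch. The names `counter`, `after_local` and `after_super` are hypothetical and exist only for this demonstration.

```r
counter <- 0

# `<-` inside the function creates a local `counter` on each call;
# the outer `counter` is never touched
invisible(lapply(1:3, function(i) counter <- counter + 1))
after_local <- counter   # outer value is unchanged

# `<<-` assigns to the `counter` found in the enclosing environment
invisible(lapply(1:3, function(i) counter <<- counter + 1))
after_super <- counter   # outer value was incremented three times
```

This is the same reason the nested lapply in the question appeared to do nothing: every assignment to vertical_data happened on a local copy that was discarded when the function returned.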

We can use enframe to convert the named list to a data frame, then split name by value

library(magrittr)  # provides %>%
tibble::enframe(aList) %>% tidyr::unnest(value) %>% {split(.$name, .$value)}
#$apple
#[1] "a1" "a2" "a3" "a4" "a5" "a6" "a7" "a8"
#$banana
#[1] "a1" "a2" "a5" "a7" "a9"
#$blueberry
#[1] "a6"
#$cherry
#[1] "a1" "a2" "a8"
#$lemon
#[1] "a5" "a6" "a8"
#$mango
#[1] "a7"  "a8"  "a10"
#$orange
#[1] "a1" "a5" "a9"
#$pineapple
#[1] "a3" "a5"
#$raspberry
#[1] "a4"
#$strawberry
#[1] "a1"  "a3"  "a4"  "a9"  "a10"
#$watermelon
#[1] "a7"
