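I have a list column, and for each group I want to use c() to combine these lists inside summarize. This should give one row per group, but it does not (note that the code is written for dplyr >= 1.1.0):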
library(dplyr)
df <- tibble::tibble(group = c("A", "A", "B"),
list_col = list(list("One"), list("Two"), list("Three")))
df |>
summarize(list_col = c(list_col),
.by = group)
This returns:
group list_col
<chr> <list>
1 A <list [1]>
2 A <list [1]>
3 B <list [1]>
Warning message:
Returning more (or less) than 1 row per `summarise()` group was deprecated in dplyr 1.1.0.
i Please use `reframe()` instead.
i When switching from `summarise()` to `reframe()`, remember that `reframe()` always
returns an ungrouped data frame and adjust accordingly.
Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated.
Expected output
output <- tibble::tibble(group = c("A", "B"),
list_col = list(list("One", "Two"), list("Three")))
group list_col
<chr> <list>
1 A <list [2]>
2 B <list [1]>
output$list_col[[1]]
[[1]]
[1] "One"
[[2]]
[1] "Two"
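Alternative solution
You can run the code below. However, A) it changes the type of each row's element in the column (a character vector instead of a list), and B) I specifically want to know why c() does not work: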
df |>
summarize(list_col = list(unlist(list_col)),
.by = group)
group list_col
<chr> <list>
1 A <chr [2]>
2 B <chr [1]>
In the first group (A), I expected the following to happen, merging the two lists into one list:
c(list("One"), list("Two"))
[[1]]
[1] "One"
[[2]]
[1] "Two"
So why doesn't this work? Is this a bug, or is there something about the syntax that I'm missing?
library(dplyr)
out <- df %>%
reframe(list_col = list(as.list(unlist(list_col))), .by = group)
with output
> out
# A tibble: 2 × 2
group list_col
<chr> <list>
1 A <list [2]>
2 B <list [1]>
> out$list_col[[1]]
[[1]]
[1] "One"
[[2]]
[1] "Two"
OP's expected output
> output$list_col[[1]]
[[1]]
[1] "One"
[[2]]
[1] "Two"
Consider the difference between c and unlist: the default of the recursive argument is FALSE for c and TRUE for unlist.
c(..., recursive = FALSE, use.names = TRUE)
unlist(x, recursive = TRUE, use.names = TRUE)
i.e. the basic difference is
> c(list("a"))
[[1]]
[1] "a"
> unlist(list("a"))
[1] "a"
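As a small base R sketch of what the recursive argument changes (the variable names a, b, combined and flat are only for illustration and are not part of the original post):

```r
# Two one-element lists, like the two rows of group A in the question
a <- list("One")
b <- list("Two")

# Default (recursive = FALSE): the result is still a list with two elements
combined <- c(a, b)
is.list(combined)
length(combined)

# recursive = TRUE descends into the lists and returns an atomic vector,
# which is what unlist() does by default
flat <- c(a, b, recursive = TRUE)
is.character(flat)
identical(flat, unlist(list(a, b)))
```

So c() keeps the list structure unless recursive = TRUE is requested, while unlist() flattens unless recursive = FALSE is requested.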
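For a list with more than one element, the variadic ... argument still has length 1, because the whole list is passed to c as a single argument.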
> c(list("a", "b"))
[[1]]
[1] "a"
[[2]]
[1] "b"
c does nothing to a single list unless we use it with do.call, where each element of the list is passed as a separate argument
> do.call(c, list("a", "b"))
[1] "a" "b"
With OP's example
> df$list_col[1:2]
[[1]]
[[1]][[1]]
[1] "One"
[[2]]
[[2]][[1]]
[1] "Two"
> c(df$list_col[1:2])
[[1]]
[[1]][[1]]
[1] "One"
[[2]]
[[2]][[1]]
[1] "Two"
> do.call(c, df$list_col[1:2])
[[1]]
[1] "One"
[[2]]
[1] "Two"
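The link to the row count can be sketched in base R. This assumes that, inside summarize, list_col for group A is the length-2 list reconstructed by hand below (group_a and merged are illustrative names), and that each element of the returned value becomes one row:

```r
# What list_col looks like inside group A (reconstructed by hand)
group_a <- list(list("One"), list("Two"))

# c() with a single argument returns the list unchanged, so it still has
# length 2, which means two rows for the group
identical(c(group_a), group_a)
length(c(group_a))

# do.call() spreads the elements over separate arguments,
# i.e. it calls c(list("One"), list("Two"))
merged <- do.call(c, group_a)
length(merged)

# Wrapping in list() yields a single value, i.e. one row for the group
length(list(merged))
```

This is why the merged result has to be wrapped in list(): without the wrapper it would again contribute two elements for group A.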
So, if we use do.call(c, ...) inside reframe:
out2 <- df %>%
reframe(list_col = list(do.call(c, list_col)), .by = group)
with output
> out2$list_col[[1]]
[[1]]
[1] "One"
[[2]]
[1] "Two"
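Because the result is wrapped in list(), each group yields exactly one value, so the same expression should also work in summarize without the deprecation warning. A minimal sketch, assuming dplyr >= 1.1.0 and R >= 4.1 as in the question; out3 and out4 are illustrative names, and unlist(recursive = FALSE) is shown as an alternative way to drop one level of nesting:

```r
library(dplyr)

df <- tibble::tibble(group = c("A", "A", "B"),
                     list_col = list(list("One"), list("Two"), list("Three")))

# list() wraps the merged result, so every group returns one value
out3 <- df |>
  summarize(list_col = list(do.call(c, list_col)), .by = group)

# unlist(recursive = FALSE) removes only the outer level of nesting
out4 <- df |>
  summarize(list_col = list(unlist(list_col, recursive = FALSE)), .by = group)

nrow(out3)
identical(out3$list_col, out4$list_col)
```

Both variants keep each element of list_col as a list rather than a character vector, which matches the expected output in the question.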