R: Selecting and grouping by multiple columns in dtplyr vs. dplyr


I want to group_by across several variables in dtplyr inside an lapply loop, but I found that after calling lazy_dt() I cannot use the same syntax as with dplyr.

library(dplyr)
mycolumns <- c("Wind", "Month", "Ozone", "Solar.R")
columnpairs <- as.data.frame(combn(mycolumns, 2))
#         V1    V2      V3    V4      V5      V6
#    1  Wind  Wind    Wind Month   Month   Ozone
#    2 Month Ozone Solar.R Ozone Solar.R Solar.R
result_dplyr <- lapply(columnpairs, function(x) {
  airquality %>%
    select(all_of(x)) %>%
    group_by(across(all_of(x))) %>%
    filter(n() > 1)
})
$V1
# A tibble: 105 x 2
# Groups:   Wind, Month [40]
Wind Month
<dbl> <int>
1   7.4     5
2   8       5
3  11.5     5
4  14.9     5
5   8.6     5
6   8.6     5
7   9.7     5
8  11.5     5
9  12       5
10  11.5     5
# ... with 95 more rows

With the same syntax, I run into a problem once the data has been wrapped with dtplyr's lazy_dt().

library(dtplyr)
airq <- lazy_dt(airquality)
lapply(columnpairs, function(x) {
  airq %>%
    select(all_of(x)) %>%
    group_by(across(all_of(x))) %>%
    filter(n() > 1)
})
Error in `all_of()`:
! object 'x' not found

Any ideas?

Edit: an issue has been opened at https://github.com/tidyverse/dtplyr/issues/383

It seems that the group_by method for dtplyr (group_by.dtplyr_step) is what causes the problem.

> methods('group_by')
[1] group_by.data.frame*  group_by.data.table*  group_by.dtplyr_step*

I am not sure whether this is a bug.

> traceback()
...
6: group_by.dtplyr_step(., across(all_of(.x)))  ###
5: group_by(., across(all_of(.x)))
4: filter(., n() > 1)
3: airq %>% select(all_of(.x)) %>% group_by(across(all_of(.x))) %>% 
filter(n() > 1)
2: .f(.x[[i]], ...)
1: map(columnpairs, ~airq %>% select(all_of(.x)) %>% group_by(across(all_of(.x))) %>% 
filter(n() > 1))

Here are two approaches that work:

  1. Use the superseded group_by_at
  2. Convert the column names to symbols with syms and splice them in with !!!

Using group_by_at
library(dtplyr)
library(purrr)
library(dplyr)
map(columnpairs, ~ airq %>%
  select(all_of(.x)) %>%
  group_by_at(all_of(.x)) %>%
  filter(n() > 1))
$V1
Source: local data table [105 x 2]
Groups: Wind, Month
Call:
_DT2 <- `_DT1`[, .(Wind, Month)]
`_DT2`[`_DT2`[, .I[.N > 1], by = .(Wind, Month)]$V1]
Wind Month
<dbl> <int>
1   7.4     5
2   7.4     5
3   8       5
4   8       5
5  11.5     5
6  11.5     5
# … with 99 more rows
...

Converting to symbols and splicing
map(columnpairs, ~ airq %>%
  select(all_of(.x)) %>%
  group_by(!!! rlang::syms(.x)) %>%
  filter(n() > 1))
$V1
Source: local data table [105 x 2]
Groups: Wind, Month
Call:
_DT20 <- `_DT1`[, .(Wind, Month)]
`_DT20`[`_DT20`[, .I[.N > 1], by = .(Wind, Month)]$V1]
Wind Month
<dbl> <int>
1   7.4     5
2   7.4     5
3   8       5
4   8       5
5  11.5     5
6  11.5     5
# … with 99 more rows
# Use as.data.table()/as.data.frame()/as_tibble() to access results
$V2
...
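
The symbol-splicing approach above can also be used with the base lapply loop from the question. The following is only a sketch: keep_duplicated is a hypothetical helper name (it is not part of dplyr or dtplyr), and the final ungroup() and as_tibble() calls are an assumption about how the results will be consumed, following the hint printed in the output above.

```r
library(dplyr)
library(dtplyr)

mycolumns <- c("Wind", "Month", "Ozone", "Solar.R")
columnpairs <- as.data.frame(combn(mycolumns, 2))
airq <- lazy_dt(airquality)

# Hypothetical helper (not part of dplyr or dtplyr): keep the rows whose
# combination of values in `cols` occurs more than once.
keep_duplicated <- function(lazy_tbl, cols) {
  lazy_tbl %>%
    select(all_of(cols)) %>%
    # Turn the character vector into symbols and splice them into group_by(),
    # avoiding across(), which is the call that fails on a lazy_dt here.
    group_by(!!!rlang::syms(cols)) %>%
    filter(n() > 1) %>%
    ungroup() %>%
    # Force the lazy data.table computation and return an ordinary tibble.
    as_tibble()
}

# One tibble per column pair, named V1 ... V6 like the columns of columnpairs.
result_dtplyr <- lapply(columnpairs, keep_duplicated, lazy_tbl = airq)
```

Because lazy_tbl is passed by name, lapply supplies each column pair as the cols argument. Row order within each result may differ from the dplyr version, since the filtering is done by data.table.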
