How can I get the final query in data.table format?
If that is not possible: how can I rewrite the query from dplyr to data.table?
library(data.table)
library(dplyr)
library(dtplyr)
# Data
dt = structure(list(Year = c(2015L, 2016L, 2017L, 2015L, 2016L, 2017L),
                    Item = c("Soybeans", "Soybeans", "Soybeans",
                             "Pulses, Total", "Pulses, Total", "Pulses, Total"),
                    Value = c(884688L, 829166L, 960640L, 2219455L, 2354696L, 2683772L)),
               row.names = c(NA, -6L), class = "data.frame")
# query in dplyr
dt %>%
  group_by(Year) %>%
  summarise(Value = sum(Value), Item = "Total") %>%
  bind_rows(., dt)
# convert query to data.table
dtl = lazy_dt(dt)
dtl %>%
  group_by(Year) %>%
  summarise(Value = sum(Value), Item = "Total") %>%
  bind_rows(., dtl) %>%
  show_query()
Error: Argument 1 must be a data frame or a named atomic vector.
Using data.table, this operation can be written as:
library(data.table)
setDT(dt)
rbind(dt[, .(Value = sum(Value), Item = "Total"), Year], dt)
#   Year   Value          Item
#1: 2015 3104143         Total
#2: 2016 3183862         Total
#3: 2017 3644412         Total
#4: 2015  884688      Soybeans
#5: 2016  829166      Soybeans
#6: 2017  960640      Soybeans
#7: 2015 2219455 Pulses, Total
#8: 2016 2354696 Pulses, Total
#9: 2017 2683772 Pulses, Total
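If you would rather keep the dplyr syntax for the aggregation step, one possible workaround is to let dtplyr translate only the `group_by()` / `summarise()` part, materialise that piece with `as.data.table()`, and do the row binding in data.table itself. This is only a sketch: it assumes the error in the question arises because `bind_rows()` receives a lazy `lazy_dt` object rather than a data frame, and the names `totals` and `result` are made up for illustration.

```r
library(data.table)
library(dplyr)
library(dtplyr)

# Same data as in the question, rebuilt here so the snippet runs on its own
dt <- data.frame(
  Year  = c(2015L, 2016L, 2017L, 2015L, 2016L, 2017L),
  Item  = c("Soybeans", "Soybeans", "Soybeans",
            "Pulses, Total", "Pulses, Total", "Pulses, Total"),
  Value = c(884688L, 829166L, 960640L, 2219455L, 2354696L, 2683772L)
)

# Let dtplyr translate the aggregation, then force evaluation
# so that we hold a real data.table instead of a lazy step
totals <- lazy_dt(dt) %>%
  group_by(Year) %>%
  summarise(Value = sum(Value), Item = "Total") %>%
  as.data.table()

# Bind in data.table; rbind() on data.tables matches columns by name,
# so the differing column order of the two inputs does not matter
result <- rbind(totals, as.data.table(dt))
result
```

The trade-off is that `show_query()` can only display the aggregation part of the pipeline; the final `rbind()` happens outside the lazy translation.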