R: "Dictionary is full!" error message when using dplyr



Hello, I am trying to run an operation on a data frame.

Here is the head of the data:

           V1 V2 V3  scaf_name
1: scaffold_0  1  1 scaffold_0
2: scaffold_0  2  1 scaffold_0
3: scaffold_0  3  1 scaffold_0
4: scaffold_0  4  1 scaffold_0
5: scaffold_0  5  1 scaffold_0
6: scaffold_0  6  1 scaffold_0

Here is the code I tried:

tab3 <- tab %>%
  group_by(scaf_name) %>%
  summarise(Avg_group = mean(V3), Length = last(V2))

This is the error message I get:

Error: Internal error: Dictionary is full!

Here are the dimensions of tab:

> dim(tab)
[1] 852355422         4

So the data frame seems to be too large for dplyr. Does anyone know how I can work around this?

Thank you very much.

Here is a small part of the data frame:

> dput(tab_bis)
structure(list(V1 = c("scaffold_0", "scaffold_0", "scaffold_0", 
"scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", 
"scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", 
"scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", 
"scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", 
"scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", 
"scaffold_0", "scaffold_0"), V2 = 1:30, V3 = c(1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L), scaf_name = c("scaffold_0", 
"scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", 
"scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", 
"scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", 
"scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", 
"scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0", 
"scaffold_0", "scaffold_0", "scaffold_0", "scaffold_0")), row.names = c(NA, 
-30L), class = c("data.table", "data.frame"), .internal.selfref = <pointer: 0x556f4666b340>)

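This is a known issue in the tidyverse: https://github.com/r-lib/vctrs/issues/1133. You have gone past the limit of an internal value, and it has to be fixed upstream. From the issue: "... uint32_t. I thought about just making sure that we store this instead as a uint64_t ...". For an example, see https://github.com/tidyverse/dplyr/issues/5291.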

My solution is to use data.table.

library(data.table)
# Build a data.table from tab
dt = data.table(tab)
# Per scaf_name: mean of V3 and the last value of V2
dt[, .(Avg_group = mean(V3), Length = last(V2)), by = .(scaf_name)]

Use data.table; it is better suited to handling large amounts of data.
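One possible refinement, offered only as a sketch: the dput output in the question shows class c("data.table", "data.frame"), so tab appears to already be a data.table. In that case data.table(tab) makes a full copy, which is costly at around 852 million rows. setDT() converts by reference instead, without copying the columns. The table tab_small and its values below are hypothetical stand-ins for the real data, and whether the full table then fits in memory depends on the machine.

```r
library(data.table)

# Hypothetical stand-in for the real table, modelled on the question's sample
tab_small <- data.frame(
  V1 = "scaffold_0",
  V2 = 1:6,
  V3 = c(1L, 1L, 2L, 2L, 3L, 3L),
  scaf_name = "scaffold_0",
  stringsAsFactors = FALSE
)

# Convert to data.table by reference (no copy of the columns)
setDT(tab_small)

# Same aggregation as in the answer: mean of V3 and last V2 per scaf_name
res <- tab_small[, .(Avg_group = mean(V3), Length = last(V2)), by = scaf_name]
print(res)
```

With the real data, the same two steps would be setDT(tab) followed by the grouped expression on tab itself.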
