R error: unused argument (nomatch = 0) when using data.table indexing



I am trying to use data.table indexing to perform fast lookups.

library(data.table)

table = c("AX-11415458", "AX-11417054", "AX-11419082", "AX-11421703", 
"AX-11422856", "AX-11422870")
df1 = structure(list(V1 = c(26L, 26L, 26L, 26L, 26L, 26L), V2 = c("AX-11415458", 
"AX-11417054", "AX-11419082", "AX-11421703", "AX-11422856", "AX-11422870"
), V3 = c(0L, 0L, 0L, 0L, 0L, 0L), V4 = c(705L, 3973L, 2859L, 
1683L, 6482L, 11930L), V5 = c("C", "G", "C", "A", "C", "G"), 
V6 = c("A", "A", "T", "G", "T", "T")), row.names = c(NA, 
-6L), class = "data.frame")
df2=structure(list(V1 = c("MT", "MT", "MT", "MT", "MT", "MT"), V2 = c("AX-11415458", 
"AX-11417054", "AX-11419082", "AX-11421703", "AX-11422856", "AX-11422870"
), V3 = c(0L, 0L, 0L, 0L, 0L, 0L), V4 = c(705L, 3973L, 2859L, 
1683L, 6482L, 11930L), V5 = c(".", ".", ".", ".", ".", "."), 
V6 = c("A", "A", "T", "G", "T", "T")), row.names = c(NA, 
-6L), class = "data.frame")
setDT(df1)
setDT(df2)
setkey(df1, V2)
setkey(df2, V2)

I want to loop over table, look up each value in df1 and df2, and replace V5 and V6 in df2 with the values from df1.

for (i in table) {
df2[.(i), nomatch = 0L][,5:6] = df1[.(i), nomatch = 0L][,5:6]
}

But I get this error:

Error in `[<-.data.table`(`*tmp*`, .(i), nomatch = 0L, value = list(V1 = "MT", : unused argument (nomatch = 0)

Why can't I do this, and is there a correct way to do what I want?

In fact, your loop can be corrected directly to:

for (i in table) {
df2[i,5:6] <- df1[i,5:6]
}

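nomatch = 0L only applies to inner joins when subsetting; the assignment method [<-.data.table takes no nomatch argument, which is why R reports it as unused. In addition, chaining [, 5:6] onto the subset works on a copy, so it does not update the data in the original df2.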
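To make the difference between reading and assigning concrete, here is a minimal sketch on a toy table. The names dt, hit, read_args and write_args are made up for illustration and are not part of the question's data. It shows that nomatch is accepted when subsetting, and inspects the argument lists of the two methods to show why the assignment form rejects it.

```r
library(data.table)

# Toy keyed table (hypothetical data, not from the question)
dt <- data.table(id = c("a", "b", "c"), val = 1:3, key = "id")

# Reading: nomatch is an argument of `[.data.table`, so this is valid.
# "z" has no match in the key and is dropped; only the row for "a" is returned.
hit <- dt[.(c("a", "z")), nomatch = 0L]

# Writing: compare the formal arguments of the subset and assignment methods.
# The assignment method has no nomatch argument, so passing one is an error.
read_args  <- names(formals(getS3method("[", "data.table")))
write_args <- names(formals(getS3method("[<-", "data.table")))

print(hit)
print(write_args)
```

If you need "skip keys that do not match" behaviour while assigning, it has to come from the join itself rather than from a nomatch argument on the left-hand side.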

Alternatively, you can try this approach:

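setDT(df1)
setDT(df2)
# Keep only the rows of df1 whose V2 is in the lookup vector
df3 <- df1[V2 %chin% table]
setkey(df2, V2)
setkey(df3, V2)
# df3[df2, V5] looks up df3's V5 for every row of df2 (NA where there is no match);
# fcoalesce() then falls back to df2's existing value for unmatched rows
df2[, `:=`(
  V5 = fcoalesce(df3[df2, V5], V5),
  V6 = fcoalesce(df3[df2, V6], V6)
)]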

Result:

> df2
   V1          V2 V3    V4 V5 V6
1: MT AX-11415458  0   705  C  A
2: MT AX-11417054  0  3973  G  A
3: MT AX-11419082  0  2859  C  T
4: MT AX-11421703  0  1683  A  G
5: MT AX-11422856  0  6482  C  T
6: MT AX-11422870  0 11930  G  T
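The same replacement can probably also be written as a single update join, without a loop or keys. The sketch below uses small made-up tables (src, dst and ids are illustrative stand-ins for df1, df2 and table, not the question's data) and assumes the columns are matched on V2. In j, the i. prefix refers to columns of the table passed in i.

```r
library(data.table)

# Small stand-ins for df1 (src) and df2 (dst); hypothetical values
src <- data.table(V2 = c("AX-1", "AX-2", "AX-3"),
                  V5 = c("C", "G", "A"),
                  V6 = c("A", "T", "G"))
dst <- data.table(V2 = c("AX-1", "AX-2", "AX-4"),
                  V5 = c(".", ".", "."),
                  V6 = c("A", "T", "T"))
ids <- c("AX-1", "AX-2", "AX-3")

# Update join: for every row of dst that matches a row of the filtered src
# on V2, overwrite V5 and V6 by reference. Rows of dst without a match
# ("AX-4") are left untouched, and unmatched rows of src ("AX-3") are ignored.
dst[src[V2 %chin% ids], on = "V2", `:=`(V5 = i.V5, V6 = i.V6)]

print(dst)
```

Because := modifies dst by reference, nothing needs to be assigned back, and non-matching IDs are skipped without any nomatch argument.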
