R - Filtering records in one table based on records in another table



I have two data.tables, dt and dt1, which look like this:

> library(data.table)
> dt <- data.table(grp = c("A", "A",  "B", "B", "C"),
                   cat = c("01", "02", "01", "02", "01"),
                   Value = c(234, 234, 235, 536, 235))
> dt
   grp cat Value
1:   A  01   234
2:   A  02   234
3:   B  01   235
4:   B  02   536
5:   C  01   235
> dt1 <- data.table(grp = c("A","A","A","A","A","A","B","B","B", "B","C"),
                   cat = c("01","01","02","02","03","04", "01","01", "02", "03","01"),
                   rec = c(5435,4341, 32525,436,7087,467,523,245,568,24,789),
                   val = c(346,6876,436,6807,465,65875,6432,754,326532,746,578))
> dt1
    grp cat   rec    val
 1:   A  01  5435    346
 2:   A  01  4341   6876
 3:   A  02 32525    436
 4:   A  02   436   6807
 5:   A  03  7087    465
 6:   A  04   467  65875
 7:   B  01   523   6432
 8:   B  01   245    754
 9:   B  02   568 326532
10:   B  03    24    746
11:   C  01   789    578

I want to delete from dt1 every record whose combination of grp and cat does not appear in dt.

For example, for grp A, dt has no records associated with cat 03 and 04, so I want to delete those records from dt1.

My final table dt1 must look like this:

> dt1
    grp cat   rec    val
 1:   A  01  5435    346
 2:   A  01  4341   6876
 3:   A  02 32525    436
 4:   A  02   436   6807
 5:   B  01   523   6432
 6:   B  01   245    754
 7:   B  02   568 326532
 8:   C  01   789    578

How can I do this in R using data.table?

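We can join dt1 on the grp and cat columns of dt (dt[, -3] drops the Value column). For each grp/cat pair in dt, this returns the matching rows of dt1. Assign the result back to dt1 to replace the table: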

dt1[dt[, -3], on = .(grp, cat)]
#    grp cat   rec    val
#1:   A  01  5435    346
#2:   A  01  4341   6876
#3:   A  02 32525    436
#4:   A  02   436   6807
#5:   B  01   523   6432
#6:   B  01   245    754
#7:   B  02   568 326532
#8:   C  01   789    578
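
The join above gives the desired result for this data because every grp/cat pair in dt has at least one match in dt1 and no pair is repeated in dt. If that cannot be guaranteed, a slightly more defensive variant might look like the sketch below. The extra row (grp D) and the names dt_extra, keys, with_na and filtered are hypothetical, introduced only for illustration, and nomatch = NULL assumes a reasonably recent data.table (older versions use nomatch = 0L instead).

```r
library(data.table)

dt <- data.table(grp = c("A", "A", "B", "B", "C"),
                 cat = c("01", "02", "01", "02", "01"),
                 Value = c(234, 234, 235, 536, 235))
dt1 <- data.table(grp = c("A","A","A","A","A","A","B","B","B","B","C"),
                  cat = c("01","01","02","02","03","04","01","01","02","03","01"),
                  rec = c(5435,4341,32525,436,7087,467,523,245,568,24,789),
                  val = c(346,6876,436,6807,465,65875,6432,754,326532,746,578))

# Hypothetical case: dt contains a grp/cat pair that has no match in dt1
dt_extra <- rbind(dt, data.table(grp = "D", cat = "01", Value = 1))

# Keep only the key columns, de-duplicated so no dt1 row is returned twice
keys <- unique(dt_extra[, .(grp, cat)])

# Default behaviour: the unmatched pair (D, 01) comes back as a row of NAs
with_na <- dt1[keys, on = .(grp, cat)]

# nomatch = NULL drops unmatched pairs, so only real dt1 rows remain
filtered <- dt1[keys, on = .(grp, cat), nomatch = NULL]

print(filtered)
```

Here with_na carries one extra row for the unmatched pair, with NA in rec and val, while filtered contains only the rows of dt1 that have a counterpart in dt. As with the original join, the result has to be assigned back (dt1 <- filtered) if dt1 itself should be replaced.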
