I have two data.tables, dt and dt1, which look like this:
> dt <- data.table(grp = c("A", "A", "B", "B", "C"),
cat = c("01", "02", "01", "02", "01"),
Value = c(234, 234, 235, 536, 235))
> dt
grp cat Value
1: A 01 234
2: A 02 234
3: B 01 235
4: B 02 536
5: C 01 235
> dt1 <- data.table(grp = c("A","A","A","A","A","A","B","B","B", "B","C"),
cat = c("01","01","02","02","03","04", "01","01", "02", "03","01"),
rec = c(5435,4341, 32525,436,7087,467,523,245,568,24,789),
val = c(346,6876,436,6807,465,65875,6432,754,326532,746,578))
> dt1
grp cat rec val
1: A 01 5435 346
2: A 01 4341 6876
3: A 02 32525 436
4: A 02 436 6807
5: A 03 7087 465
6: A 04 467 65875
7: B 01 523 6432
8: B 01 245 754
9: B 02 568 326532
10: B 03 24 746
11: C 01 789 578
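I want to remove from dt1 every record whose grp and cat combination has no corresponding row in dt.
For example, for grp A there are no records in dt with cat 03 or 04, so I want to drop those rows from dt1.
My final table dt1 should look like this: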
> dt1
grp cat rec val
1: A 01 5435 346
2: A 01 4341 6876
3: A 02 32525 436
4: A 02 436 6807
5: B 01 523 6432
6: B 01 245 754
7: B 02 568 326532
8: C 01 789 578
How can I do this in R using data.table?
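We can do this with a join on grp and cat, dropping the Value column (column 3) from dt first so that only the key columns are carried over: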
dt1[dt[, -3], on = .(grp, cat)]
# grp cat rec val
#1: A 01 5435 346
#2: A 01 4341 6876
#3: A 02 32525 436
#4: A 02 436 6807
#5: B 01 523 6432
#6: B 01 245 754
#7: B 02 568 326532
#8: C 01 789 578
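The join above returns one block of rows per row of dt, so it relies on dt having no duplicated (grp, cat) pairs and on every pair in dt also appearing in dt1 (otherwise an NA row would be produced). A slightly more defensive variant might look like the sketch below. The names keys, res and dropped are made up for illustration, and nomatch = NULL assumes a reasonably recent data.table version (older versions use nomatch = 0 for the same effect).

```r
library(data.table)

dt <- data.table(grp = c("A", "A", "B", "B", "C"),
                 cat = c("01", "02", "01", "02", "01"),
                 Value = c(234, 234, 235, 536, 235))

dt1 <- data.table(grp = c("A","A","A","A","A","A","B","B","B","B","C"),
                  cat = c("01","01","02","02","03","04","01","01","02","03","01"),
                  rec = c(5435, 4341, 32525, 436, 7087, 467, 523, 245, 568, 24, 789),
                  val = c(346, 6876, 436, 6807, 465, 65875, 6432, 754, 326532, 746, 578))

# Hypothetical helper table: the distinct (grp, cat) pairs present in dt.
# De-duplicating means the join cannot multiply rows of dt1.
keys <- unique(dt[, .(grp, cat)])

# Inner join: keep rows of dt1 whose (grp, cat) appears in keys.
# nomatch = NULL drops keys that have no partner in dt1 instead of
# returning them as NA rows.
res <- dt1[keys, on = .(grp, cat), nomatch = NULL]

# Anti-join: the rows of dt1 that were filtered out, handy as a sanity check.
dropped <- dt1[!keys, on = .(grp, cat)]

print(res)
print(dropped)
```

Note that the result follows the row order of keys (that is, of dt), not of dt1. In this example the two orders coincide.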