R - Compare two data tables row by row and add a new column



I have two data tables, and I want to compare them row by row and add a new column.

DT1 <- data.table(ID=c("F","A","E","B","C","D","C"),
                  num=c(59,3,108,11,22,54,241),
                  value=c(90,47,189,72,42,86,280))
DT2 <- data.table(Mark=c("Mary","Abner","Bonnie","Trista","Norman"),
                  numA=c(48,20,88,237,10),
                  numB=c(60,326,54,268,89),
                  valueA=c(78,34,78,270,60),
                  valueB=c(92,190,90,385,75))

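My goal:

For each num and value in DT1, I want to find the row of DT2 whose numA to numB range contains num and whose valueA to valueB range contains value.

For example:

Row F in DT1 has num = 59 and value = 90, and it must satisfy all of the following: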

num(59) > DT2$numA(48) & num(59) < DT2$numB(60) &
value(90) > DT2$valueA(78) & value(90) < DT2$valueB(92)

That is a match, so add a new column named result whose value is the Mark from DT2.

If there is no match, set it to "Undefined".
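To make the condition concrete, here is a minimal sketch that evaluates it for row F against every row of DT2. The names num_F, value_F, hits and matched_marks are hypothetical helpers introduced only for this illustration.

```r
library(data.table)

DT2 <- data.table(Mark=c("Mary","Abner","Bonnie","Trista","Norman"),
                  numA=c(48,20,88,237,10),
                  numB=c(60,326,54,268,89),
                  valueA=c(78,34,78,270,60),
                  valueB=c(92,190,90,385,75))

# Row F of DT1 (hypothetical helper names, not from the question)
num_F   <- 59
value_F <- 90

# One logical per DT2 row: TRUE when both ranges strictly contain row F
hits <- num_F > DT2$numA & num_F < DT2$numB &
        value_F > DT2$valueA & value_F < DT2$valueB

# Marks of every DT2 row that satisfies the condition
matched_marks <- DT2$Mark[hits]
print(matched_marks)
```

With this data both Mary and Abner satisfy the condition for row F, so which Mark ends up in result depends on how ties between matching DT2 rows are resolved.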

Expected result:

DT3 <- data.table(ID=c("F","A","E","B","C","D","C"),
              num=c(59,3,108,11,22,54,241),
              value=c(90,47,189,72,42,86,280),
              result=c("Mary","Undefined","Abner","Norman",
                       "Abner","Abner","Trista"))

How can I make sure every row is compared and the new column is added?

A data.table option:

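# Non-equi update join: copy Mark from the DT2 row whose ranges contain num and value;
# DT1 rows with no matching DT2 row are left as NA
DT1[DT2, on=.(num > numA, num < numB, value > valueA, value < valueB), Mark := i.Mark]
DT1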
   ID num value   Mark
1:  F  59    90  Abner
2:  A   3    47   <NA>
3:  E 108   189  Abner
4:  B  11    72 Norman
5:  C  22    42  Abner
6:  D  54    86  Abner
7:  C 241   280 Trista
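The question asks for a column called result with "Undefined" where nothing matches, whereas the join above produces a Mark column with NA. Below is a minimal sketch of one way to close that gap, assuming the same DT1 and DT2 as in the question. Naming the column result and the follow-up is.na() step are additions for illustration, not part of the original answer.

```r
library(data.table)

DT1 <- data.table(ID=c("F","A","E","B","C","D","C"),
                  num=c(59,3,108,11,22,54,241),
                  value=c(90,47,189,72,42,86,280))
DT2 <- data.table(Mark=c("Mary","Abner","Bonnie","Trista","Norman"),
                  numA=c(48,20,88,237,10),
                  numB=c(60,326,54,268,89),
                  valueA=c(78,34,78,270,60),
                  valueB=c(92,190,90,385,75))

# Same non-equi update join, but writing into a column named result
DT1[DT2, on=.(num > numA, num < numB, value > valueA, value < valueB),
    result := i.Mark]

# Rows of DT1 that matched nothing are NA after the join; relabel them
DT1[is.na(result), result := "Undefined"]

print(DT1)
```

Rows F and D fall inside the ranges of both Mary and Abner. The output shown above suggests the update join keeps the later matching DT2 row, which would explain why those rows show Abner rather than Mary.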
I am sure this can be solved more efficiently with one of the join operations in data.table; however, here is a base R option using mapply:

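DT1$result <- mapply(function(x, y) {
   # TRUE for each DT2 row whose num and value ranges both contain (x, y)
   inds <- x > DT2$numA & x < DT2$numB & y > DT2$valueA & y < DT2$valueB
   if(any(inds))
     DT2$Mark[which.max(inds)]  # which.max() returns the first TRUE, i.e. the first matching row
   else "Undefined"
}, DT1$num, DT1$value)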

DT1
#   ID num value    result
#1:  F  59    90      Mary
#2:  A   3    47 Undefined
#3:  E 108   189     Abner
#4:  B  11    72    Norman
#5:  C  22    42     Abner
#6:  D  54    86      Mary
#7:  C 241   280    Trista
