r - Check whether values in one data frame fall within the range given by two columns of another data frame



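So, I have two dataframes, both fairly large (df1 ~= 20k rows & df2 ~= 1.5 million rows), and I want to check whether the values in df1 fall between df2$low & df2$high, but do so conditionally (to limit the number of checks), performing the check only when abs(df1$val-df2$val) < 2. If a value in df1 is found to be within the df2 range, that is recorded in a new column with TRUE/FALSE values.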

df1

weight
94.99610 95.00561

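Use a non-equi join with data.table: convert the first dataset to a data.table (setDT) and create the filter column filled with the logical value FALSE. Then do a non-equi join and assign (:=) filter, so that FALSE changes to TRUE only for matched rows where the condition (abs(weight - th_weight) < 2) is met.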

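library(data.table)
# convert df1 by reference and start with filter = FALSE on every row
setDT(df1)[, filter := FALSE]
# non-equi join: a df2 row matches when low <= th_weight <= high;
# only matched rows of df1 are updated with the result of the condition
df1[df2, filter := abs(weight - th_weight) < 2,
    on = .(low <= th_weight, high >= th_weight)]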

-output

> df1
     weight      low     high filter
      <num>    <num>    <num> <lgcl>
1: 94.99610 94.99608 94.99613   TRUE
2: 95.00561 95.00558 95.00566  FALSE

data

df1 <- structure(list(weight = c(94.9961, 95.00561), low = c(94.99608, 
95.00558), high = c(94.99613, 95.00566)), class = "data.frame", row.names = c(NA, 
-2L))
df2 <- structure(list(index = 1:5, th_weight = c(94.996092, 95.496336, 
95.509906, 97.473292, 100.51906)), class = "data.frame", row.names = c(NA, 
-5L))
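
As a cross-check on the sample data, the same condition can be written as a brute-force loop in base R. This is only a sketch and not part of the answer's method: the name filter_check is hypothetical, and it assumes the intended rule is "at least one th_weight lies in [low, high] and is within 2 of weight". If several df2 rows match the same df1 row, the join assigns the result from only one of them, whereas any() is TRUE as soon as any match satisfies the condition, so the two can differ in that case.

```r
# Sample data, same values as in the answer
df1 <- data.frame(weight = c(94.9961, 95.00561),
                  low    = c(94.99608, 95.00558),
                  high   = c(94.99613, 95.00566))
df2 <- data.frame(index = 1:5,
                  th_weight = c(94.996092, 95.496336, 95.509906,
                                97.473292, 100.51906))

# filter_check is a hypothetical name used only for this illustration.
# For each row of df1, test every th_weight in df2 against both conditions.
filter_check <- vapply(seq_len(nrow(df1)), function(i) {
  in_range <- df2$th_weight >= df1$low[i] & df2$th_weight <= df1$high[i]
  is_close <- abs(df1$weight[i] - df2$th_weight) < 2
  any(in_range & is_close)
}, logical(1))

filter_check
```

On this sample it agrees with the filter column produced by the join (TRUE for the first row, FALSE for the second). The loop compares every row of df1 with every row of df2, so it is meant for checking small samples, not for the full 20k x 1.5 million case.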
