Merge two data frames in R using the closest lower value

I have extracts from the following two data frames:

df1 <- data.frame(sample = c(1, 2, 3, 4, 5),
                  RT = c(3.88, 4.52, 32.82, 15.71, 20.33),
                  Hit = c(2, 1, 7, 1, 5))

and

df2 <- data.frame(rt_stand = c(4.5, 8.5, 15.8, 23.2, 35.0),
                  n_carb = c(10, 11, 12, 13, 14),
                  below = c(5.5, 6.8, 8.2, 10.0, 12.3))

I want to join each row of df1 to the row of df2 whose rt_stand is the closest value that is still lower than RT.

The output would be a data frame like this:

  sample    RT Hit rt_stand n_carb below
1      1  3.88   2       NA     NA    NA
2      2  4.52   1      4.5     10   5.5
3      3 32.82   7     23.2     13  10.0
4      4 15.71   1      8.5     11   6.8
5      5 20.33   5     15.8     12   8.2

You can use the fuzzyjoin package to keep the row pairs where RT is greater than rt_stand, and then keep the closest rt_stand for each sample.

library(dplyr)

# Keep every pair with RT > rt_stand, then the largest rt_stand per sample
result <- fuzzyjoin::fuzzy_inner_join(df1, df2, by = c('RT' = 'rt_stand'),
                                      match_fun = `>`) %>%
  arrange(sample, rt_stand) %>%
  group_by(sample) %>%
  slice(n()) %>%
  ungroup()
result
# A tibble: 4 x 6
#  sample    RT   Hit rt_stand n_carb below
#   <dbl> <dbl> <dbl>    <dbl>  <dbl> <dbl>
#1      2  4.52     1      4.5     10   5.5
#2      3 32.8      7     23.2     13  10  
#3      4 15.7      1      8.5     11   6.8
#4      5 20.3      5     15.8     12   8.2

Note that the result above has no row for sample = 1, because no rt_stand value is lower than that sample's RT. To keep all sample values, you can do the following:

# Add back the df1 rows that found no match; their df2 columns become NA
df1 %>%
  filter(!sample %in% result$sample) %>%
  bind_rows(result)
#  sample    RT Hit rt_stand n_carb below
#1      1  3.88   2       NA     NA    NA
#2      2  4.52   1      4.5     10   5.5
#3      3 32.82   7     23.2     13  10.0
#4      4 15.71   1      8.5     11   6.8
#5      5 20.33   5     15.8     12   8.2
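
As an alternative sketch that needs no extra package, base R's findInterval() can look up, for each RT, the position of the last rt_stand strictly below it. This is not part of the fuzzyjoin approach above. It assumes a single numeric key, and the names df2_sorted, idx, matched and result_base are only illustrative.

```r
df1 <- data.frame(sample = c(1, 2, 3, 4, 5),
                  RT = c(3.88, 4.52, 32.82, 15.71, 20.33),
                  Hit = c(2, 1, 7, 1, 5))
df2 <- data.frame(rt_stand = c(4.5, 8.5, 15.8, 23.2, 35.0),
                  n_carb = c(10, 11, 12, 13, 14),
                  below = c(5.5, 6.8, 8.2, 10.0, 12.3))

# findInterval() requires the lookup vector to be sorted in increasing order
df2_sorted <- df2[order(df2$rt_stand), ]

# left.open = TRUE makes the comparison strict (rt_stand < RT);
# an index of 0 means no rt_stand lies below that RT
idx <- findInterval(df1$RT, df2_sorted$rt_stand, left.open = TRUE)
idx[idx == 0] <- NA

# Indexing with NA yields an all-NA row, so unmatched samples are kept
matched <- df2_sorted[idx, ]
rownames(matched) <- NULL
result_base <- cbind(df1, matched)
result_base
```

Because unmatched samples simply receive NA, sample 1 is kept in a single step and no separate bind_rows() call is needed.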
