Match two columns across two data frames and copy a third column from data frame 2 into a new column in data frame 1, using R

I have two data frames:

df1:
word1 previousWord
a     na
b     a
c     b

The other data frame looks like this:

df2: this contains more pairs than exist in df1. It contains every combo possible
word1 previousWord Score
a     a            1
a     b            .5
a     c            .9
b     a            .5
b     b            1
b     c            .2
c     a            .9
c     b            .2
c     c            1

I want to find where the pairs from df1 (i.e., b-a, c-b) occur in df2, copy the Score from df2, and add it to df1 as a new column.

For example, the output would look like this:

word1 previousWord Score
a     na           na
b     a            .5
c     b            .2

This is what I tried, but it seems to drop a lot of my data from df1. Swapping the order of the data frames did not fix the problem.

df3<-merge(df2, df1, by = c("word1", "previousWord"))

Any help is greatly appreciated.

You can use left_join() from dplyr here. It keeps every row of df1 and fills Score with NA wherever df2 has no matching pair.

library(dplyr)
# Keep every row of df1; pull in Score where (word1, previousWord) matches a row in df2
df3 <- left_join(df1, df2, by = c("word1", "previousWord"))

Output

  word1 previousWord Score
1     a         <NA>    NA
2     b            a   0.5
3     c            b   0.2
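
For reference, the merge() call in the question drops rows because merge() performs an inner join by default (all = FALSE), so any df1 row without a matching pair in df2 (here, the row whose previousWord is NA) is discarded. A minimal base R sketch, assuming the same df1 and df2 as in this answer (rebuilt inline so the snippet runs on its own; the name inner is only illustrative), is to pass all.x = TRUE:

```r
# Same data as in this answer, rebuilt with data.frame() for readability
df1 <- data.frame(
  word1        = c("a", "b", "c"),
  previousWord = c(NA, "a", "b"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  word1        = rep(c("a", "b", "c"), each = 3),
  previousWord = rep(c("a", "b", "c"), times = 3),
  Score        = c(1, 0.5, 0.9, 0.5, 1, 0.2, 0.9, 0.2, 1),
  stringsAsFactors = FALSE
)

# Default merge() is an inner join: the (a, NA) row has no match and is dropped
inner <- merge(df1, df2, by = c("word1", "previousWord"))

# all.x = TRUE turns it into a left join: every row of df1 is kept,
# and Score is NA where df2 has no matching pair
df3 <- merge(df1, df2, by = c("word1", "previousWord"), all.x = TRUE)
```

Note that merge() sorts the result on the key columns by default, so the row order may differ from df1 on larger data; left_join() keeps the order of df1.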

Data

df1 <- structure(list(word1 = c("a", "b", "c"), previousWord = c(NA, 
"a", "b")), class = "data.frame", row.names = c(NA, -3L))
df2 <- structure(list(word1 = c("a", "a", "a", "b", "b", "b", "c", "c", 
"c"), previousWord = c("a", "b", "c", "a", "b", "c", "a", "b", 
"c"), Score = c(1, 0.5, 0.9, 0.5, 1, 0.2, 0.9, 0.2, 1)), class = "data.frame", row.names = c(NA, 
-9L))
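
If adding a package is not an option and the row order of df1 must stay untouched, a lookup with match() is another possibility. This is only a sketch under stated assumptions: the helper names key1 and key2 and the "|" separator are illustrative, and the separator is assumed not to occur in the words themselves. The data is rebuilt inline so the snippet is self-contained.

```r
df1 <- data.frame(
  word1        = c("a", "b", "c"),
  previousWord = c(NA, "a", "b"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  word1        = rep(c("a", "b", "c"), each = 3),
  previousWord = rep(c("a", "b", "c"), times = 3),
  Score        = c(1, 0.5, 0.9, 0.5, 1, 0.2, 0.9, 0.2, 1),
  stringsAsFactors = FALSE
)

# Build one composite key per row (assumes "|" never appears in the words)
key1 <- paste(df1$word1, df1$previousWord, sep = "|")
key2 <- paste(df2$word1, df2$previousWord, sep = "|")

# match() returns, for each df1 key, the position of its first match in df2
# (NA when there is none). paste() turns a missing value into the text "NA",
# so this relies on df2 not containing the literal word "NA".
df1$Score <- df2$Score[match(key1, key2)]
```

Unlike merge(), this modifies df1 in place and never reorders or duplicates rows, but it only picks up the first match if df2 contains the same pair more than once.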
