r - Match two data frames on common rows while preserving row order



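I need to modify the data frame "DF1" by matching its first (and only) column against the second column of "DF2" and printing the matched column, while keeping the row order of DF1. I also need to replace the unmatched rows with 0. Here are examples of the two data frames I have: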

"DF1">

Ccd
Kkl
Sop
Mnn
Msg
Xxy
Zxz
Ccd
Msg

"DF2">

3   Ab
5   Abc
5   Ccd
9   Kkl
5   Msg
13  Sop
19  Klj

Code:
read.table("a.txt")->DF1
read.table("b.txt")->DF2
colnames(DF1)<-c("b")
colnames(DF2)<-c("a", "b")
DF3 <- merge(DF1,DF2, by="b", all.x=TRUE) # left join; merge() sorts the result by "b" by default
DF3$a[is.na(DF3$a)] <- 0 # replace NA with 0
The output I get from the code above is:
b  a
Ccd  5
Ccd  5
Kkl  9
Mnn  0
Msg  5
Msg  5
Sop 13
Xxy  0
Zxz  0

The output I actually need is:

Ccd  5
Kkl  9
Sop  13
Mnn  0
Msg  5
Xxy  0
Zxz  0
Ccd  5
Msg  5

With data.table, you can do this:

library(data.table)
setDT(df2)[setDT(df1),,on="b"][is.na(a), a:=0][]

Output:

a   b
1:  5 Ccd
2:  9 Kkl
3: 13 Sop
4:  0 Mnn
5:  5 Msg
6:  0 Xxy
7:  0 Zxz
8:  5 Ccd
9:  5 Msg

Or with dplyr:

library(dplyr)
left_join(df1,df2, by="b") %>% mutate(a=if_else(is.na(a),0,as.double(a)))

Output:

b  a
1: Ccd  5
2: Kkl  9
3: Sop 13
4: Mnn  0
5: Msg  5
6: Xxy  0
7: Zxz  0
8: Ccd  5
9: Msg  5
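If you would rather not load a package, a lookup with base R's match() is another option. This is only a minimal sketch, not part of the answer above: it assumes each key appears at most once in df2 (match() returns only the first hit, so duplicated keys in df2 would behave differently from a join), and the names idx and res are just illustrative.

```r
# Same example data as in the question
df1 <- data.frame(b = c("Ccd", "Kkl", "Sop", "Mnn", "Msg",
                        "Xxy", "Zxz", "Ccd", "Msg"))
df2 <- data.frame(a = c(3L, 5L, 5L, 9L, 5L, 13L, 19L),
                  b = c("Ab", "Abc", "Ccd", "Kkl", "Msg", "Sop", "Klj"))

# For every row of df1, find the position of its key in df2 (NA if absent)
idx <- match(df1$b, df2$b)

# Build the result in df1's row order, then replace unmatched rows with 0
res <- data.frame(b = df1$b, a = df2$a[idx])
res$a[is.na(res$a)] <- 0L

print(res)
```

Because the result is built by indexing with idx, the rows come out in exactly the order of df1, and no sorting step is involved.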

Input:

df1 <- structure(list(b = c("Ccd", "Kkl", "Sop", "Mnn", "Msg", "Xxy", 
"Zxz", "Ccd", "Msg")), row.names = c(NA, -9L), class = "data.frame")
df2 <- structure(list(a = c(3L, 5L, 5L, 9L, 5L, 13L, 19L), b = c("Ab", 
"Abc", "Ccd", "Kkl", "Msg", "Sop", "Klj")), row.names = c(NA, 
-7L), class = "data.frame")
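As for why the original merge() call lost the order: by default merge() sorts its result on the by column. If you want to stay with merge(), one possible workaround is to record the original row position first and restore it afterwards. The sketch below assumes a helper column named id (a hypothetical name, not from the question) and rebuilds the example data so that it runs on its own.

```r
# Same example data as the input above
df1 <- data.frame(b = c("Ccd", "Kkl", "Sop", "Mnn", "Msg",
                        "Xxy", "Zxz", "Ccd", "Msg"))
df2 <- data.frame(a = c(3L, 5L, 5L, 9L, 5L, 13L, 19L),
                  b = c("Ab", "Abc", "Ccd", "Kkl", "Msg", "Sop", "Klj"))

# Remember the original row order of df1 in a helper column
df1$id <- seq_len(nrow(df1))

# Left join; the result comes back sorted by the key "b"
m <- merge(df1, df2, by = "b", all.x = TRUE)

# Restore df1's order, drop the helper column, and fill unmatched rows with 0
m <- m[order(m$id), c("b", "a")]
m$a[is.na(m$a)] <- 0L
rownames(m) <- NULL

print(m)
```

This keeps the join semantics of merge(), so it also behaves like a join if df2 ever contains repeated keys, at the cost of an extra column and a sort.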
