R: overwrite column values with the non-NA values of a column in a separate data frame



I have a data frame 'df1' with many columns, but the ones of interest are:

  Number Code
1      1 <NA>
2      2 <NA>
3      3 <NA>
4     10 <NA>
5     11 AMRO
6      4 <NA>
7    277 <NA>
8   2100 BLPH

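We can do this with powerjoin:

library(powerjoin)
# coalesce() comes from dplyr, so it is namespaced here; it keeps df1's Code
# and falls back to df2's Code where df1's is NA
power_left_join(df1, df2, by = "Number", conflict = dplyr::coalesce)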

Output:

Number Code
1      1 AMCR
2      2 AMCR
3      3 BANO
4     10 BAEA
5     11 AMRO
6      4 <NA>
7    277 <NA>
8   2100 BLPH

Or use data.table to overwrite df1 in place with an update join:

library(data.table)
setDT(df1)[df2, Code := fcoalesce(Code, i.Code), on = .(Number)]

Output:

> df1
Number   Code
<int> <char>
1:      1   AMCR
2:      2   AMCR
3:      3   BANO
4:     10   BAEA
5:     11   AMRO
6:      4   <NA>
7:    277   <NA>
8:   2100   BLPH
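
For comparison, the same fill can be sketched in base R with match(), without any package. This is only an illustration under the assumption that Number is unique in df2; the helper names idx, new_code, fill and result are made up for the example.

```r
# Same data as in the "Data" section
df1 <- data.frame(
  Number = c(1L, 2L, 3L, 10L, 11L, 4L, 277L, 2100L),
  Code   = c(NA, NA, NA, NA, "AMRO", NA, NA, "BLPH"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Number = c(1L, 2L, 3L, 10L, 12L, 4L, 277L, 2100L),
  Code   = c("AMCR", "AMCR", "BANO", "BAEA", "AMRO", NA, NA, NA),
  stringsAsFactors = FALSE
)

# For each row of df1, the position of the same Number in df2 (NA if absent)
idx <- match(df1$Number, df2$Number)

# Candidate values from df2, aligned to df1's rows
new_code <- df2$Code[idx]

# Fill only where df1$Code is NA; existing df1 values are kept
result <- df1
fill <- is.na(result$Code)
result$Code[fill] <- new_code[fill]
result
```

Because the lookup goes through Number, the row order of df2 does not matter, and Numbers that exist only in df2 (12 here) are ignored.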

Data:
df1 <- structure(list(Number = c(1L, 2L, 3L, 10L, 11L, 4L, 277L, 2100L
), Code = c(NA, NA, NA, NA, "AMRO", NA, NA, "BLPH")), 
class = "data.frame", row.names = c(NA, 
-8L))
df2 <- structure(list(Number = c(1L, 2L, 3L, 10L, 12L, 4L, 277L, 2100L
), Code = c("AMCR", "AMCR", "BANO", "BAEA", "AMRO", NA, NA, NA
)), class = "data.frame", row.names = c(NA, -8L))

Here is another approach using bind_cols:

library(dplyr)
bind_cols(df1, df2) %>%
  mutate(Code = coalesce(Code...2, Code...4)) %>%
  select(Number = Number...1, Code)

Output:

Number Code
1      1 AMCR
2      2 AMCR
3      3 BANO
4     10 BAEA
5     11 AMRO
6      4 <NA>
7    277 <NA>
8   2100 BLPH
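
One caveat: bind_cols() pairs rows by position, not by Number. In the sample data the two Number columns differ at row 5 (11 in df1, 12 in df2), and the result is still correct only because df1 already has a Code there. A minimal sketch of a key-based variant, assuming Number is unique in df2 (the ".df2" suffix and the name result are arbitrary choices for the example):

```r
library(dplyr)

df1 <- data.frame(
  Number = c(1L, 2L, 3L, 10L, 11L, 4L, 277L, 2100L),
  Code   = c(NA, NA, NA, NA, "AMRO", NA, NA, "BLPH"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Number = c(1L, 2L, 3L, 10L, 12L, 4L, 277L, 2100L),
  Code   = c("AMCR", "AMCR", "BANO", "BAEA", "AMRO", NA, NA, NA),
  stringsAsFactors = FALSE
)

# Quick check of whether the rows line up by key before trusting bind_cols()
rows_aligned <- identical(df1$Number, df2$Number)

# Join on the key instead, then coalesce df1's Code with df2's
result <- df1 %>%
  left_join(df2, by = "Number", suffix = c("", ".df2")) %>%
  mutate(Code = coalesce(Code, Code.df2)) %>%
  select(-Code.df2)
result
```

The left join keeps df1's rows and row order, so the result lines up with df1 even if df2 is sorted differently or has extra Numbers.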

Here is a dplyr solution using full_join and inner_join:

library(dplyr)
df1 %>%
  full_join(df2) %>%
  na.omit() %>%
  full_join(df1 %>% inner_join(df2)) %>%
  filter(Number %in% df1$Number) %>%
  arrange(Number)

Output:

#>   Number Code
#> 1      1 AMCR
#> 2      2 AMCR
#> 3      3 BANO
#> 4      4 <NA>
#> 5     10 BAEA
#> 6     11 AMRO
#> 7    277 <NA>
#> 8   2100 BLPH
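
The sample data never has a non-NA Code in both frames for the same Number, so all of the approaches above agree. When both sides do have a value, the argument order of coalesce() decides which one wins. The sketch below uses small hypothetical frames a and b (not from the question) to show both choices. As far as I can tell, the full_join/inner_join pipeline would return two rows for such a Number, so it is worth checking for that case first.

```r
library(dplyr)

# Hypothetical data: Number 2 has a non-NA Code in both frames
a <- data.frame(Number = 1:3, Code = c(NA, "OLD", "KEEP"),
                stringsAsFactors = FALSE)
b <- data.frame(Number = 1:3, Code = c("NEW1", "NEW2", NA),
                stringsAsFactors = FALSE)

joined <- left_join(a, b, by = "Number", suffix = c("", ".b"))

# a has priority: b only fills the gaps in a
fill_only <- coalesce(joined$Code, joined$Code.b)

# b has priority: every non-NA value in b overwrites a
overwrite <- coalesce(joined$Code.b, joined$Code)

data.frame(Number = joined$Number, fill_only, overwrite)
```

If the goal is literally "non-NA values of the second frame overwrite the first", the second form is the one to use; the answers above correspond to the first.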
