r语言 - left_join,但将NA作为通配符匹配



是否有办法让left_join将任何NA值视为通配符(即匹配所有)?

例如我有一个标题df_x:

df_x<-tribble(
~`Employee Type`, ~`Employee Name`, ~`Employee Department`,
"Manager"     , "Bob"         , "Accounting",
"Junior"      , "Keith"       , "Accounting",
"Manager"     , "Alice"       , "Finance"   ,
"Junior"      , "Robert"      , "Finance"   ,
"Manager"     , "Claire"      , "I.T."     ,
"Senior"      , "Ashley"      , "I.T."     ,
"Junior"      , "Sam"         , "I.T."     ,
"Junior"      , "Joe"         , "I.T."     
)

和另一个标签df_y,指定不同部门不同角色的工资,其中所有经理都应该得到60000,无论他们在哪个部门工作:

df_y<-tribble(
~`Employee Type`, ~`Employee Department`, ~Salary,
"Manager"     , NA                  , 60000  ,
"Senior"      , "I.T."              , 40000  ,
"Junior"      , "I.T."              , 30000  ,
"Junior"      , "Finance"           , 35000  ,
"Junior"      , "Accounting"        , 35000  
)

我想要的是,在left_join之后,我将得到:

> left_join(df_x,df_y, *magic argument here*)
Joining, by = c("Employee Type", "Employee Department")
# A tibble: 8 x 4
`Employee Type` `Employee Name` `Employee Department` Salary
<chr>           <chr>           <chr>                  <dbl>
1 Manager         Bob             Accounting             60000  
2 Junior          Keith           Accounting             35000
3 Manager         Alice           Finance                60000  
4 Junior          Robert          Finance                35000
5 Manager         Claire          I.T.                   60000
6 Senior          Ashley          I.T.                   40000
7 Junior          Sam             I.T.                   30000
8 Junior          Joe             I.T.                   30000

但是我实际得到的是:

> left_join(df_x,df_y)
Joining, by = c("Employee Type", "Employee Department")
# A tibble: 8 x 4
`Employee Type` `Employee Name` `Employee Department` Salary
<chr>           <chr>           <chr>                  <dbl>
1 Manager         Bob             Accounting                NA
2 Junior          Keith           Accounting             35000
3 Manager         Alice           Finance                   NA
4 Junior          Robert          Finance                35000
5 Manager         Claire          I.T.                      NA
6 Senior          Ashley          I.T.                   40000
7 Junior          Sam             I.T.                   30000
8 Junior          Joe             I.T.                   30000

是否有办法指定"如果部门是NA,那么无论如何都应用查找"。还是我看错了,用错了函数?还有比这更"整洁"的宇宙吗?如何解决这个问题?

我将只在类型上进行连接,然后在后面进行过滤:

df_x %>%
inner_join(df_y, by = "Employee Type", suffix = c("", "_y")) %>%
filter(is.na(`Employee Department_y`) | `Employee Department` == `Employee Department_y`) %>%
select(-`Employee Department_y`)

一个可能的解决方案:

library(dplyr)
left_join(df_x,df_y) %>% 
mutate(Salary = ifelse(is.na(Salary),60000,Salary))

相关内容

  • 没有找到相关文章

最新更新