r-使用dplyr删除公共的特定行

  • 本文关键字:使用 dplyr 删除 r join dplyr
  • 更新时间 :
  • 英文 :


所以假设我有一个如下的df:

DF1

ID    Name       Age    Color
001   Stephanie  23     Red
002   DJ         
003   Michelle   20     Blue
004   Danny      40     Green
005   Joey       
006   Jesse 
005   Jesse     

让我们假设数据是1000+obs和30+vars

假设我对所有缺少信息的变量进行了子集

DF2(DF1的子集(

ID    Name       Age    Color
002   DJ         
005   Joey       
006   Jesse  
005   Jesse

在这份表格中,我找到了一些信息。

静止DF2

ID    Name       Age    Color
002   DJ         
005   Joey       42     Yellow
006   Jesse      39     Black
005   Jesse

我如何更新原始数据中的行,以获得我只能作为子集更新的所有信息。最终df应为:

ID    Name       Age    Color
001   Stephanie  23     Red
002   DJ         
003   Michelle   20     Blue
004   Danny      40     Green
005   Joey       42     Yellow
006   Jesse      39     Black
005   Jesse

添加一个new_id列以在上明显匹配如何

#recreating dfs
df1 <- data.frame(ID = c(001, 002, 003, 004, 005, 006, 005),
Name = c("Stephanie", "DJ", "Michelle", "Danny", "Joey", "Jesse", "Jesse"),
Age = c(23,"", 20, 40, "", "", ""),
Color = c("Red", "", "Blue", "Green", "","",""), stringsAsFactors = FALSE)
new_id <- 1:nrow(df1)
df1 <- cbind(new_id, df1)
df2 <- data.frame(new_id = c(2,5,6,7),
ID = c(002, 005, 006, 005),
Name = c("DJ","Joey", "Jesse", "Jesse"),
Age = c("",42, 39, ""),
Color = c("", "Yellow","Black",""), stringsAsFactors = FALSE)

然后使用它作为索引来指定要替换的行:

#add rows back in
for (i in 1:nrow(df1)){
if (df1$new_id[i] %in% df2$new_id){
df1[i,] <- df2[which(df2$new_id == df1$new_id[i]),]
}
}
> df1
new_id ID      Name Age  Color
1      1  1 Stephanie  23    Red
2      2  2        DJ           
3      3  3  Michelle  20   Blue
4      4  4     Danny  40  Green
5      5  5      Joey  42 Yellow
6      6  6     Jesse  39  Black
7      7  5     Jesse 

或者,dplyr解决方案(使用相同的df(:

> df2 %>% union(df1) %>% distinct(ID, Name, .keep_all = T) %>% arrange(new_id)
new_id ID      Name Age  Color
1      1  1 Stephanie  23    Red
2      2  2        DJ           
3      3  3  Michelle  20   Blue
4      4  4     Danny  40  Green
5      5  5      Joey  42 Yellow
6      6  6     Jesse  39  Black
7      7  5     Jesse      

最新更新