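I have two data frames, DF1 and DF2. I want to exclude the rows of DF1 that match a row in DF2, so that the result looks like DF3. The match should be conditional: if Room number in DF2 is "All Rooms", the row should be matched to DF1 on the Code, Description (spelled Desciption in the data) and Company columns; if Room number is not "All Rooms", it should be matched on Code, Description, Company and Room number.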
Code=c("A","B","C","E","D")
Desciption=c("Color is not Good","Odour is not good","Astetic Issue","Odour is not good","Lighting issue")
Company=c("Asian Paints","Burger","Asian Paints","Burger","Burger")
`Room number`=c("Room_1","Room_1","Room_2","Room_3","Room_2")
Rating=c("2","3","5","4","3")
DF1=data.frame(Code,Desciption,Company,`Room number`,Rating)
> DF1
  Code        Desciption      Company Room.number Rating
1    A Color is not Good Asian Paints      Room_1      2
2    B Odour is not good       Burger      Room_1      3
3    C     Astetic Issue Asian Paints      Room_2      5
4    E Odour is not good       Burger      Room_3      4
5    D    Lighting issue       Burger      Room_2      3
Code=c("A","B")
Desciption=c("Color is not Good","Odour is not good")
Company=c("Asian Paints","Burger")
`Room number`=c("Room_1","All Rooms")
DF2=data.frame(Code,Desciption,Company,`Room number`)
> DF2
  Code        Desciption      Company Room.number
1    A Color is not Good Asian Paints      Room_1
2    B Odour is not good       Burger   All Rooms
Code=c("C","D")
Desciption=c("Astetic Issue","Lighting issue")
Company=c("Asian Paints","Burger")
`Room number`=c("Room_2","Room_2")
Rating=c("5","3")
DF3=data.frame(Code,Desciption,Company,`Room number`,Rating)
> DF3
  Code     Desciption      Company Room.number Rating
1    C  Astetic Issue Asian Paints      Room_2      5
2    D Lighting issue       Burger      Room_2      3
Can anyone help me solve this?
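You mentioned:
"Additionally, I want to pass a condition: if my Room number is All Rooms, then I should be able to match the columns Code, Description and Company from DF2 to DF1 ..."
It is not clear whether, in this specific case (All Rooms), you want to check the description and company against all codes in DF1. If so, the following syntax will do.
Otherwise, if every possible combination of all columns (i.e. code, description and company) has to be checked in DF1, the filtered result would have 0 rows. Please recheck your logic and modify the question accordingly.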
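library(dplyr)  # anti_join(), filter(), mutate(), %>%
library(tidyr)  # unnest_longer()
# Step 1: drop rows of DF1 that match DF2 on all four columns
DF1 %>% anti_join(DF2, by = c("Code", "Desciption", "Company", "Room.number")) %>%
  # Step 2: expand each "All Rooms" row of DF2 to every code in DF1, then drop the matches
  anti_join(DF2 %>% filter(Room.number == "All Rooms") %>%
              mutate(Code = list(unique(DF1$Code))) %>%
              unnest_longer(Code),
            by = c("Code", "Desciption", "Company"))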
  Code     Desciption      Company Room.number Rating
1    C  Astetic Issue Asian Paints      Room_2      5
2    D Lighting issue       Burger      Room_2      3
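Because the "All Rooms" row is expanded to every code that occurs in DF1, the second anti_join in effect ignores Code and matches on Desciption and Company only. Under that reading, the same result can probably be obtained without the expansion step. The following is a minimal sketch, not a definitive implementation: the name res is only illustrative, the data is rebuilt from the question, and it assumes character columns (the default in R >= 4.0).

```r
library(dplyr)

# Example data copied from the question (the column is named Room.number directly)
DF1 <- data.frame(
  Code        = c("A", "B", "C", "E", "D"),
  Desciption  = c("Color is not Good", "Odour is not good", "Astetic Issue",
                  "Odour is not good", "Lighting issue"),
  Company     = c("Asian Paints", "Burger", "Asian Paints", "Burger", "Burger"),
  Room.number = c("Room_1", "Room_1", "Room_2", "Room_3", "Room_2"),
  Rating      = c("2", "3", "5", "4", "3")
)
DF2 <- data.frame(
  Code        = c("A", "B"),
  Desciption  = c("Color is not Good", "Odour is not good"),
  Company     = c("Asian Paints", "Burger"),
  Room.number = c("Room_1", "All Rooms")
)

res <- DF1 %>%
  # rows of DF2 with a specific room: match on all four columns
  anti_join(filter(DF2, Room.number != "All Rooms"),
            by = c("Code", "Desciption", "Company", "Room.number")) %>%
  # "All Rooms" rows: assumed to match on Desciption and Company only
  anti_join(filter(DF2, Room.number == "All Rooms"),
            by = c("Desciption", "Company"))

res
```

If Code should also be matched for "All Rooms" rows, as the wording of the question suggests, add "Code" to the second by vector; row E would then be kept, which differs from the expected DF3.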
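Here is a base R vectorized approach for filtering out rows that match multiple conditions. It creates logical indices and then subsets DF1 based on them. The only difference between DF3b and the expected result DF3 is in the row names, so I reset them to consecutive values.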
i_all_rooms <- DF1[["Room.number"]] == "All Rooms"
i1 <- !DF1[["Code"]] %in% DF2[["Code"]]
i2 <- !DF1[["Desciption"]] %in% DF2[["Desciption"]]
i3 <- !DF1[["Company"]] %in% DF2[["Company"]]
i4 <- !DF1[["Room.number"]] %in% DF2[["Room.number"]]
j1 <- i_all_rooms & i1 & (i2 | i3)
j2 <- !i_all_rooms & i1 & (i2 | i3) & i4
DF3b <- DF1[j1 | j2, ]
row.names(DF3b) <- NULL
identical(DF3, DF3b)
#[1] TRUE
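The indices above test each column independently, and i_all_rooms looks at DF1 rather than DF2, so with other data a row could be kept or dropped based on values that come from different rows of DF2. A row-wise variant in base R might look like the sketch below. It rests on stated assumptions: make_key is a hypothetical helper introduced here, values are assumed not to contain a tab character, and "All Rooms" rows are matched on Desciption and Company only, which is what the expected DF3 requires (row E is dropped although its Code is not in DF2).

```r
# Example data copied from the question (character columns, as in R >= 4.0)
DF1 <- data.frame(
  Code        = c("A", "B", "C", "E", "D"),
  Desciption  = c("Color is not Good", "Odour is not good", "Astetic Issue",
                  "Odour is not good", "Lighting issue"),
  Company     = c("Asian Paints", "Burger", "Asian Paints", "Burger", "Burger"),
  Room.number = c("Room_1", "Room_1", "Room_2", "Room_3", "Room_2"),
  Rating      = c("2", "3", "5", "4", "3")
)
DF2 <- data.frame(
  Code        = c("A", "B"),
  Desciption  = c("Color is not Good", "Odour is not good"),
  Company     = c("Asian Paints", "Burger"),
  Room.number = c("Room_1", "All Rooms")
)

# Hypothetical helper: one key string per row, built from the chosen columns
make_key <- function(df, cols) do.call(paste, c(df[cols], sep = "\t"))

is_all     <- DF2$Room.number == "All Rooms"   # tested on DF2, not DF1
exact_cols <- c("Code", "Desciption", "Company", "Room.number")
loose_cols <- c("Desciption", "Company")       # assumption: Code ignored for "All Rooms"

# TRUE where a DF1 row matches a whole DF2 row on the relevant columns
hit_exact <- make_key(DF1, exact_cols) %in% make_key(DF2[!is_all, ], exact_cols)
hit_loose <- make_key(DF1, loose_cols) %in% make_key(DF2[is_all, ], loose_cols)

DF3c <- DF1[!(hit_exact | hit_loose), ]
row.names(DF3c) <- NULL
DF3c
```

Building one key per row keeps the comparison tied to a single DF2 row at a time, at the cost of pasting the columns into strings.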