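I need to check whether the number of elements for each unique value of the variable PPT in A equals the number of elements for each unique value of PPT in B, and whether any values occur only in A or only in B. For example: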
PPTa <- c("ppt0100109","ppt0301104","ppt0100109","ppt0100109","ppt0300249","ppt0100109","ppt0300249","ppt0100109","ppt0504409","ppt2303401","ppt0704210","ppt0704210","ppt0100109")
CNa <- c(110,54,110,110,49,10,49,110,409,40,10,10,110)
LLa <- c(150,55,150,150,45,15,45,115,405,45,5,15,50)
A <- data.frame(PPTa, CNa, LLa)
PPTb <- c("ppt0100200","ppt0300249","ppt0100109","ppt0300249","ppt0100109","ppt0764091","ppt2303401","ppt0704210","ppt0704210","ppt0100109")
CNb <- c(110,54,110,110,49,10,49,110,409,40)
LLb <- c(150,55,150,150,45,15,45,115,405,45)
B <- data.frame(PPTb, CNb, LLb)
In this case, we have these unique values, each occurring the following number of times:
A$PPTa TIMES
"ppt0100109" 6
"ppt0301104" 1
"ppt0300249" 2
"ppt0504409" 1
"ppt2303401" 1
"ppt0704210" 2
B$PPTb TIMES
"ppt0100200" 1
"ppt0300249" 2
"ppt0100109" 3
"ppt0764091" 1
"ppt2303401" 1
"ppt0704210" 2
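I want to create a new matrix (or anything else you can suggest) with the value 0 if a unique value is present in both A and B with the same number of elements, 1 if it is present in both data frames but the number of elements differs, and 2 if the value is present in only one of the two data frames. Something like: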
A$PPTa TIMES OUTPUT
"ppt0100109" 6 1
"ppt0301104" 1 2
"ppt0300249" 2 0
"ppt0504409" 1 2
"ppt2303401" 1 0
"ppt0704210" 2 0
B$PPTb TIMES OUTPUT
"ppt0100200" 1 2
"ppt0300249" 2 0
"ppt0100109" 3 1
"ppt0764091" 1 2
"ppt2303401" 1 0
"ppt0704210" 2 0
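You can use nested ifelse statements, applied to the per-value count tables (one row per unique value, sorted by value). The tables are built here with table(); tabA and tabB are names introduced for illustration:
tabA <- as.data.frame(table(A$PPTa), stringsAsFactors = FALSE)
tabB <- as.data.frame(table(B$PPTb), stringsAsFactors = FALSE)
# 0: same value and same count; 1: value in both, counts differ; 2: value in only one
ifelse(do.call(paste0, tabA) %in% do.call(paste0, tabB), 0, ifelse(tabA$Var1 %in% tabB$Var1, 1, 2))
#[1] 1 0 2 2 0 0
ifelse(do.call(paste0, tabB) %in% do.call(paste0, tabA), 0, ifelse(tabB$Var1 %in% tabA$Var1, 1, 2))
#[1] 1 2 0 0 2 0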
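If you would rather get the result as a table next to the counts, as in the desired output, a variant along the following lines might work. This is only a sketch: compare_counts is a hypothetical helper name (not from any package), and it looks up counts with match() instead of pasting columns together, so a value and its count cannot run into each other. Rows come out sorted by value, because that is how table() orders them.

```r
# Example data from the question (PPT columns only)
PPTa <- c("ppt0100109","ppt0301104","ppt0100109","ppt0100109","ppt0300249",
          "ppt0100109","ppt0300249","ppt0100109","ppt0504409","ppt2303401",
          "ppt0704210","ppt0704210","ppt0100109")
PPTb <- c("ppt0100200","ppt0300249","ppt0100109","ppt0300249","ppt0100109",
          "ppt0764091","ppt2303401","ppt0704210","ppt0704210","ppt0100109")

# Hypothetical helper: classify every unique value of x against y.
#   0 = present in both with equal counts
#   1 = present in both with different counts
#   2 = present only in x
compare_counts <- function(x, y) {
  cx <- table(x)                       # counts per unique value of x
  cy <- table(y)                       # counts per unique value of y
  vals <- names(cx)
  idx <- match(vals, names(cy))        # NA where the value is absent from y
  other <- as.integer(cy)[idx]         # matching count in y, NA if absent
  status <- ifelse(is.na(idx), 2L,
                   ifelse(as.integer(cx) == other, 0L, 1L))
  data.frame(PPT = vals, TIMES = as.integer(cx), OUTPUT = status,
             stringsAsFactors = FALSE)
}

resA <- compare_counts(PPTa, PPTb)     # A's values checked against B
resB <- compare_counts(PPTb, PPTa)     # B's values checked against A
print(resA)
print(resB)
```

With the example data, resA$OUTPUT is 1 0 2 2 0 0 and resB$OUTPUT is 1 2 0 0 2 0, the same as the nested ifelse results above.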