Comparing elements of two strings in R for conditional selection



I have the following dataset with three variables:

X<-c(0.1,0.3,0.3,0.4,0.8,0.8,1.1,1.2,1.3,1.6,2.1,2.2,2.3,2.4,2.6,2.8,3.1,3.3,3.4,4.1,4.4,4.4,4.5,5.0,5.1,5.2,5.3,5.4,5.4,5.7,6.2,6.5,6.6,6.7,6.7,7.0,7.4,7.5,7.8,7.8,8.6,9.5,9.8,11.1,11.9)
Y<-c("ac","bcd","ac","ab","ab","d","ab","ab","cd","bcd","d","ad","ad","d","ad","ad","ad","ab","ad","a", "ad","ac","a", "bcd", "ac","d", "ac","ac","bcd","ab", "ab","ab","cd","ac","ad","ab","d","d", "ab","d", "d", "bcd","a", "a","d")
Z<-c("ac","bcd", "ab","ac","ab","cd","ac","ac","bcd" ,"cd","bcd" ,"ac","ac","bcd","ab","bcd", "bcd", "a", "ab","ab","cd","a", "ac","ac","bcd" ,"ad","bcd", "bcd" ,"ab","bcd",
"bcd", "bcd", "ac","cd","a", "cd","ac","ac","cd","ab","ab","a", "bcd", "cd","a")
df<-data.frame(X,Y,Z)
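  1. The first step is to determine whether the same letter(s) occur in both Y and Z.
  2. The second step could use ifelse to select the minimum X value among rows whose Y and Z share a letter, and the maximum X value among rows whose Y and Z share none.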

The final result should be:

6.7, 7.0

Using code from a previous answer, I think this should work:

library(stringr)
X<-c(0.1,0.3,0.3,0.4,0.8,0.8,1.1,1.2,1.3,1.6,2.1,2.2,2.3,2.4,2.6,2.8,3.1,3.3,3.4,4.1,4.4,4.4,4.5,5.0,5.1,5.2,5.3,5.4,5.4,5.7,6.2,6.5,6.6,6.7,6.7,7.0,7.4,7.5,7.8,7.8,8.6,9.5,9.8,11.1,11.9)
Y<-c("ac","bcd","ac","ab","ab","d","ab","ab","cd","bcd","d","ad","ad","d","ad","ad","ad","ab","ad","a", "ad","ac","a", "bcd", "ac","d", "ac","ac","bcd","ab", "ab","ab","cd","ac","ad","ab","d","d", "ab","d", "d", "bcd","a", "a","d")
Z<-c("ac","bcd", "ab","ac","ab","cd","ac","ac","bcd" ,"cd","bcd" ,"ac","ac","bcd","ab","bcd", "bcd", "a", "ab","ab","cd","a", "ac","ac","bcd" ,"ad","bcd", "bcd" ,"ab","bcd",
"bcd", "bcd", "ac","cd","a", "cd","ac","ac","cd","ab","ab","a", "bcd", "cd","a")
df<-data.frame(X,Y,Z)
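# Concatenate Y and Z so that a letter present in both appears twice in YZ
df$YZ <- paste0(df$Y, df$Z)
# unique is TRUE when no letter occurs more than once in YZ, i.e. Y and Z share no letter
df$unique <- !sapply(df$YZ, function(x) any(str_count(x, letters) > 1))
# Largest X among rows where Y and Z share a letter
print(max(df$X[!df$unique]))
# Smallest X among rows where Y and Z share no letter
print(min(df$X[df$unique]))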
[1] 6.7
[1] 7
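
The str_count check above relies on each letter appearing at most once within a single Y or Z value, which holds for this data. A minimal base-R sketch that compares the two strings directly, without that assumption, might look like the following. The helper name `shares_letter` is hypothetical and not part of the original answer.

```r
# Hypothetical helper: for each pair (y[i], z[i]), return TRUE when the two
# strings have at least one character in common.
shares_letter <- function(y, z) {
  mapply(function(a, b) {
    # Split both strings into single characters and look for an overlap
    length(intersect(strsplit(a, "")[[1]], strsplit(b, "")[[1]])) > 0
  }, as.character(y), as.character(z), USE.NAMES = FALSE)
}

# "ac"/"ab" share "a", "d"/"cd" share "d", "ab"/"cd" share nothing,
# and "aa"/"b" share nothing even though "a" is repeated inside one string
res <- shares_letter(c("ac", "d", "ab", "aa"), c("ab", "cd", "cd", "b"))
print(res)
```

With the data frame above, `df$unique <- !shares_letter(df$Y, df$Z)` could stand in for the str_count line if repeated letters within one value ever became possible.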

I think the minimum X value for a row with a letter in both Y and Z should be 0.1:

head(df)
    X   Y   Z     YZ unique
1 0.1  ac  ac   acac  FALSE
2 0.3 bcd bcd bcdbcd  FALSE
3 0.3  ac  ab   acab  FALSE
4 0.4  ab  ac   abac  FALSE
5 0.8  ab  ab   abab  FALSE
6 0.8   d  cd    dcd  FALSE

In the first row, Y = 'ac' and Z = 'ac', so 'a' is in both Y and Z.
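
To make the two readings concrete, here is a small sketch on four rows copied from the dataset (rows 1, 35, 36 and 45), which carry the smallest and largest X of each group. The names `rows`, `shared`, `reading_text` and `reading_expected` are illustrative assumptions, not from the original post.

```r
# Four rows copied from the dataset above (rows 1, 35, 36 and 45)
rows <- data.frame(
  X = c(0.1, 6.7, 7.0, 11.9),
  Y = c("ac", "ad", "ab", "d"),
  Z = c("ac", "a", "cd", "a"),
  stringsAsFactors = FALSE
)

# TRUE when the row's Y and Z have at least one letter in common
shared <- mapply(function(y, z) {
  any(strsplit(y, "")[[1]] %in% strsplit(z, "")[[1]])
}, rows$Y, rows$Z, USE.NAMES = FALSE)

# Reading 1 (the wording of the question):
# minimum X with a shared letter, maximum X without one
reading_text <- c(min(rows$X[shared]), max(rows$X[!shared]))

# Reading 2 (the stated expected result):
# maximum X with a shared letter, minimum X without one
reading_expected <- c(max(rows$X[shared]), min(rows$X[!shared]))

print(reading_text)
print(reading_expected)
```

On these rows the first reading gives 0.1 and 11.9, while the second gives 6.7 and 7.0, which matches the result stated in the question.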
