R - Comparing the rows of one matrix against the rows of another matrix



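I have two matrices, one from an experiment (df1) and one used as a reference (df2). They hold semi-quantitative values from specimens, ranging from 1 to 50. For each row of df1 from the experiment, I want to check whether all of its values are TRUE, that is, identical to a row of the reference.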

df1:
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    6   14   32   38   40   48
[2,]    1   12   17   20   36   47
[3,]    7   15   29   33   40   42
[4,]    7   13   28   33   35   48
[5,]    1    2   13   36   38   41
[6,]   12   20   37   38   41   48
[7,]   13   14   28   34   36   43
...more rows
df2:
      [,1] [,2] [,3] [,4] [,5] [,6]
 [1,]    5   12   14   15   24   32
 [2,]    4    5   13   22   34   47
 [3,]    1   14   24   29   34   36
 [4,]    7   13   28   33   35   48
 [5,]   13   14   28   34   36   43
 [6,]    4   10   13   17   29   30
 [7,]    4   15   22   30   36   43
 [8,]    1   11   18   36   41   48
 [9,]   14   17   18   24   43   47
[10,]   13   24   32   34   41   47
...more rows
desired output:
V1   V2   V3   V4   V5   V6   V7
 7   13   28   33   35   48   TRUE
13   14   28   34   36   43   TRUE

How can I compare every row of one matrix against another matrix and pick out all the rows that are TRUE? Thanks.

An alternative approach using for(), which() and %in%:

# These matrices are random, so the matched rows differ between runs. With
# 12 rows of 5 binary values there is usually more than one match.
# Run again if not.
data1 <- matrix(sample(c(0,1), 60, replace = TRUE), ncol = 5)
data2 <- matrix(sample(c(0,1), 60, replace = TRUE), ncol = 5)

# You can use some 'helper' character string vectors, one string per row.
# The "," separator keeps multi-digit values unambiguous (1,12 vs 11,2).
data1.str <- apply(data1, 1, paste, collapse = ",")
data2.str <- apply(data2, 1, paste, collapse = ",")
data.match <- c()
# Loop over the reference rows, so the two matrices may differ in row count
for(i in seq_along(data2.str)){
  data.match <- append(data.match, which(data1.str %in% data2.str[i]))
}
# Duplicated reference rows would otherwise report the same row twice
data.match <- sort(unique(data.match))
# Gives your matched rows already
data1[data.match, , drop = FALSE]
# For completeness to give desired output:
matched <- as.data.frame(data1)
matched$data.match <- rep(FALSE, nrow(matched))
matched$data.match[data.match] <- TRUE
matched[which(matched$data.match == TRUE),]
   V1 V2 V3 V4 V5 data.match
4   1  1  0  0  1       TRUE
6   0  1  1  1  1       TRUE
7   1  1  0  0  0       TRUE
9   0  0  0  0  0       TRUE
10  0  1  0  0  1       TRUE
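
Since %in% is already vectorised, the same row-key idea can probably be written without the loop. The following is only a minimal sketch: the helper name match_rows is hypothetical (it is not part of the answer above), and small fixed matrices replace the random ones so the result is reproducible.

```r
# Hypothetical helper (not from the original answer): flag the rows of `m`
# that also occur, value for value, as a row of `ref`.
match_rows <- function(m, ref) {
  # One key per row; the "," separator keeps multi-digit values distinct
  m.str   <- apply(m,   1, paste, collapse = ",")
  ref.str <- apply(ref, 1, paste, collapse = ",")
  m.str %in% ref.str
}

# Small fixed matrices (illustrative values, not the asker's data)
data1 <- matrix(c(1, 1, 0,
                  0, 1, 1,
                  0, 0, 0), ncol = 3, byrow = TRUE)
data2 <- matrix(c(0, 0, 0,
                  1, 0, 1,
                  1, 1, 0,
                  1, 1, 0), ncol = 3, byrow = TRUE)

is.match <- match_rows(data1, data2)

# Attach the flag as a column and keep only the matching rows
matched <- as.data.frame(data1)
matched$data.match <- is.match
matched[matched$data.match, ]
```

Because the result is a logical vector with one entry per row of the first matrix, a reference row that appears more than once (as the last two rows of data2 do here) does not produce duplicate hits.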

Here is one way to do it:

x <- matrix(1:4, nrow=2)
#      [,1] [,2]
# [1,]    1    3
# [2,]    2    4
y <- matrix(c(1,2,5,4), nrow=2)
#      [,1] [,2]
# [1,]    1    5
# [2,]    2    4
do.call(paste, as.data.frame(x)) %in% do.call(paste, as.data.frame(y))
# [1] FALSE  TRUE

My guess is that this should be faster than doing an inner_join on all columns.
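
Applied to the matrices from the question, the one-liner might be used as follows. This is a sketch under assumptions: only the rows actually printed in the question are included (the "...more rows" are left out), and the names in.ref and result are illustrative.

```r
# Rows copied from the question (only the rows shown there)
df1 <- matrix(c( 6, 14, 32, 38, 40, 48,
                 1, 12, 17, 20, 36, 47,
                 7, 15, 29, 33, 40, 42,
                 7, 13, 28, 33, 35, 48,
                 1,  2, 13, 36, 38, 41,
                12, 20, 37, 38, 41, 48,
                13, 14, 28, 34, 36, 43), ncol = 6, byrow = TRUE)
df2 <- matrix(c( 5, 12, 14, 15, 24, 32,
                 4,  5, 13, 22, 34, 47,
                 1, 14, 24, 29, 34, 36,
                 7, 13, 28, 33, 35, 48,
                13, 14, 28, 34, 36, 43,
                 4, 10, 13, 17, 29, 30,
                 4, 15, 22, 30, 36, 43,
                 1, 11, 18, 36, 41, 48,
                14, 17, 18, 24, 43, 47,
                13, 24, 32, 34, 41, 47), ncol = 6, byrow = TRUE)

# One space-separated key per row, then test membership in the reference keys
in.ref <- do.call(paste, as.data.frame(df1)) %in%
          do.call(paste, as.data.frame(df2))

# Keep the matching rows and add the TRUE column from the desired output
result <- as.data.frame(df1[in.ref, , drop = FALSE])
result$V7 <- rep(TRUE, nrow(result))
result
```

With these rows, the fourth and seventh rows of df1 are the ones found in df2, which corresponds to the two rows in the desired output. paste() separates values with a space by default, so multi-digit values such as 1 and 12 cannot run together into an ambiguous key.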
