I have several data frames that look like this:
>df1
NAME
Josh
Sarah
Sammy
Jake
>df2
NAME
Josh
Sarah
Sammy
Mark
>df3
NAME
Josh
Michael
Mike
Adam
>df4
NAME
Josh
Michael
Mike
Adam
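I want to create a new data frame that holds, for every pair of these data frames, the number of names they have in common, like this: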
>df.final
    df1 df2 df3 df4
df1   4   3   1   1
df2   3   4   1   1
df3   1   1   4   4
df4   1   1   4   4
How can I do this? Essentially, I want to automate the intersect() and length() calls so that I do not have to type them out by hand for every pair.
#create the data
df1 <- data.frame(NAME=c("Josh", "Sarah", "Sammy", "Jake"))
df2 <- data.frame(NAME=c("Josh", "Sarah", "Sammy", "Mark"))
df3 <- data.frame(NAME=c("Josh", "Michael", "Mike", "Adam"))
df4 <- data.frame(NAME=c("Josh", "Michael", "Mike", "Adam"))
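# names of the data frames to compare
l <- c("df1", "df2", "df3", "df4")
names(l) <- l
# outer() hands FUN two expanded lists, so mapply() walks them pair by pair
result <- outer(mget(l), mget(l), function(x, y)
  mapply(function(x, y) length(intersect(x$NAME, y$NAME)), x, y))
result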
#>     df1 df2 df3 df4
#> df1   4   3   1   1
#> df2   3   4   1   1
#> df3   1   1   4   4
#> df4   1   1   4   4
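To make it easier to see what outer() is doing, the same table can probably be built with two nested sapply() calls over a named list. This is only a sketch of an equivalent approach, not the code from the answer above; the names dfs, count_shared and result2 are made up for illustration.

```r
# data from the question
df1 <- data.frame(NAME = c("Josh", "Sarah", "Sammy", "Jake"))
df2 <- data.frame(NAME = c("Josh", "Sarah", "Sammy", "Mark"))
df3 <- data.frame(NAME = c("Josh", "Michael", "Mike", "Adam"))
df4 <- data.frame(NAME = c("Josh", "Michael", "Mike", "Adam"))

# named list of the data frames (hypothetical name)
dfs <- list(df1 = df1, df2 = df2, df3 = df3, df4 = df4)

# hypothetical helper: number of NAME values two data frames share
count_shared <- function(a, b) length(intersect(a$NAME, b$NAME))

# the outer sapply() builds one column per data frame,
# the inner sapply() fills in one row per data frame
result2 <- sapply(dfs, function(y) sapply(dfs, function(x) count_shared(x, y)))
result2
```

Because the list is named, sapply() carries those names over to the rows and columns of the resulting matrix.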
Edit
Wrapping the function in Vectorize() also works:
result <- outer(mget(l), mget(l), Vectorize(
  function(x, y) length(intersect(x$NAME, y$NAME))))
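Note that outer() returns a matrix, while the question asks for a data frame. If a data frame is really needed, as.data.frame() should convert the result and keep the row and column names. A minimal sketch, assuming the data frames are collected in a named list (dfs is a made-up name) instead of being fetched with mget():

```r
# data from the question
df1 <- data.frame(NAME = c("Josh", "Sarah", "Sammy", "Jake"))
df2 <- data.frame(NAME = c("Josh", "Sarah", "Sammy", "Mark"))
df3 <- data.frame(NAME = c("Josh", "Michael", "Mike", "Adam"))
df4 <- data.frame(NAME = c("Josh", "Michael", "Mike", "Adam"))

# named list of the data frames (hypothetical name)
dfs <- list(df1 = df1, df2 = df2, df3 = df3, df4 = df4)

# pairwise intersection counts as a matrix
result <- outer(dfs, dfs, Vectorize(
  function(x, y) length(intersect(x$NAME, y$NAME))))

# convert the matrix to a data frame, as requested in the question
df.final <- as.data.frame(result)
df.final
```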