Convert a data frame to a binary data frame in R



I have a data frame such as:

Groups Names
G1     SP1
G1     SP2
G1     SP3
G2     SP1
G2     SP4
G3     SP2
G3     SP1 

I would like to convert it to:

Names G1 G2 G3
SP1   1  1  1
SP2   1  0  1  
SP3   1  0  0 
SP4   0  1  0

where each column is one of the Groups, and each cell is 1 = present or 0 = absent.

Here is the data in dput format:

df <- structure(list(Groups = c("G1", "G1", "G1", "G2", "G2", "G3", 
"G3"), Names = c("SP1", "SP2", "SP3", "SP1", "SP4", "SP2", "SP1"
)), class = "data.frame", row.names = c(NA, -7L))

Using table:

table(df$Names, df$Groups)

      G1 G2 G3
  SP1  1  1  1
  SP2  1  0  1
  SP3  1  0  0
  SP4  0  1  0

Expanding my comment into an answer.

This is called a contingency table, and it can be computed in several ways without any fancy packages.

dat <- structure(list(Groups = c("G1", "G1", "G1", "G2", "G2", "G3", 
"G3"), Names = c("SP1", "SP2", "SP3", "SP1", "SP4", "SP2", "SP1"
)), class = "data.frame", row.names = c(NA, -7L))
mat1 <- with(dat, table(Names, Groups))
#     Groups
#Names G1 G2 G3
#  SP1  1  1  1
#  SP2  1  0  1
#  SP3  1  0  0
#  SP4  0  1  0
mat2 <- xtabs(~ Names + Groups, dat)
#     Groups
#Names G1 G2 G3
#  SP1  1  1  1
#  SP2  1  0  1
#  SP3  1  0  0
#  SP4  0  1  0

Such tables are matrices (with class "table"). If you want a data frame, coerce them using:

data.frame(unclass(mat1))
#    G1 G2 G3
#SP1  1  1  1
#SP2  1  0  1
#SP3  1  0  0
#SP4  0  1  0
data.frame(unclass(mat2))
#    G1 G2 G3
#SP1  1  1  1
#SP2  1  0  1
#SP3  1  0  0
#SP4  0  1  0
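
The output requested in the question has Names as an ordinary column rather than as row names. A minimal sketch of one way to get there, assuming the same data as above; the name res is introduced here purely for illustration:

```r
dat <- data.frame(
  Groups = c("G1", "G1", "G1", "G2", "G2", "G3", "G3"),
  Names  = c("SP1", "SP2", "SP3", "SP1", "SP4", "SP2", "SP1")
)

# Contingency table with Names as rows and Groups as columns
mat1 <- with(dat, table(Names, Groups))

# as.data.frame.matrix() keeps the wide layout; plain as.data.frame()
# on a table would return the long (Names, Groups, Freq) form instead
res <- as.data.frame.matrix(mat1)

# Move the row names into a regular column called Names
res <- cbind(Names = rownames(res), res)
rownames(res) <- NULL
res
```

This gives the same values as data.frame(unclass(mat1)), only with the names stored in a column.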

Remark:

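In your case, the data frame should not contain duplicated rows; otherwise the contingency table will hold counts larger than 1 instead of only 0 and 1. In that sense, computing a contingency table is actually overkill. An algorithmically simpler approach (although it takes more lines of code) is: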

m1 <- unique(dat$Names)
m2 <- unique(dat$Groups)
mat <- matrix(0, length(m1), length(m2), dimnames = list(m1, m2))
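# index by a two-column character matrix of (row name, column name) pairs;
# this assumes Names and Groups are character vectors, not factors
mat[with(dat, cbind(Names, Groups))] <- 1
mat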
#    G1 G2 G3
#SP1  1  1  1
#SP2  1  0  1
#SP3  1  0  0
#SP4  0  1  0
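
The remark about duplicated rows can be checked directly. A minimal sketch, assuming a hypothetical data frame dup in which one (Groups, Names) pair is repeated; the two fixes shown are options, not the only ways to get back to 0/1:

```r
dup <- data.frame(
  Groups = c("G1", "G1", "G2"),
  Names  = c("SP1", "SP1", "SP2")   # the pair (G1, SP1) appears twice
)

# The plain contingency table counts occurrences
counts <- with(dup, table(Names, Groups))
counts["SP1", "G1"]   # the count here is 2, not 1

# Option 1: drop duplicated rows before tabulating
bin1 <- with(unique(dup), table(Names, Groups))

# Option 2: turn the counts into presence/absence
bin2 <- (counts > 0) + 0L
```

The matrix-assignment approach above is not affected by duplicates, since assigning 1 to the same cell twice still leaves 1.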

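You can call table directly on df. Because Groups is the first column, either transpose the result or reverse the column order so that Names end up as the rows: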

> t(table(df))
     Groups
Names G1 G2 G3
  SP1  1  1  1
  SP2  1  0  1
  SP3  1  0  0
  SP4  0  1  0

> table(rev(df))
     Groups
Names G1 G2 G3
  SP1  1  1  1
  SP2  1  0  1
  SP3  1  0  0
  SP4  0  1  0
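
Both calls work because table cross-tabulates the columns of a data frame in the order they appear. A small check, assuming df holds the data from the question, that the two forms agree with each other and with the explicit two-argument call (the names a, b and c2 are illustrative only):

```r
df <- data.frame(
  Groups = c("G1", "G1", "G1", "G2", "G2", "G3", "G3"),
  Names  = c("SP1", "SP2", "SP3", "SP1", "SP4", "SP2", "SP1")
)

a  <- t(table(df))                 # tabulate Groups x Names, then transpose
b  <- table(rev(df))               # reverse the columns first: Names x Groups
c2 <- table(df$Names, df$Groups)   # explicit row and column variables

# All three hold the same cell values
all(a == b)
all(a == c2)
```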
