I have a data frame such as:
Groups Names
G1 SP1
G1 SP2
G1 SP3
G2 SP1
G2 SP4
G3 SP2
G3 SP1
I would like to convert it to:
Names G1 G2 G3
SP1 1 1 1
SP2 1 0 1
SP3 1 0 0
SP4 0 1 0
where the columns are the Groups and each cell is 1 = present or 0 = absent.
Input data:
structure(list(Groups = c("G1", "G1", "G1", "G2", "G2", "G3",
"G3"), Names = c("SP1", "SP2", "SP3", "SP1", "SP4", "SP2", "SP1"
)), class = "data.frame", row.names = c(NA, -7L))
Use table:
table(df$Names, df$Groups)
G1 G2 G3
SP1 1 1 1
SP2 1 0 1
SP3 1 0 0
SP4 0 1 0
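Expanding my comment into an answer.
This is called a contingency table, and it can be computed in several ways in base R, without any fancy packages.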
dat <- structure(list(Groups = c("G1", "G1", "G1", "G2", "G2", "G3",
"G3"), Names = c("SP1", "SP2", "SP3", "SP1", "SP4", "SP2", "SP1"
)), class = "data.frame", row.names = c(NA, -7L))
mat1 <- with(dat, table(Names, Groups))
# Groups
#Names G1 G2 G3
# SP1 1 1 1
# SP2 1 0 1
# SP3 1 0 0
# SP4 0 1 0
mat2 <- xtabs(~ Names + Groups, dat)
# Groups
#Names G1 G2 G3
# SP1 1 1 1
# SP2 1 0 1
# SP3 1 0 0
# SP4 0 1 0
Such tables are matrices (with class "table"). If you want a data frame, coerce them with:
data.frame(unclass(mat1))
# G1 G2 G3
#SP1 1 1 1
#SP2 1 0 1
#SP3 1 0 0
#SP4 0 1 0
data.frame(unclass(mat2))
# G1 G2 G3
#SP1 1 1 1
#SP2 1 0 1
#SP3 1 0 0
#SP4 0 1 0
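If the result should also carry the Names column shown in the question's desired output, one possible sketch follows. It rebuilds the example data so that it runs on its own; `res` is a name introduced here purely for illustration, and `as.data.frame.matrix()` is used as an alternative to `data.frame(unclass(...))`, not as the approach of the answer above.

```r
# Rebuild the example data from the question.
dat <- data.frame(
  Groups = c("G1", "G1", "G1", "G2", "G2", "G3", "G3"),
  Names  = c("SP1", "SP2", "SP3", "SP1", "SP4", "SP2", "SP1"),
  stringsAsFactors = FALSE
)

mat1 <- with(dat, table(Names, Groups))

# as.data.frame.matrix() keeps the wide layout; plain as.data.frame()
# on a table would return the long (Names, Groups, Freq) form instead.
res <- as.data.frame.matrix(mat1)

# Move the row names into a regular "Names" column, as in the desired output.
res <- cbind(Names = rownames(res), res)
rownames(res) <- NULL
res
```

Keeping the species as row names, as the answer does, is often just as convenient; the extra column mainly matters when the result is exported or joined with other tables.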
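Remark:
In your case the data frame should contain no duplicated rows; otherwise the contingency table will hold counts greater than 1 rather than only 0 and 1. In that sense, computing a contingency table is actually overkill. An algorithmically simpler approach (although it takes more lines of code) is: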
m1 <- unique(dat$Names)
m2 <- unique(dat$Groups)
mat <- matrix(0, length(m1), length(m2), dimnames = list(m1, m2))
mat[with(dat, cbind(Names, Groups))] <- 1
# G1 G2 G3
#SP1 1 1 1
#SP2 1 0 1
#SP3 1 0 0
#SP4 0 1 0
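To illustrate the remark about duplicated rows, here is a small sketch. `dup` is a hypothetical copy of the data with one row repeated, and `counts`, `pa1` and `pa2` are names made up for this example. Thresholding the counts, or dropping duplicates first, are two possible ways to get back to a strict presence/absence table.

```r
# Rebuild the example data from the question.
dat <- data.frame(
  Groups = c("G1", "G1", "G1", "G2", "G2", "G3", "G3"),
  Names  = c("SP1", "SP2", "SP3", "SP1", "SP4", "SP2", "SP1"),
  stringsAsFactors = FALSE
)

# Hypothetical data in which the first row (G1, SP1) appears twice.
dup <- rbind(dat, dat[1, ])

counts <- with(dup, table(Names, Groups))
counts["SP1", "G1"]   # 2, so the table is no longer a 0/1 indicator

# Option 1: threshold the counts; unary + turns TRUE/FALSE back into 1/0.
pa1 <- +(counts > 0)

# Option 2: drop duplicated rows before tabulating.
pa2 <- with(unique(dup), table(Names, Groups))
```

The matrix-assignment approach above is not affected by duplicates, since assigning 1 to the same cell twice still leaves a 1.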
You can call table directly on df:
> t(table(df))
Groups
Names G1 G2 G3
SP1 1 1 1
SP2 1 0 1
SP3 1 0 0
SP4 0 1 0
or
> table(rev(df))
Groups
Names G1 G2 G3
SP1 1 1 1
SP2 1 0 1
SP3 1 0 0
SP4 0 1 0
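Both calls above rely on df having exactly the two columns Groups and Names, in that order. As a hedged variant for data frames that may hold more columns or a different order, the columns can be selected by name; `tab` is a name introduced here for illustration.

```r
# Rebuild the example data from the question.
df <- data.frame(
  Groups = c("G1", "G1", "G1", "G2", "G2", "G3", "G3"),
  Names  = c("SP1", "SP2", "SP3", "SP1", "SP4", "SP2", "SP1"),
  stringsAsFactors = FALSE
)

# Select the columns by name so the result does not depend on
# their position in df: rows are Names, columns are Groups.
tab <- table(df[c("Names", "Groups")])
tab
```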