I have a data frame that looks like this:
S1 S2 S3 S4 S5 S6 S7 S8
15 15 15 15 15 15 15 15
3 15 15 15 7 15 15 15
15 2 1 15 9 15 15 8
15 15 15 15 15 15 15 1
15 15 1 15 15 15 15 15
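I want to count the frequency of each value in every column, given that the values lie in the known range 1:15. Then I want to reshape the result into a data frame with the header (Name, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15), in the following format: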
Name 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
S1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 4
S2 0 1 0 0 0 0 0 0 0 0 0 0 0 0 4
S3 2 0 0 0 0 0 0 0 0 0 0 0 0 0 3
S4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5
Can you help me? Thanks.
A quick base R option using table + stack (note that only the values that actually occur in the data get a column):
> t(table(stack(df)))
values
ind 1 2 3 7 8 9 15
S1 0 0 1 0 0 0 4
S2 0 1 0 0 0 0 4
S3 2 0 0 0 0 0 3
S4 0 0 0 0 0 0 5
S5 0 0 0 1 0 1 3
S6 0 0 0 0 0 0 5
S7 0 0 0 0 0 0 5
S8 1 0 0 0 1 0 3
Data used here:
> dput(df)
structure(list(S1 = c(15L, 3L, 15L, 15L, 15L), S2 = c(15L, 15L,
2L, 15L, 15L), S3 = c(15L, 15L, 1L, 15L, 1L), S4 = c(15L, 15L,
15L, 15L, 15L), S5 = c(15L, 7L, 9L, 15L, 15L), S6 = c(15L, 15L,
15L, 15L, 15L), S7 = c(15L, 15L, 15L, 15L, 15L), S8 = c(15L,
15L, 8L, 1L, 15L)), class = "data.frame", row.names = c(NA, -5L
))
Or, to get a column for every value from 1 to max(df) (here 15), fix the factor levels:
> t(sapply(df, function(x) table(factor(x, levels = seq(max(df))))))
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
S1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 4
S2 0 1 0 0 0 0 0 0 0 0 0 0 0 0 4
S3 2 0 0 0 0 0 0 0 0 0 0 0 0 0 3
S4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5
S5 0 0 0 0 0 0 1 0 1 0 0 0 0 0 3
S6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5
S7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5
S8 1 0 0 0 0 0 0 1 0 0 0 0 0 0 3
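The question asks for a data frame with a Name column rather than a matrix with row names. A minimal sketch of that last step, assuming the df from the dput() output in this answer and a fixed range of 1:15; the names counts and res are only illustrative:

```r
# Example data, copied from the dput() output in this answer
df <- data.frame(
  S1 = c(15L, 3L, 15L, 15L, 15L),
  S2 = c(15L, 15L, 2L, 15L, 15L),
  S3 = c(15L, 15L, 1L, 15L, 1L),
  S4 = c(15L, 15L, 15L, 15L, 15L),
  S5 = c(15L, 7L, 9L, 15L, 15L),
  S6 = c(15L, 15L, 15L, 15L, 15L),
  S7 = c(15L, 15L, 15L, 15L, 15L),
  S8 = c(15L, 15L, 8L, 1L, 15L)
)

# Count each value 1..15 per column; fixed factor levels keep the zero counts
counts <- t(sapply(df, function(x) table(factor(x, levels = 1:15))))

# Convert the matrix to a data frame; column names stay "1".."15"
res <- as.data.frame(counts)

# Move the row names (S1..S8) into a regular Name column
res <- cbind(Name = rownames(counts), res)
rownames(res) <- NULL

print(res)
```

Using 1:15 instead of seq(max(df)) keeps the header stable even if the largest value happens to be absent from a particular data set.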
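Using tidyverse:
library(dplyr)
library(tidyr)
df %>%
  # long format: one row per (column name, value) pair
  pivot_longer(cols = everything()) %>%
  # one column per observed value; values_fn = length counts the occurrences
  pivot_wider(names_from = value, values_from = value,
              values_fill = 0, values_fn = length)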
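The pivot_wider call above only creates columns for values that occur in the data, in order of first appearance. If every value from 1 to 15 should get a column, one possible variant is to convert the values to a factor with fixed levels before counting. This is a sketch, assuming dplyr >= 0.8 for the .drop argument of count(); the name res is only illustrative:

```r
library(dplyr)
library(tidyr)

# Example data, copied from the dput() output in the base R answer
df <- data.frame(
  S1 = c(15L, 3L, 15L, 15L, 15L),
  S2 = c(15L, 15L, 2L, 15L, 15L),
  S3 = c(15L, 15L, 1L, 15L, 1L),
  S4 = c(15L, 15L, 15L, 15L, 15L),
  S5 = c(15L, 7L, 9L, 15L, 15L),
  S6 = c(15L, 15L, 15L, 15L, 15L),
  S7 = c(15L, 15L, 15L, 15L, 15L),
  S8 = c(15L, 15L, 8L, 1L, 15L)
)

res <- df %>%
  # long format with the original column names in a Name column
  pivot_longer(cols = everything(), names_to = "Name") %>%
  # fixed levels so that values which never occur are still known
  mutate(value = factor(value, levels = 1:15)) %>%
  # .drop = FALSE keeps the empty factor levels as zero counts
  count(Name, value, .drop = FALSE) %>%
  # one column per level, in level order 1..15
  pivot_wider(names_from = value, values_from = n)

print(res)
```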