How do I count observations under specific conditions in R?



I have a dataset like this:

data <- data.frame(ID = c(1,1,1,1,1,2,2,2,2),
                   year = c(1,2,3,4,5,1,2,3,4),
                   score = c(0.89943475,-3.51761975,1.54511640,-1.38284380,2.45591240,-1.89925250,0.83935451,-0.61843636,-0.70421765))

ID, year, score
1, 1, 0.89943475
1, 2, -3.51761975
1, 3, 1.54511640
1, 4, -1.38284380
1, 5, 2.45591240
2, 1, -1.89925250
2, 2, 0.83935451
2, 3, -0.61843636
2, 4, -0.70421765

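I want to create a data table that summarizes the data above by counting, for each ID, the number of observations where score is positive or negative, like this: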

ID, pos, neg, total
1,   3,   2,     5
2,   1,   3,     4

Can this be done with data.table in R?

An alternative to akrun's answer:

data[, .(pos = sum(score >= 0), neg = sum(score < 0), total = .N), by = ID]
#       ID   pos   neg total
#    <num> <int> <int> <int>
# 1:     1     3     2     5
# 2:     2     1     3     4

Data

library(data.table)
data <- setDT(structure(list(
  ID = c(1, 1, 1, 1, 1, 2, 2, 2, 2),
  year = c(1, 2, 3, 4, 5, 1, 2, 3, 4),
  score = c(0.89943475, -3.51761975, 1.5451164, -1.3828438, 2.4559124,
            -1.8992525, 0.83935451, -0.61843636, -0.70421765)),
  class = c("data.table", "data.frame"), row.names = c(NA, -9L)))

We can use dcast with sign:

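library(data.table)
# Cast to wide format, counting rows per ID for each sign of score (-1 or 1),
# then add a total column as the row sum of every column except ID.
dcast(setDT(data), ID ~ sign(score), value.var = "score", fun.aggregate = length)[,
  total := rowSums(.SD), .SDcols = -1][]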

Output:

   ID -1 1 total
1:  1  2 3     5
2:  2  3 1     4
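
The columns above are named -1 and 1 rather than the pos and neg asked for in the question. One possible way to get the requested names is to label each row before casting. The following is only a sketch: the column name `sgn` and the result name `res` are made up for illustration, and zero is treated as positive to match the `score >= 0` convention in the other answer. It assumes a data.table version that provides `fifelse`.

```r
library(data.table)

# Same ID and score values as in the question (year is not needed here).
data <- data.table(
  ID    = c(1, 1, 1, 1, 1, 2, 2, 2, 2),
  score = c(0.89943475, -3.51761975, 1.54511640, -1.38284380, 2.45591240,
            -1.89925250, 0.83935451, -0.61843636, -0.70421765)
)

# Hypothetical helper column: "pos" for score >= 0, "neg" otherwise.
data[, sgn := fifelse(score >= 0, "pos", "neg")]

# Cast to wide format, counting rows per ID and label.
res <- dcast(data, ID ~ sgn, value.var = "score", fun.aggregate = length)

# Add the total and put the columns in the order used in the question.
res[, total := pos + neg]
setcolorder(res, c("ID", "pos", "neg", "total"))
res
```

Note that if no row in the whole dataset falls into one of the two labels, dcast will not create that column, so `pos + neg` would fail; the sum-based answer does not have this limitation.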
