r - Count prior occurrences of value one greater than current value

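I am trying to create a column containing the number of prior occurrences (ordered by date) of a value one greater than the current value. In the example provided here, I have manually created the values I want in the column labeled "wanted". Each value equals the number of prior occurrences (ordered by "date") of a RoundNo that is one greater than the focal row's RoundNo. I need this computed separately by group for each individual InvestorID.

The first row's "wanted" value is the count of prior RoundNo values for Investor 1 where RoundNo == 3 (that is, one greater than the first row's RoundNo of 2). So in this case it is 0. Similarly, for the second row, the "wanted" value is the count of prior RoundNo values for Investor 1 where RoundNo == 2 (that is, one greater than the second row's RoundNo of 1). So in this case it is 1. Any help is much appreciated. A code example follows. Thanks!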

library(data.table)

# cbind() on mixed numeric/character input builds a character matrix,
# so InvestorID, date and RoundNo are all character columns in dt
dt = as.data.table(cbind(c(rep(1,7),rep(2,7)),
  c("2019-08-01","2019-09-01","2019-10-01","2019-11-01","2019-12-01","2021-04-01","2021-10-01",
    "2019-01-01","2019-02-01","2019-04-01","2019-08-01","2019-09-01","2019-10-01","2019-11-01"),
  c(2,1,2,2,1,3,2,1,2,3,2,1,3,1)))
names(dt) = c("InvestorID","date","RoundNo")
wanted = c(0,1,0,0,3,0,1,0,0,0,1,2,0,2)
dt$wanted = wanted
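
As a cross-check on the definition in the question, here is a minimal base-R sketch that recomputes the "wanted" values by brute force. The helper name count_prior_plus_one and the plain vectors are illustrative assumptions, not part of the original post. It also assumes the rows are already in date order within each investor, as they are in the example data.

```r
# Hypothetical helper (not from the original post): for each position i,
# count the strictly earlier elements equal to round_no[i] + 1.
count_prior_plus_one <- function(round_no) {
  vapply(seq_along(round_no), function(i) {
    sum(round_no[seq_len(i - 1)] == round_no[i] + 1)
  }, numeric(1))
}

investor_id <- c(rep(1, 7), rep(2, 7))
round_no    <- c(2, 1, 2, 2, 1, 3, 2, 1, 2, 3, 2, 1, 3, 1)
wanted      <- c(0, 1, 0, 0, 3, 0, 1, 0, 0, 0, 1, 2, 0, 2)

# ave() applies the helper within each investor and keeps the row order
result <- ave(round_no, investor_id, FUN = count_prior_plus_one)
print(all(result == wanted))
```

This is quadratic in the group size, which is fine for checking small examples like this one.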

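1) Define a function Count that counts how many elements of its input vector equal one plus the last element. Then use rollapplyr to apply it to successively larger leading sequences of RoundNo.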

library(zoo)
Count <- function(x) sum(x == tail(x, 1) + 1)
dt[, wanted := rollapplyr(as.numeric(RoundNo), 1:.N, Count), by = InvestorID]
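
To illustrate what Count returns on each growing window, here is a small base-R sketch restricted to investor 1's rounds. The names x and prefixes are assumptions for illustration, and the prefixes are built by hand rather than by zoo. Each window includes the current row, which is harmless because the last element can never equal itself plus one.

```r
Count <- function(x) sum(x == tail(x, 1) + 1)

# Investor 1's RoundNo values in date order (taken from the example data)
x <- c(2, 1, 2, 2, 1, 3, 2)

# Leading sequences of length 1, 2, ..., length(x)
prefixes <- lapply(seq_along(x), function(i) x[seq_len(i)])

# One count per prefix, i.e. one per row
counts <- sapply(prefixes, Count)
print(counts)
```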

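2) An alternative is a self left join: the first instance of dt, aliased as a, is left joined to a second instance of dt, aliased as b, matching those b rows that belong to the same InvestorID and come at or before the a row. Group by the a row and take the appropriate sum over the b rows.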

library(sqldf)
sqldf("select a.*, sum(a.RoundNo + 1 == b.RoundNo) wanted
from dt a
left join dt b on a.InvestorID = b.InvestorID and b.rowid <= a.rowid
group by a.rowid")
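
The same join logic can be sketched without SQL. The base-R version below is an illustration only: it uses a plain data frame with numeric columns, and an explicit rid column stands in for SQLite's rowid (both are assumptions of this sketch, not part of the answer above).

```r
df <- data.frame(
  InvestorID = c(rep(1, 7), rep(2, 7)),
  RoundNo    = c(2, 1, 2, 2, 1, 3, 2, 1, 2, 3, 2, 1, 3, 1)
)
df$rid <- seq_len(nrow(df))  # hypothetical stand-in for rowid

# Pair every row (a) with every row (b) of the same investor
pairs <- merge(df, df, by = "InvestorID", suffixes = c(".a", ".b"))

# Keep only b rows located at or before the a row
pairs <- pairs[pairs$rid.b <= pairs$rid.a, ]

# Flag b rows whose RoundNo is one greater than a's, then sum per a row
pairs$hit <- pairs$RoundNo.b == pairs$RoundNo.a + 1
join_counts <- as.vector(tapply(pairs$hit, pairs$rid.a, sum))
print(join_counts)
```

Because every a row is paired at least with itself, each row appears in the grouped result even when its count is zero.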

3) This alternative uses only data.table. Count is from (1).

dt[, wanted := sapply(1:.N, function(i) Count(as.numeric(RoundNo)[1:i])), 
by = InvestorID]

Another data.table solution, using Reduce:

dt[order(date),.(date,
result=lapply(Reduce('c',as.numeric(RoundNo),accumulate=T),
function(x) sum(x==last(x)+1)),
wanted), by=InvestorID]
    InvestorID       date result wanted
 1:          2 2019-01-01      0      0
 2:          2 2019-02-01      0      0
 3:          2 2019-04-01      0      0
 4:          2 2019-08-01      1      1
 5:          2 2019-09-01      2      2
 6:          2 2019-10-01      0      0
 7:          2 2019-11-01      2      2
 8:          1 2019-08-01      0      0
 9:          1 2019-09-01      1      1
10:          1 2019-10-01      0      0
11:          1 2019-11-01      0      0
12:          1 2019-12-01      3      3
13:          1 2021-04-01      0      0
14:          1 2021-10-01      1      1
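
For readers unfamiliar with Reduce, the sketch below shows in base R what accumulate = TRUE produces when the combining function is c: a list of growing prefixes, to which the counting function is then applied. The short vector x is an assumed example, and tail(p, 1) is used here in place of data.table's last.

```r
x <- c(2, 1, 2)

# Each step appends the next element to the running vector and, with
# accumulate = TRUE, every intermediate vector is kept.
prefixes <- Reduce(c, x, accumulate = TRUE)
str(prefixes)  # a list holding 2, then c(2, 1), then c(2, 1, 2)

# Count, within each prefix, the elements equal to its last element plus one
reduce_counts <- sapply(prefixes, function(p) sum(p == tail(p, 1) + 1))
print(reduce_counts)
```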
