R - Checking whether an event is new or already exists



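There is an algorithm that identifies problems/events in a network. Afterwards, it writes all of the cases to a DB. Let's assume the table looks like this (simplified):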

Case 3   NA   NA   NA   NA   NA

How about this, using data.table:

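library(data.table)
# d = days since the same ID's previous row (1 for its first row);
# case = run id over (ID, d) after ordering by d and ID; d is then dropped.
df[order(Date), d := c(1, diff(Date)), by = ID][
  order(d, ID), case := rleid(ID, d)][
  , d := NULL]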

Output:

   ID       Date case
1: A1 2022-01-01    1
2: B1 2022-01-01    2
3: C1 2022-01-01    3
4: A1 2022-01-02    1
5: C1 2022-01-02    3
6: A1 2022-01-03    1
7: B1 2022-01-03    4
8: C1 2022-01-03    3
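The input table is not shown in full above, so the following is only a sketch that rebuilds a matching input from the printed output and runs the same steps one at a time. The construction of `df` (two columns, `ID` and `Date`) is an assumption inferred from that output; the rest uses the same data.table calls as the snippet above.

```r
library(data.table)

# Input reconstructed from the printed output (an assumption, not the original data).
df <- data.table(
  ID   = c("A1", "B1", "C1", "A1", "C1", "A1", "B1", "C1"),
  Date = as.Date(c("2022-01-01", "2022-01-01", "2022-01-01",
                   "2022-01-02", "2022-01-02",
                   "2022-01-03", "2022-01-03", "2022-01-03"))
)

# Step 1: per ID, days since that ID's previous row; an ID's first row gets 1.
df[order(Date), d := c(1, diff(Date)), by = ID]

# Step 2: order by gap and ID, then number each run of identical (ID, d) pairs.
df[order(d, ID), case := rleid(ID, d)]

print(df)        # helper column d is still visible here for inspection

# Step 3: drop the helper column.
df[, d := NULL]
```

With this input, `d` is 1 on every row except B1 on 2022-01-03, where it is 2 because B1 is missing on 2022-01-02. That row is therefore the only one that ends up in a separate run and gets a new case number.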

If you really do need the comment column, you can refine the code above like this:

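# Same steps as before, plus a comment on every row that follows a gap (d != 1).
df[order(Date), d := c(1, diff(Date)), by = ID][
  order(d, ID), `:=`(
    case = rleid(ID, d),
    comment = fifelse(d != 1, paste0("New case, as there was no ", ID, " on ", Date - 1), ""))][
  , d := NULL][]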

Output:

   ID       Date case                                    comment
1: A1 2022-01-01    1
2: B1 2022-01-01    2
3: C1 2022-01-01    3
4: A1 2022-01-02    1
5: C1 2022-01-02    3
6: A1 2022-01-03    1
7: B1 2022-01-03    4 New case, as there was no B1 on 2022-01-02
8: C1 2022-01-03    3
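One possible caveat: because `rleid(ID, d)` is computed after ordering by `d`, all rows of an ID with `d == 1` appear to fall into the same run, so a row that comes one day after a reopened case would rejoin the original case number. If that matters for your data, a different technique is a per-ID cumulative count of gaps. The sketch below does this in base R. The function name `assign_cases` is hypothetical, the sample input is reconstructed from the output above, and `Date` is assumed to be of class `Date`.

```r
# Hypothetical helper: assigns a case number per (ID, run of consecutive days).
assign_cases <- function(df) {
  # Chronological order, ties broken by ID, so case numbers follow first appearance.
  df <- df[order(df$Date, df$ID), ]
  rownames(df) <- NULL

  # Days since the same ID's previous row; an ID's first row gets 1 (no gap).
  gap <- ave(as.numeric(df$Date), df$ID, FUN = function(x) c(1, diff(x)))

  # Per ID, count how many gaps longer than one day have occurred so far.
  episode <- ave(as.numeric(gap > 1), df$ID, FUN = cumsum)

  # One global case number per (ID, episode) pair, in order of first appearance.
  key <- paste(df$ID, episode)
  df$case <- match(key, unique(key))

  # Comment only on rows that open a new case after a gap.
  df$comment <- ifelse(
    gap > 1,
    paste0("New case, as there was no ", df$ID, " on ", df$Date - 1),
    ""
  )
  df
}

# Sample input reconstructed from the output above (an assumption).
df <- data.frame(
  ID   = c("A1", "B1", "C1", "A1", "C1", "A1", "B1", "C1"),
  Date = as.Date(c("2022-01-01", "2022-01-01", "2022-01-01",
                   "2022-01-02", "2022-01-02",
                   "2022-01-03", "2022-01-03", "2022-01-03"))
)

result <- assign_cases(df)
print(result)
```

On the sample data this gives the same case numbers and comment as the output above. The difference only shows when a reopened case continues: for an ID seen on days 1, 2, 4 and 5, the rows on days 4 and 5 share a second case number instead of day 5 falling back to the first.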
