我有一个包含日期和多个事件的数据集:
date number_of_events
1/14/2013 1
2/6/2013 1
6/5/2013 1
7/1/2013 2
7/15/2013 1
7/19/2013 1
8/1/2013 2
我想计算事件数为2(或任何其他预先确定的值(的日期之前的天数。
这是我的目标。。。
date number_of_events days_to_two_events
1/14/2013 1 168
2/6/2013 1 145
6/5/2013 1 26
7/1/2013 2 31
7/15/2013 1 17
7/19/2013 1 13
8/1/2013 2 0
使用dplyr
和zoo
:
df <- read.table(text = "date number_of_events
1/14/2013 1
2/6/2013 1
6/5/2013 1
7/1/2013 2
7/15/2013 1
7/19/2013 1
8/1/2013 2", header= T)
library(dplyr)
library(zoo)
df %>%
mutate(days_to_two_events = na.locf0(ifelse(lead(number_of_events, 1) == 2, lead(date, 1), NA), fromLast = TRUE)) %>%
mutate(days_to_two_events = as.Date(days_to_two_events, format = "%m/%d/%Y")-as.Date(date, format = "%m/%d/%Y"))
为了完整性,这里有两个使用向后滚动连接的解决方案。
这两种解决方案在计算具有number_of_events == 2
的行的日差的方式上有所不同。第一个解决方案计算相对于下一行的日差,number_of_events == 2
包括其自身。因此,在这种情况下,days_to_two_events
为零。
第二个解决方案计算相对于下一行的日差,number_of_events == 2
不包括其自身。因此,对于具有number_of_events == 2
的行,它查找与具有number_of_events == 2
的后续行的日差。这是预期的结果。
两种变体都假定df
已经被date
排序。
第一种变体
library(data.table)
setDT(df)[, date := lubridate::mdy(date)]
df[number_of_events == 2][
df, on = "date", roll = -Inf,
.(date, number_of_events = i.number_of_events, days_to_two_events = x.date - date)]
date number_of_events days_to_two_events 1: 2013-01-14 1 168 days 2: 2013-02-06 1 145 days 3: 2013-06-05 1 26 days 4: 2013-07-01 2 0 days 5: 2013-07-15 1 17 days 6: 2013-07-19 1 13 days 7: 2013-08-01 2 0 days
第二个变体(预期结果(
library(data.table)
setDT(df)[, date := lubridate::mdy(date)]
df[number_of_events == 2][
df, on = "date", roll = -Inf,
.(date, number_of_events = i.number_of_events,
days_to_two_events = shift(x.date, -1, fill = last(x.date)) - date)]
date number_of_events days_to_two_events 1: 2013-01-14 1 168 days 2: 2013-02-06 1 145 days 3: 2013-06-05 1 26 days 4: 2013-07-01 2 31 days 5: 2013-07-15 1 17 days 6: 2013-07-19 1 13 days 7: 2013-08-01 2 0 days
注意第4行中两个解决方案之间的差异。
数据
library(data.table)
df <- fread(text = "date number_of_events
1/14/2013 1
2/6/2013 1
6/5/2013 1
7/1/2013 2
7/15/2013 1
7/19/2013 1
8/1/2013 2")