Select data: first entry + set time period (1 year) in R



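I have a dataset on a group of individuals, where data collection started at a different time for each individual.

I need to subset the data to the first year after each individual's first entry, something like this pseudocode: myData[myDate >= "first entry" & myDate <= "first entry" + "1 year"]

Example data: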

df_date <- data.frame(
  Name = c(rep("Jim", 14), rep("Sue", 14)),
  Dates = c("2010-1-1", "2010-2-2", "2010-3-5", "2010-4-17", "2010-5-20",
            "2010-6-29", "2010-7-6", "2010-8-9", "2010-9-16", "2010-10-28",
            "2010-11-16", "2010-12-28", "2011-1-16", "2011-2-28",
            "2010-4-1", "2010-5-2", "2010-6-5", "2010-7-17", "2010-8-20",
            "2010-9-29", "2010-10-6", "2010-11-9", "2012-12-16", "2011-1-28",
            "2011-2-28", "2011-3-28", "2011-2-28", "2011-3-28"),
  Event = rep(1, 28)
)

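The desired output would be Jim having data from 1/1/2010 - 12/28/2010 and Sue from 4/1/2010 - 3/28/2011, and so on; the full data has 20 samples, all starting at different times.

Using a combination of tidyverse and lubridate functions: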

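library(tidyverse)
library(lubridate)

df_date %>%
  mutate(Dates = as_datetime(Dates)) %>%   # parse the character dates
  group_by(Name) %>%
  arrange(Dates, .by_group = TRUE) %>%     # sort within each individual so first() is the earliest entry
  # keep rows up to one year (a fixed-length duration) after the first entry
  filter(Dates <= first(Dates) + duration(1, units = "year"))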
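
The answer above uses a duration while the next one uses a period, and the two are not the same thing in lubridate. A small sketch of the difference, using a made-up start date (the variable names here are only for illustration):

```r
library(lubridate)

start <- ymd("2010-01-01")

# Period: moves the calendar forward by one year, landing on the same month and day.
by_period <- start + years(1)
print(by_period)

# Duration: adds a fixed number of seconds, independent of the calendar.
# The exact length of a "year" duration depends on the lubridate version,
# so no expected value is shown for it here.
by_duration <- as_datetime(start) + duration(1, units = "year")
print(by_duration)
```

For cutoffs that should fall on the anniversary of the first entry, the period form is probably the one to reach for; on this example data both approaches appear to keep the same rows.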

Similar to Martin C. Arnold's answer, here is another answer based on dplyr and lubridate. min(Dates) + years(1) adds one year to the earliest date.

library(dplyr)
library(lubridate)

df_date2 <- df_date %>%
  mutate(Dates = ymd(Dates)) %>%               # parse the character dates as Date
  group_by(Name) %>%
  filter(Dates <= min(Dates) + years(1)) %>%   # within one calendar year of the earliest entry
  ungroup()
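
If adding packages is not an option, the same idea can be sketched in base R. This is only an illustration on a small made-up data frame (not the data from the question), and add_one_year is a hypothetical helper name:

```r
# Small illustrative data set (hypothetical; not the full data from the question).
df <- data.frame(
  Name  = c("Jim", "Jim", "Jim", "Sue", "Sue", "Sue"),
  Dates = as.Date(c("2010-01-01", "2010-12-28", "2011-01-16",
                    "2010-04-01", "2011-03-28", "2012-12-16"))
)

# Add one calendar year to a single Date using seq().
add_one_year <- function(d) seq(d, by = "1 year", length.out = 2)[2]

# Split by individual, keep rows within one year of the first entry, recombine.
within_year <- do.call(rbind, lapply(split(df, df$Name), function(g) {
  g[g$Dates <= add_one_year(min(g$Dates)), ]
}))
rownames(within_year) <- NULL

print(within_year)
```

The split/lapply/rbind pattern plays the role of group_by() and filter() in the dplyr answers; for 20 individuals it should be fast enough, though the dplyr version reads more clearly.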
