Below is a sample data frame named df:
subj_id admission_id chart_date admission_date procedure
9 145834 2010-10-21 2010-10-23 surgery
14 122917 2010-05-30 2010-06-10 surgery
22 205461 2010-06-01 2010-06-15 surgery
31 237766 2010-03-05 2010-03-08 surgery
49 241908 2010-04-21 2010-04-21 CT-scan
56 317751 2010-09-10 2010-09-25 surgery
67 382211 2010-08-05 2010-08-06 surgery
I only want to select the rows where chart_date is more than 2 days after admission_date but less than 14 days after admission_date. The result should look like this:
subj_id admission_id chart_date admission_date procedure
9 145834 2010-10-21 2010-10-23 surgery
14 122917 2010-05-30 2010-06-10 surgery
22 205461 2010-06-01 2010-06-15 surgery
31 237766 2010-03-05 2010-03-08 surgery
I tried the code below, but it returns empty rows. I'm wondering whether I did something wrong. Any help is appreciated. Thank you very much.
start <- df$admission_date + 2
end <- df$admission_date + 14
dfinclusion <- df[df$chart_date > start & df$chart_date < end, ]
You can subtract chart_date from admission_date and then select the rows where the difference is between 2 and 14 days.
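library(dplyr)

df %>%
  # make sure both date columns are of class Date
  mutate(across(ends_with("date"), as.Date)) %>%
  # keep rows where chart_date is 2 to 14 days (inclusive) before admission_date
  filter(between(as.numeric(admission_date - chart_date, units = "days"), 2, 14))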
# subj_id admission_id chart_date admission_date procedure
#1 9 145834 2010-10-21 2010-10-23 surgery
#2 14 122917 2010-05-30 2010-06-10 surgery
#3 22 205461 2010-06-01 2010-06-15 surgery
#4 31 237766 2010-03-05 2010-03-08 surgery
Similarly, in base R:
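# convert columns 3 and 4 (chart_date, admission_date) to Date
df[3:4] <- lapply(df[3:4], as.Date)
# add the day difference, then keep rows where it lies between 2 and 14
subset(transform(df, diff_Date = admission_date - chart_date),
       diff_Date >= 2 & diff_Date <= 14)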
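As for why the original attempt matched nothing: it looks for chart_date values that fall after admission_date, while every chart_date in the sample is on or before admission_date. Below is a minimal sketch of the same indexing approach with the window flipped to 2 to 14 days before admission. It rebuilds the sample with data.frame() so that it runs on its own; the names window_start and window_end are illustrative, and the inclusive bounds are an assumption made to match the expected output.

```r
# Rebuild the sample data; dates are parsed from strings with as.Date
df <- data.frame(
  subj_id = c(9L, 14L, 22L, 31L, 49L, 56L, 67L),
  admission_id = c(145834L, 122917L, 205461L, 237766L,
                   241908L, 317751L, 382211L),
  chart_date = as.Date(c("2010-10-21", "2010-05-30", "2010-06-01",
                         "2010-03-05", "2010-04-21", "2010-09-10",
                         "2010-08-05")),
  admission_date = as.Date(c("2010-10-23", "2010-06-10", "2010-06-15",
                             "2010-03-08", "2010-04-21", "2010-09-25",
                             "2010-08-06")),
  procedure = c("surgery", "surgery", "surgery", "surgery",
                "CT-scan", "surgery", "surgery")
)

# Original window: chart_date strictly between admission_date + 2 and + 14
n_original <- sum(df$chart_date > df$admission_date + 2 &
                  df$chart_date < df$admission_date + 14)

# Flipped window (assumed inclusive): chart_date 2 to 14 days BEFORE admission
window_start <- df$admission_date - 14
window_end <- df$admission_date - 2
dfinclusion <- df[df$chart_date >= window_start &
                  df$chart_date <= window_end, ]

n_original          # number of rows matched by the original condition
dfinclusion$subj_id # subjects kept by the flipped window
```

If the bounds should be exclusive, swapping >= and <= for > and < would drop the rows that are exactly 2 or 14 days apart.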
Data
df <- structure(list(subj_id = c(9L, 14L, 22L, 31L, 49L, 56L, 67L),
admission_id = c(145834L, 122917L, 205461L, 237766L, 241908L,
317751L, 382211L), chart_date = structure(c(14903, 14759,
14761, 14673, 14720, 14862, 14826), class = "Date"),
admission_date = structure(c(14905,14770, 14775, 14676, 14720, 14877, 14827),
class = "Date"), procedure = structure(c(2L, 2L, 2L, 2L, 1L, 2L, 2L),
.Label = c("CT-scan", "surgery"), class = "factor")),
row.names = c(NA, -7L), class = "data.frame")
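As a quick check on a few of the sample dates, the day gaps can be computed directly. Subtracting two Date vectors returns a difftime object rather than a plain number, so this sketch converts it to numeric days before comparing. The vector names gap, gap_days and in_range are illustrative only, and just three of the rows are used.

```r
# Three rows from the sample (subj_id 9, 56 and 67)
chart_date <- as.Date(c("2010-10-21", "2010-09-10", "2010-08-05"))
admission_date <- as.Date(c("2010-10-23", "2010-09-25", "2010-08-06"))

# Date minus Date gives a difftime object
gap <- admission_date - chart_date
class(gap)

# Convert explicitly to a plain number of days
gap_days <- as.numeric(gap, units = "days")

# TRUE where chart_date is 2 to 14 days (inclusive) before admission_date
in_range <- gap_days >= 2 & gap_days <= 14
data.frame(chart_date, admission_date, gap_days, in_range)
```

Converting with an explicit units argument avoids depending on whichever unit the difftime happens to carry.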