Selecting dates based on a condition in R



My goal is to create a column based on a condition over 2 Date columns. The dataset looks like this:

df <- data.frame(PatientID = c("3454","345","5","345","567","79"),
date_of_covid_test = c(2020-04-02, 2000-03-01, 2000-01-01, 2020-11-03, 2020-04-02, 2020-12-05),
date_of_hospitalization = c(2020-03-27, 2000-03-25, 2000-03-01, 2020-03-10, NA, NA), stringsAsFactors = F)

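The new column I want to create is called "hospitalized_due_to_covid". It should be TRUE when the hospitalization date ("date_of_hospitalization") falls between 1 week before the date of the COVID test ("date_of_covid_test") and 1 month after it.

If there is an NA, the result should be FALSE.

The expected result for the example I posted here is: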

hospitalized_due_to_covid = c(TRUE, TRUE, FALSE, FALSE, FALSE, FALSE)

How can I write this?

Thanks a lot in advance!! :)

You can try:

library(lubridate)
library(dplyr)

df %>%
  # convert both character columns to Date
  mutate(across(c(date_of_covid_test, date_of_hospitalization), as.Date),
         # TRUE when hospitalization is between 7 days before and 1 month after the test
         hospitalized_due_to_covid = date_of_hospitalization >= (date_of_covid_test - 7) &
           date_of_hospitalization <= (date_of_covid_test %m+% months(1)),
         # comparisons involving NA return NA; turn those into FALSE
         hospitalized_due_to_covid = replace(hospitalized_due_to_covid, is.na(hospitalized_due_to_covid), FALSE))
#  PatientID date_of_covid_test date_of_hospitalization hospitalized_due_to_covid
#1      3454         2020-04-02              2020-03-27                      TRUE
#2       345         2000-03-01              2000-03-25                      TRUE
#3         5         2000-01-01              2000-03-01                     FALSE
#4       345         2020-11-03              2020-03-10                     FALSE
#5       567         2020-04-02                    <NA>                     FALSE
#6        79         2020-12-05                    <NA>                     FALSE
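
If you would rather avoid extra packages, the same rule can probably be written in base R. This is only a sketch under a few assumptions: `add_one_month()` is a hypothetical helper (not part of any package), it expects test dates without NA, and it uses `seq()` with `by = "1 month"`, which may treat end-of-month dates differently from `%m+%` (which rolls back to the last valid day of the month).

```r
# Example data from the question, with the dates as quoted strings
df <- data.frame(
  PatientID = c("3454", "345", "5", "345", "567", "79"),
  date_of_covid_test = c("2020-04-02", "2000-03-01", "2000-01-01",
                         "2020-11-03", "2020-04-02", "2020-12-05"),
  date_of_hospitalization = c("2020-03-27", "2000-03-25", "2000-03-01",
                              "2020-03-10", NA, NA),
  stringsAsFactors = FALSE
)

test_date <- as.Date(df$date_of_covid_test)
hosp_date <- as.Date(df$date_of_hospitalization)

# Hypothetical helper: add one calendar month to each (non-NA) date
add_one_month <- function(d) {
  days <- vapply(seq_along(d), function(i) {
    as.numeric(seq(d[i], by = "1 month", length.out = 2)[2])
  }, numeric(1))
  as.Date(days, origin = "1970-01-01")
}

# Window: from 7 days before the test to 1 month after it
in_window <- hosp_date >= (test_date - 7) & hosp_date <= add_one_month(test_date)

# A missing hospitalization date makes the comparison NA; treat that as FALSE
df$hospitalized_due_to_covid <- !is.na(in_window) & in_window

print(df)
```

For the six example rows this should reproduce the expected TRUE, TRUE, FALSE, FALSE, FALSE, FALSE.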

Data used, with the dates entered as quoted strings:

df <- data.frame(PatientID = c("3454","345","5","345","567","79"),
date_of_covid_test = c("2020-04-02", "2000-03-01", "2000-01-01", "2020-11-03", "2020-04-02", "2020-12-05"),
date_of_hospitalization = c("2020-03-27", "2000-03-25", "2000-03-01", "2020-03-10", NA, NA), stringsAsFactors = F)
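
The quotes matter here. As a small illustration (the variable names `unquoted` and `quoted` are made up for this example), R reads an unquoted date such as 2020-04-02 as a subtraction, so the column would hold plain numbers instead of dates:

```r
# Without quotes this is arithmetic: 2020 - 4 - 2
unquoted <- 2020-04-02
print(unquoted)        # 2014
print(class(unquoted)) # "numeric"

# With quotes the string can be converted to a real Date
quoted <- as.Date("2020-04-02")
print(class(quoted))   # "Date"
print(quoted - 7)      # "2020-03-26", one week before the test
```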
