>I have this table:
   record_id result date_start date_end
1          1 pos
2          1        26/06/2019 28/06/2019
3          1        27/06/2019 29/06/2019
4          1        28/06/2019 30/06/2019
5          1        29/06/2019 01/07/2019
6          2 neg
7          2        01/07/2019 03/07/2019
8          2        02/07/2019 04/07/2019
9          2        03/07/2019 05/07/2019
10         2        04/07/2019 06/07/2019
11         2        05/07/2019 07/07/2019
12         3 pos
13         3        07/07/2019 09/07/2019
14         3        08/07/2019 10/07/2019
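I want to compute the date difference for each row, which is no problem. After that, I want to analyze the "pos" and "neg" groups separately. But the rows that have dates have no value for result. The data was imported from REDCap, with repeating instruments. I use the tidyverse and I think dplyr can help; isn't pivot_wider what I need here? I tried it, but couldn't get it to work...
If anyone can help, thanks...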
Something like this, for example, to compute the mean date difference per group?
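library(tidyverse)
library(lubridate)
df %>%
  # carry each result ("pos"/"neg") down onto the date rows below it
  fill(result, .direction = "down") %>%
  # drop the header rows that hold only the result and no dates
  filter(!is.na(date_start)) %>%
  # parse the day/month/year strings into Date values
  mutate(date_start = dmy(date_start),
         date_end = dmy(date_end)) %>%
  group_by(result) %>%
  summarise(mean_date_dif = mean(date_end - date_start))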
#`summarise()` ungrouping output (override with `.groups` argument)
## A tibble: 2 x 2
# result mean_date_dif
# <chr> <drtn>
#1 neg 2 days
#2 pos 2 days
Data
df <- tibble::tribble(
  ~record_id, ~result, ~date_start,  ~date_end,
  1L,         "pos",   NA,           NA,
  1L,         NA,      "26/06/2019", "28/06/2019",
  1L,         NA,      "27/06/2019", "29/06/2019",
  1L,         NA,      "28/06/2019", "30/06/2019",
  1L,         NA,      "29/06/2019", "01/07/2019",
  2L,         "neg",   NA,           NA,
  2L,         NA,      "01/07/2019", "03/07/2019",
  2L,         NA,      "02/07/2019", "04/07/2019",
  2L,         NA,      "03/07/2019", "05/07/2019",
  2L,         NA,      "04/07/2019", "06/07/2019",
  2L,         NA,      "05/07/2019", "07/07/2019",
  3L,         "pos",   NA,           NA,
  3L,         NA,      "07/07/2019", "09/07/2019",
  3L,         NA,      "08/07/2019", "10/07/2019"
)
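If you would rather keep the per-row differences (the question mentions computing one per row) and then analyze "pos" and "neg" separately, a variant along these lines might work. It is only a sketch: it assumes each record_id has exactly one row carrying the result, and it fills within record_id so a value cannot leak from one record into the next. The names sample_df, per_row, date_dif and pos_rows are made up for illustration, and sample_df is a shortened stand-in for df.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Shortened, hypothetical sample with the same shape as `df`
sample_df <- tibble::tribble(
  ~record_id, ~result, ~date_start,  ~date_end,
  1L,         "pos",   NA,           NA,
  1L,         NA,      "26/06/2019", "28/06/2019",
  1L,         NA,      "29/06/2019", "01/07/2019",
  2L,         "neg",   NA,           NA,
  2L,         NA,      "01/07/2019", "03/07/2019"
)

per_row <- sample_df %>%
  group_by(record_id) %>%                 # fill only within each record
  fill(result, .direction = "down") %>%
  ungroup() %>%
  filter(!is.na(date_start)) %>%          # drop the result-only rows
  mutate(date_start = dmy(date_start),
         date_end   = dmy(date_end),
         # one difference per row, as a plain number of days
         date_dif   = as.numeric(date_end - date_start, units = "days"))

# Analyze one group on its own, for example the "pos" rows
pos_rows <- per_row %>% filter(result == "pos")
```

From per_row you can still get the grouped summary with group_by(result) followed by summarise(), so nothing is lost compared with the answer above.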