I have the following data frame, which holds each subject's ID, the number of months of follow-up, and the subject's age.
df1<-structure(list(USUBJID = c(1, 2, 3),
follow_up = c(24,36,56),
AGE = c(65,34,65)),
row.names = c(NA, -3L),
class = c("tbl_df", "tbl", "data.frame"))
# A tibble: 3 x 3
  USUBJID follow_up   AGE
    <dbl>     <dbl> <dbl>
1       1        24    65
2       2        36    34
3       3        56    65
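For each subject, I need to create yearly entries based on the value of the follow_up column (for example, if follow-up is 36 months, I need entries for months 0, 12, 24 and 36). For each entry, I also need to compute the subject's age at that time point by adding the elapsed years to the original AGE value.
This is my expected output: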
# A tibble: 12 x 3
   USUBJID Month   AGE
     <dbl> <dbl> <dbl>
 1       1     0    65
 2       1    12    66
 3       1    24    67
 4       2     0    34
 5       2    12    35
 6       2    24    36
 7       2    36    37
 8       3     0    65
 9       3    12    66
10       3    24    67
11       3    36    68
12       3    48    69
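The conditions are not entirely clear, but this may help: replicate each row according to the value in the 'follow_up' column (follow_up %/% 12 + 1 copies, using uncount from tidyr), group by 'USUBJID', create a sequence with seq that starts at 0 and increases by 12, and increment 'AGE' by 1 per row (using row_number as the sequence).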
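library(dplyr)
library(tidyr)
df2 <- df1 %>%
  # one row per completed 12-month period, plus the baseline row at month 0
  uncount(follow_up %/% 12 + 1) %>%
  group_by(USUBJID) %>%
  # months 0, 12, 24, ... within each subject; age goes up by 1 per row
  mutate(follow_up = seq(0, length.out = n(), by = 12),
         AGE = first(AGE) + row_number() - 1) %>%
  ungroup() %>%
  rename(Month = follow_up)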
Output:
df2
# A tibble: 12 × 3
   USUBJID Month   AGE
     <int> <dbl> <dbl>
 1       1     0    65
 2       1    12    66
 3       1    24    67
 4       2     0    34
 5       2    12    35
 6       2    24    36
 7       2    36    37
 8       3     0    65
 9       3    12    66
10       3    24    67
11       3    36    68
12       3    48    69
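To see where the row counts come from, here is a small base R sketch of the arithmetic behind follow_up %/% 12 + 1. The vector names n_rows and months are only illustrative; they do not appear in the code above.

```r
# follow-up lengths in months, taken from df1
follow_up <- c(24, 36, 56)

# integer division counts the completed 12-month periods;
# the extra 1 accounts for the baseline entry at month 0
n_rows <- follow_up %/% 12 + 1
n_rows
# 3 4 5

# the yearly time points each subject ends up with
months <- lapply(follow_up, function(m) seq(0, m, by = 12))
months[[3]]
# 0 12 24 36 48
```

A follow-up of 56 months therefore stops at month 48, because the fifth year is not complete. This matches the expected output in the question.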
Or using data.table:
library(data.table)
# repeat each row follow_up %/% 12 + 1 times, then build Month and AGE per subject
setDT(df1)[rep(seq_len(.N), follow_up %/% 12 + 1)][,
  .(Month = seq(0, length.out = .N, by = 12),
    AGE = first(AGE) + seq_len(.N) - 1), .(USUBJID)]
Output:
    USUBJID Month   AGE
      <num> <num> <num>
 1:       1     0    65
 2:       1    12    66
 3:       1    24    67
 4:       2     0    34
 5:       2    12    35
 6:       2    24    36
 7:       2    36    37
 8:       3     0    65
 9:       3    12    66
10:       3    24    67
11:       3    36    68
12:       3    48    69
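If no extra packages are available, the same idea can probably be written in base R as well. This is only a sketch under the same assumptions as above (yearly steps, age increasing by one per step); the names dat, months, n and res are made up for illustration, and dat is a plain data.frame copy of the example data.

```r
# plain data.frame copy of the example data (hypothetical name `dat`)
dat <- data.frame(USUBJID = 1:3,
                  follow_up = c(24, 36, 56),
                  AGE = c(65, 34, 65))

# yearly time points per subject: 0, 12, ... up to the last completed year
months <- lapply(dat$follow_up, function(m) seq(0, m, by = 12))
n <- lengths(months)  # number of rows each subject contributes

res <- data.frame(
  USUBJID = rep(dat$USUBJID, n),
  Month   = unlist(months),
  # baseline age plus elapsed years
  AGE     = rep(dat$AGE, n) + unlist(months) / 12
)
res
```

Here the age is derived from Month / 12 rather than from a row counter, so it stays correct even if the rows are later reordered.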
Data
df1 <- structure(list(USUBJID = 1:3, follow_up = c(24, 36, 56), AGE = c(65,
34, 65)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-3L))