I have a very large dataset of subjects, each of whom went through two protocols. Here is an example df:
df <- data.frame (Subject = c('875','875','875','875','875','875','875','875',
'875','1392','1392','1392','1392','1392','1392',
'1392','1392','1392','1392','1392'),
StartDate = c('20160915','20160916','20160917','20160918',
'20160926','20160927','20160928','20160929',
'20160930','20160917','20160918','20160919',
'20160920','20160921','20160922','20161005',
'20161006','20161007','20161008','20161009'),
Protocol = c('Training','Training','Training','Training',
'Test','Test','Test','Test','Test','Training',
'Training','Training','Training','Training',
'Training','Test','Test','Test','Test','Test'),
Points = c('14','32','33','50','41','56','41','48','49','34',
'16','25','21','34','28','61','24','45','45','47'),
Attempts = c('24','45','37','69','56','59','53','67','53','55',
'34','36','26','48','44','64','36','62','53','55'))
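I would like to convert the StartDate within each protocol into a test day number, or even just add a test-day column.
The dates within each protocol are irregular, so simple numbering won't work. The real df has several thousand rows, so working this out by hand would be a nightmare. I'd like to be able to compare specific protocol days across subjects (e.g. day 3 of Training between subjects).
I would like the data frame to end up looking like this: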
df2 <- data.frame (Subject = c('875','875','875','875','875','875','875','875',
'875','1392','1392','1392','1392','1392','1392',
'1392','1392','1392','1392','1392'),
StartDate = c('20160915','20160916','20160917','20160918',
'20160926','20160927','20160928','20160929',
'20160930','20160917','20160918','20160919',
'20160920','20160921','20160922','20161005',
'20161006','20161007','20161008','20161009'),
Day = c('1','2','3','4','1','2','3','4','5','1','2','3',
'4','5','6','1','2','3','4','5'),
Protocol = c('Training','Training','Training','Training',
'Test','Test','Test','Test','Test','Training',
'Training','Training','Training','Training',
'Training','Test','Test','Test','Test','Test'),
Points = c('14','32','33','50','41','56','41','48','49','34',
'16','25','21','34','28','61','24','45','45','47'),
Attempts = c('24','45','37','69','56','59','53','67','53','55',
'34','36','26','48','44','64','36','62','53','55'))
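Is there a way to do this? Or does anyone have other ideas?
What do you mean by each protocol having irregular dates?
Assuming your data are sorted by StartDate, you can do this: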
library(dplyr)

df %>%
  group_by(Subject, Protocol) %>%   # one group per subject and protocol
  mutate(Day = row_number()) %>%    # number the rows within each group
  ungroup()
# A tibble: 20 × 6
   Subject StartDate Protocol Points Attempts   Day
   <fct>   <fct>     <fct>    <fct>  <fct>    <int>
 1 875     20160915  Training 14     24           1
 2 875     20160916  Training 32     45           2
 3 875     20160917  Training 33     37           3
 4 875     20160918  Training 50     69           4
 5 875     20160926  Test     41     56           1
 6 875     20160927  Test     56     59           2
 7 875     20160928  Test     41     53           3
 8 875     20160929  Test     48     67           4
 9 875     20160930  Test     49     53           5
10 1392    20160917  Training 34     55           1
11 1392    20160918  Training 16     34           2
12 1392    20160919  Training 25     36           3
13 1392    20160920  Training 21     26           4
14 1392    20160921  Training 34     48           5
15 1392    20160922  Training 28     44           6
16 1392    20161005  Test     61     64           1
17 1392    20161006  Test     24     36           2
18 1392    20161007  Test     45     62           3
19 1392    20161008  Test     45     53           4
20 1392    20161009  Test     47     55           5
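The answer above depends on the rows already being in date order. If that is not guaranteed in the real data, one possible sketch is to parse StartDate into a Date and sort before numbering. The `demo` data frame below is made up for illustration and deliberately out of order, and the column names `Date` and `CalendarDay` are assumptions, not part of the original post. `CalendarDay` is an optional alternative that counts elapsed calendar days since the first session of each protocol, in case gaps between sessions matter.

```r
library(dplyr)

# Made-up sample with rows deliberately out of date order
demo <- data.frame(
  Subject   = c("875", "875", "875", "1392", "1392"),
  StartDate = c("20160917", "20160915", "20160926", "20161005", "20160917"),
  Protocol  = c("Training", "Training", "Test", "Test", "Training"),
  stringsAsFactors = FALSE
)

demo_days <- demo %>%
  # as.character() guards against StartDate being stored as a factor
  mutate(Date = as.Date(as.character(StartDate), format = "%Y%m%d")) %>%
  arrange(Subject, Date) %>%            # sort so row order follows the dates
  group_by(Subject, Protocol) %>%
  mutate(
    Day         = row_number(),                      # session number within the protocol
    CalendarDay = as.integer(Date - min(Date)) + 1L  # elapsed calendar days, starting at 1
  ) %>%
  ungroup()

print(demo_days)
```

`Day` matches the numbering asked for in the question (1, 2, 3, ... per subject and protocol regardless of gaps), while `CalendarDay` skips numbers whenever a date is missing.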