R - Reshape data and modify the new column names



I have a dataset that looks like this:

data <- data.frame(ID    = rep(1:5,each=4), 
Event = rep(c("SCR","FUP","FUP","FUP"),5), 
Date  = c("2016-11-01", "2016-11-10", "2016-12-01", "2017-01-19", 
"2017-04-12", "2017-04-04", "2017-05-30", "2017-05-25", 
"2018-04-09", "2018-05-02", "2018-05-29", "2018-06-04", 
"2017-06-06", "2017-07-26", "2017-09-07", "2017-09-15", 
"2016-11-01", "2016-11-10", "2016-12-01", "2017-01-19"))

I would like to reshape it so that it looks like this:

ID    SCR         FUP_1        FUP_2        FUP_3
1     2016-11-01  2016-11-10   2016-12-01   2017-01-19
2     2017-04-12  2017-04-04   2017-05-30   2017-05-25
.
.
.

I tried using spread, but it gives "Error: Duplicate identifiers". I also tried reshape:

reshape(data, idvar = "ID", timevar = "Event", direction = "wide", sep = "_") 

but it drops 2 of the date entries and keeps only the first follow-up date (see the output below):

ID   Date_SCR    Date_FUP
1    2016-11-01  2016-11-10
2    2017-03-06  2017-04-12
3    2017-05-25  2017-01-19
4    2018-05-29  2018-06-04
5    2017-07-26  2017-09-07

Can anyone help me with this? Thanks in advance!

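To add the numbers, I would use make.unique. It isn't pretty, but you can always rename the columns afterwards (or fix the labels beforehand).

First, the changed data: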

data$Event <- ave(as.character(data$Event), data$ID, FUN=make.unique)
head(data)
#     ID Event       Date
# 1.1  1   SCR 2016-11-01
# 1.2  1   FUP 2016-11-10
# 1.3  1 FUP.1 2016-12-01
# 1.4  1 FUP.2 2017-01-19
# 2.5  2   SCR 2017-04-12
# 2.6  2   FUP 2017-04-04
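As a minimal sketch of what make.unique does here, and of one way to get the FUP_1, FUP_2, FUP_3 labels asked for in the question: number_events below is a hypothetical helper written for illustration, not part of the answer above.

```r
# make.unique() leaves the first occurrence alone and appends .1, .2, ... to repeats
events <- c("SCR", "FUP", "FUP", "FUP")
make.unique(events)
# [1] "SCR"   "FUP"   "FUP.1" "FUP.2"

# Hypothetical helper: number repeated labels from 1 with "_" as separator,
# leaving labels that occur only once (such as SCR) unchanged.
number_events <- function(x) {
  idx <- ave(seq_along(x), x, FUN = seq_along)  # running count within each label
  n   <- ave(seq_along(x), x, FUN = length)     # total count of each label
  ifelse(n > 1, paste0(x, "_", idx), x)
}
number_events(events)
# [1] "SCR"   "FUP_1" "FUP_2" "FUP_3"
```

It can be applied per ID in the same way as make.unique, for example ave(as.character(data$Event), data$ID, FUN = number_events).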

Base R, with admittedly ugly column names:

reshape(data, idvar = "ID", v.names="Date", timevar="Event", direction="wide")
#      ID   Date.SCR   Date.FUP Date.FUP.1 Date.FUP.2
# 1.1   1 2016-11-01 2016-11-10 2016-12-01 2017-01-19
# 2.5   2 2017-04-12 2017-04-04 2017-05-30 2017-05-25
# 3.9   3 2018-04-09 2018-05-02 2018-05-29 2018-06-04
# 4.13  4 2017-06-06 2017-07-26 2017-09-07 2017-09-15
# 5.17  5 2016-11-01 2016-11-10 2016-12-01 2017-01-19
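If the "Date." prefix is unwanted, the names can be cleaned up after reshaping. The following is a sketch on a two-ID stand-in for the question's data; the variable name wide and the shortened dataset are assumptions made for illustration.

```r
# Small stand-in for the question's data (two IDs only)
data <- data.frame(ID    = rep(1:2, each = 4),
                   Event = rep(c("SCR", "FUP", "FUP", "FUP"), 2),
                   Date  = c("2016-11-01", "2016-11-10", "2016-12-01", "2017-01-19",
                             "2017-04-12", "2017-04-04", "2017-05-30", "2017-05-25"),
                   stringsAsFactors = FALSE)

# Make the event labels unique within each ID
data$Event <- ave(data$Event, data$ID, FUN = make.unique)

wide <- reshape(data, idvar = "ID", v.names = "Date", timevar = "Event",
                direction = "wide")

# Strip the "Date." prefix that reshape() adds and reset the row names
names(wide) <- sub("^Date\\.", "", names(wide))
rownames(wide) <- NULL
names(wide)
# [1] "ID"    "SCR"   "FUP"   "FUP.1" "FUP.2"
```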

tidyverse

tidyr::spread(data, Event, Date)
#   ID        FUP      FUP.1      FUP.2        SCR
# 1  1 2016-11-10 2016-12-01 2017-01-19 2016-11-01
# 2  2 2017-04-04 2017-05-30 2017-05-25 2017-04-12
# 3  3 2018-05-02 2018-05-29 2018-06-04 2018-04-09
# 4  4 2017-07-26 2017-09-07 2017-09-15 2017-06-06
# 5  5 2016-11-10 2016-12-01 2017-01-19 2016-11-01
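In tidyr 1.0.0 and later, spread is superseded by pivot_wider. A possible sketch that builds the question's SCR, FUP_1, FUP_2, FUP_3 names directly, assuming dplyr and tidyr are installed; the column name Event_n and the two-ID stand-in data are made up for this example.

```r
library(dplyr)
library(tidyr)

# Small stand-in for the question's data (two IDs only)
data <- data.frame(ID    = rep(1:2, each = 4),
                   Event = rep(c("SCR", "FUP", "FUP", "FUP"), 2),
                   Date  = c("2016-11-01", "2016-11-10", "2016-12-01", "2017-01-19",
                             "2017-04-12", "2017-04-04", "2017-05-30", "2017-05-25"),
                   stringsAsFactors = FALSE)

wide <- data %>%
  group_by(ID, Event) %>%
  # Number an event only when it occurs more than once within an ID
  mutate(Event_n = if (n() > 1) paste0(Event, "_", row_number()) else Event) %>%
  ungroup() %>%
  select(-Event) %>%
  pivot_wider(names_from = Event_n, values_from = Date)

names(wide)
# [1] "ID"    "SCR"   "FUP_1" "FUP_2" "FUP_3"
```

Because the labels are made unique before pivoting, the duplicate-identifier problem described in the question does not arise.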

data.table

data.table::dcast(data, ID ~ Event)
# Using 'Date' as value column. Use 'value.var' to override
#   ID        FUP      FUP.1      FUP.2        SCR
# 1  1 2016-11-10 2016-12-01 2017-01-19 2016-11-01
# 2  2 2017-04-04 2017-05-30 2017-05-25 2017-04-12
# 3  3 2018-05-02 2018-05-29 2018-06-04 2018-04-09
# 4  4 2017-07-26 2017-09-07 2017-09-15 2017-06-06
# 5  5 2016-11-10 2016-12-01 2017-01-19 2016-11-01
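A variant that stays within data.table, sketched under the assumption that the input is converted to a data.table first (newer versions of the package may require this for dcast). The column name Event_n and the two-ID stand-in data are illustrative only; note that every event gets a number here, so the screening column becomes SCR_1.

```r
library(data.table)

# Small stand-in for the question's data (two IDs only)
dt <- data.table(ID    = rep(1:2, each = 4),
                 Event = rep(c("SCR", "FUP", "FUP", "FUP"), 2),
                 Date  = c("2016-11-01", "2016-11-10", "2016-12-01", "2017-01-19",
                           "2017-04-12", "2017-04-04", "2017-05-30", "2017-05-25"))

# rowid() counts occurrences within each ID/Event combination
dt[, Event_n := paste0(Event, "_", rowid(ID, Event))]

# Naming value.var explicitly avoids the "Using 'Date' as value column" message
wide <- dcast(dt, ID ~ Event_n, value.var = "Date")

# Move the screening column to the front, right after ID
setcolorder(wide, c("ID", "SCR_1"))
names(wide)
```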

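I'm not saying this is the "best" solution, but it automatically creates those _num labels at the end of the Event values. The result is still in long format, so it needs a final spread on new_event.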

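library(dplyr)  # provides %>%, group_by, mutate, row_number

# Number each event within an ID: FUP_1, FUP_2, FUP_3 (SCR becomes SCR_1)
split(data, data$ID) %>%
  lapply(function(.id) {
    group_by(.id, Event) %>%
      mutate(new_event = paste0(Event, "_", row_number())) %>%
      ungroup()
  }) %>%
  purrr::reduce(rbind) %>%
  dplyr::select(-Event) %>%
  as.data.frame()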
