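I would like to expand my data frame based on its date column, so that the missing dates between my current dates are inserted in chronological order. My date column is already in chronological order, spans 5 years, and contains duplicate dates, which should be left as they are. For the new rows, I would like the corresponding Group and Draw values to be NA.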
zz <- "Date Group Draw
1 2006-05-11 bb T
2 2006-05-11 bb F
3 2006-05-14 aa T
4 2006-05-16 aa T
5 2006-05-20 cc F
6 2006-05-20 bb F
7 2006-05-21 aa T"
Data <- read.table(text=zz, header = TRUE)
So I would like my new data frame to look like this:
xx <- "Date Group Draw
1 2006-05-11 bb T
2 2006-05-11 bb F
3 2006-05-12 NA NA
4 2006-05-13 NA NA
5 2006-05-14 aa T
6 2006-05-15 NA NA
7 2006-05-16 aa T
8 2006-05-17 NA NA
9 2006-05-18 NA NA
10 2006-05-19 NA NA
11 2006-05-20 cc F
12 2006-05-20 bb F
13 2006-05-21 aa T"
Output <- read.table(text=xx, header = TRUE)
Any help would be greatly appreciated. I am new to R and have been trying to do this manually.
I think this should work:
merge(
x = data.frame(
Date = seq.Date(min(df$Date), max(df$Date), by = "day")
),
y = df,
all.x = TRUE
)
# Date Group Draw
# 1 2006-05-11 bb TRUE
# 2 2006-05-11 bb FALSE
# 3 2006-05-12 <NA> NA
# 4 2006-05-13 <NA> NA
# 5 2006-05-14 aa TRUE
# 6 2006-05-15 <NA> NA
# 7 2006-05-16 aa TRUE
# 8 2006-05-17 <NA> NA
# 9 2006-05-18 <NA> NA
# 10 2006-05-19 <NA> NA
# 11 2006-05-20 cc FALSE
# 12 2006-05-20 bb FALSE
# 13 2006-05-21 aa TRUE
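All this does is create a sequence of dates covering the range of the actual data and then perform a left join, so every day in the range appears at least once and days with no data get NA in the other columns.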
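The two steps can also be written out separately, which makes it easier to check the intermediate results. This is only a minimal self-contained sketch using the sample data from the question; the names all_days and expanded are illustrative and not part of the original answer.

```r
# Sample data from the question; the first field of each row becomes the row name
zz <- "Date Group Draw
1 2006-05-11 bb T
2 2006-05-11 bb F
3 2006-05-14 aa T
4 2006-05-16 aa T
5 2006-05-20 cc F
6 2006-05-20 bb F
7 2006-05-21 aa T"
df <- read.table(text = zz, header = TRUE, stringsAsFactors = FALSE)

# read.table() reads the dates as text; seq.Date() needs Date objects
df$Date <- as.Date(df$Date)

# Step 1: one row per calendar day between the first and last observed date
all_days <- data.frame(Date = seq.Date(min(df$Date), max(df$Date), by = "day"))

# Step 2: left join; days without data get NA, duplicate dates are kept
expanded <- merge(all_days, df, by = "Date", all.x = TRUE)

nrow(all_days)               # number of days in the range
nrow(expanded)               # number of rows after the join
sum(is.na(expanded$Group))   # number of filled-in days
```

Because all.x = TRUE keeps every row of the date sequence and merge() returns all matches, a date that occurs twice in the data still occurs twice in the result.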
And the same idea, using data.table:
dt[dt[,.(Date = seq.Date(min(Date), max(Date), by = "day"))], on = .(Date)]
# Date Group Draw
# 1: 2006-05-11 bb TRUE
# 2: 2006-05-11 bb FALSE
# 3: 2006-05-12 NA NA
# 4: 2006-05-13 NA NA
# 5: 2006-05-14 aa TRUE
# 6: 2006-05-15 NA NA
# 7: 2006-05-16 aa TRUE
# 8: 2006-05-17 NA NA
# 9: 2006-05-18 NA NA
# 10: 2006-05-19 NA NA
# 11: 2006-05-20 cc FALSE
# 12: 2006-05-20 bb FALSE
# 13: 2006-05-21 aa TRUE
Data:
zz <- "Date Group Draw
1 2006-05-11 bb T
2 2006-05-11 bb F
3 2006-05-14 aa T
4 2006-05-16 aa T
5 2006-05-20 cc F
6 2006-05-20 bb F
7 2006-05-21 aa T"
df <- read.table(
text = zz,
header = TRUE
)
df$Date <- as.Date(df$Date)
library(data.table)
dt <- data.table(read.table(text = zz, header = TRUE))[,Date := as.Date(Date)]
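For readers new to data.table, the one-liner can be unpacked into two steps. This is a sketch under the assumption that the data.table package is installed; it rebuilds the same sample data directly, and the names all_days and expanded are illustrative only.

```r
library(data.table)

# Same sample data as above, built directly as a data.table
dt <- data.table(
  Date  = as.Date(c("2006-05-11", "2006-05-11", "2006-05-14", "2006-05-16",
                    "2006-05-20", "2006-05-20", "2006-05-21")),
  Group = c("bb", "bb", "aa", "aa", "cc", "bb", "aa"),
  Draw  = c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE)
)

# Step 1: a one-column table holding every day in the observed range
all_days <- dt[, .(Date = seq.Date(min(Date), max(Date), by = "day"))]

# Step 2: dt[all_days, on = ...] looks up each row of all_days in dt,
# so every day is kept and days without a match get NA
expanded <- dt[all_days, on = .(Date)]

nrow(expanded)
```

In dt[all_days, on = .(Date)] the table inside the brackets drives the result, which is why this behaves like a left join from the date sequence onto the data.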
Using the data from @nrussell's post, another option is complete() from tidyr:
library(tidyr)
complete(df, Date = full_seq(Date, 1))
# A tibble: 13 × 3
# Date Group Draw
# <date> <fctr> <lgl>
#1 2006-05-11 bb TRUE
#2 2006-05-11 bb FALSE
#3 2006-05-12 NA NA
#4 2006-05-13 NA NA
#5 2006-05-14 aa TRUE
#6 2006-05-15 NA NA
#7 2006-05-16 aa TRUE
#8 2006-05-17 NA NA
#9 2006-05-18 NA NA
#10 2006-05-19 NA NA
#11 2006-05-20 cc FALSE
#12 2006-05-20 bb FALSE
#13 2006-05-21 aa TRUE
If I understand your question correctly, here is my rough example:
# Build the daily sequence as Date objects, then format it as "YYYY-MM-DD" text
date <- format(seq.Date(from = as.Date("2006-05-11"),
                        to   = as.Date("2006-05-21"),
                        by   = "day"), "%Y-%m-%d")
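The above generates a vector of dates. You can then left join your data (a data.frame or data.table) onto date. Note that format() returns text, so the Date column in your data should be text as well, or date should be converted back with as.Date() before joining.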
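To complete the idea, here is a minimal sketch of that join in base R. It assumes the Date column in the data is kept as text (hence stringsAsFactors = FALSE), so that both join keys have the same type; the name expanded is illustrative.

```r
# Sample data from the question, with Date left as character
zz <- "Date Group Draw
1 2006-05-11 bb T
2 2006-05-11 bb F
3 2006-05-14 aa T
4 2006-05-16 aa T
5 2006-05-20 cc F
6 2006-05-20 bb F
7 2006-05-21 aa T"
df <- read.table(text = zz, header = TRUE, stringsAsFactors = FALSE)

# Daily sequence formatted as "YYYY-MM-DD" text
date <- format(seq.Date(from = as.Date("2006-05-11"),
                        to   = as.Date("2006-05-21"),
                        by   = "day"), "%Y-%m-%d")

# Both key columns are character here, so they can be matched directly
expanded <- merge(data.frame(Date = date, stringsAsFactors = FALSE),
                  df, by = "Date", all.x = TRUE)

# Convert to Date afterwards if date arithmetic is needed later
expanded$Date <- as.Date(expanded$Date)

nrow(expanded)
```

Sorting the text keys still gives chronological order because the ISO format YYYY-MM-DD sorts the same way as the dates themselves. Hard-coding the start and end dates is a simplification; min() and max() of the data would be needed for the full 5-year range.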