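I have a data frame with two date columns (from and to). I want to create a sequence of years and quarters between the two dates, and then reshape the result into a tall (long) data frame.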
df <- structure(list(id = 1:4, from = structure(c(1L, 1L, 2L, 2L), .Label = c("2018-01-01", "2018-04-01"), class = "factor"), to = structure(c(2L, 1L, 4L, 3L), .Label = c("2018-02-01", "2018-04-01", "2018-07-01", "2018-10-01"), class = "factor")), class = "data.frame", row.names = c(NA, -4L))
> df
  id       from         to
1  1 2018-01-01 2018-04-01
2  2 2018-01-01 2018-02-01
3  3 2018-04-01 2018-10-01
4  4 2018-04-01 2018-07-01
In the example above, id 1 falls in Q1 and Q2, while id 2 falls only in Q1.
Expected output:
#id Quarter
#1 Q1 2018
#1 Q2 2018
#2 Q1 2018
#3 Q2 2018
#3 Q3 2018
#3 Q4 2018
#4 Q2 2018
#4 Q3 2018
I am able to get the sequence of quarters using the seq function:
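library(lubridate)  # provides ymd(), quarter() and year()
dsq <- seq(ymd('2018-01-01'), ymd('2018-04-01'), by = 'quarters')
# quarter() and year() are vectorised, so lapply() is not needed
paste0("Q", quarter(dsq), " ", year(dsq))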
But I am looking for a way to reshape my table using the sequence I obtain.
Any help is appreciated.
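We can convert both date columns to yearqtr and gather them (this keeps only the start and end quarter of each range):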
library(zoo)
library(tidyverse)
df %>%
  mutate_at(2:3, zoo::as.yearqtr, format = "%Y-%m-%d") %>%
  gather(key, Quarter, -id) %>%
  select(-key) %>%
  arrange(id)
# id Quarter
#1 1 2018 Q1
#2 1 2018 Q2
#3 2 2018 Q1
#4 2 2018 Q1
#5 3 2018 Q2
#6 3 2018 Q4
#7 4 2018 Q2
#8 4 2018 Q3
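The gathered result holds only the endpoint quarters, so id 3 has no Q3 and id 2 lists Q1 twice. A minimal sketch of one way to fill in every quarter of each range with the same packages follows. It assumes that from and to parse as dates and that each from falls on the first day of a quarter, as in the sample data. The df below just re-creates that sample with character columns, and the names date and long are chosen for illustration only.

```r
library(dplyr)
library(tidyr)
library(zoo)

# Sample data re-created with character columns (assumption: same values as the question)
df <- data.frame(
  id   = 1:4,
  from = c("2018-01-01", "2018-01-01", "2018-04-01", "2018-04-01"),
  to   = c("2018-04-01", "2018-02-01", "2018-10-01", "2018-07-01")
)

long <- df %>%
  mutate(from = as.Date(from), to = as.Date(to)) %>%
  rowwise() %>%
  # one list element per row: every quarter step from `from` up to `to`
  mutate(date = list(seq(from, to, by = "quarter"))) %>%
  ungroup() %>%
  unnest(date) %>%
  # format each date as "Qn YYYY" via zoo's yearqtr class
  transmute(id, Quarter = format(as.yearqtr(date), "Q%q %Y"))

print(long)
```

Unnesting plain Date values and converting to yearqtr afterwards keeps the list column to a type that tidyr handles natively.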
Or try data.table:
library(data.table)
setDT(df)[, `:=`(from = as.Date(from),
                 to   = as.Date(to))
          ][, .(Quarter = as.yearqtr(seq(from, to, "quarter"))), by = id]
If from and to are already of class Date, this simplifies to:
setDT(df)[, .(Quarter = as.yearqtr(seq(from, to, "quarter"))), by = id]
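The yearqtr values print as "2018 Q1", whereas the question asks for "Q1 2018". As a rough sketch, the labels can be reordered with the format method for yearqtr and its %q specifier. The dt object below is an illustrative copy of the sample data with Date columns, and res is a hypothetical name.

```r
library(data.table)
library(zoo)

# Illustrative copy of the sample data, with from/to already of class Date
dt <- data.table(
  id   = 1:4,
  from = as.Date(c("2018-01-01", "2018-01-01", "2018-04-01", "2018-04-01")),
  to   = as.Date(c("2018-04-01", "2018-02-01", "2018-10-01", "2018-07-01"))
)

# Expand each id to one row per quarter, then format as "Qn YYYY"
res <- dt[, .(Quarter = format(as.yearqtr(seq(from, to, by = "quarter")),
                               "Q%q %Y")),
          by = id]

print(res)
```

The Quarter column is then a character vector, so it no longer sorts chronologically across years. Keep the yearqtr column if ordering matters.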
You can do this with a bit of extra code:
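library(lubridate)
library(tidyverse)
# Build the "Qn YYYY" labels for every quarter step between x and y
f <- function(x, y) {
  dsq <- seq(ymd(x), ymd(y), by = 'quarters')
  paste0("Q", quarter(dsq), " ", year(dsq))
}
# One list of labels per row, then expand the list column into rows
df %>% rowwise %>% mutate(quarter = list(f(from, to))) %>% unnest(quarter)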
## A tibble: 8 x 4
# id from to quarter
# <int> <fct> <fct> <chr>
#1 1 2018-01-01 2018-04-01 Q1 2018
#2 1 2018-01-01 2018-04-01 Q2 2018
#3 2 2018-01-01 2018-02-01 Q1 2018
#4 3 2018-04-01 2018-10-01 Q2 2018
#5 3 2018-04-01 2018-10-01 Q3 2018
#6 3 2018-04-01 2018-10-01 Q4 2018
#7 4 2018-04-01 2018-07-01 Q2 2018
#8 4 2018-04-01 2018-07-01 Q3 2018
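All of the seq-based approaches step forward from the start date, so a start in the middle of a quarter could skip the quarter that contains the end date. The sample data never hits that case, but as a hedged base R sketch, the start can be snapped to the first day of its quarter before building the sequence. The helper quarter_labels and the objects pieces and long are hypothetical names, and df re-creates the sample with character columns.

```r
# Sample data re-created with character columns (assumption: same values as the question)
df <- data.frame(
  id   = 1:4,
  from = c("2018-01-01", "2018-01-01", "2018-04-01", "2018-04-01"),
  to   = c("2018-04-01", "2018-02-01", "2018-10-01", "2018-07-01")
)

# Hypothetical helper: "Qn YYYY" labels for every quarter touched by [from, to]
quarter_labels <- function(from, to) {
  from <- as.Date(from)
  to   <- as.Date(to)
  # first month (1, 4, 7 or 10) of the quarter that contains `from`
  m <- as.integer(format(from, "%m"))
  start_month <- 3L * ((m - 1L) %/% 3L) + 1L
  start <- as.Date(sprintf("%s-%02d-01", format(from, "%Y"), start_month))
  dsq <- seq(start, to, by = "quarter")
  paste(quarters(dsq), format(dsq, "%Y"))
}

# One character vector of labels per row, then stack them into a long data frame
pieces <- Map(quarter_labels, as.character(df$from), as.character(df$to))
long <- data.frame(
  id      = rep(df$id, lengths(pieces)),
  Quarter = unlist(pieces, use.names = FALSE)
)

print(long)
```

Because every quarter start is on or before any date inside that quarter, snapping only the start date is enough to cover the quarter that contains to.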