Expanding a time series in R



I have the following sample data:

name <- c("Alpha","Beta")
numerical_ID <- c(1,5)
first_date <- c("2019-01-28","2017-07-16")
last_date <- c("2019-07-19",  "2020-07-14")
interval_calendar_days <- c(30,180)
sample.data <- data.frame(name,numerical_ID,first_date,last_date,interval_calendar_days)

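This means I have a transaction that starts on first_date, recurs every x calendar days (where x = interval_calendar_days), and ends on last_date. The variables name and numerical_ID are attributes carried by every occurrence of the transaction.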

I would like to create the following time series, but I am not sure how:

      name    numerical_ID date        
 [1,] "Alpha" "1"          "2019-01-28"
 [2,] "Alpha" "1"          "2019-02-27"
 [3,] "Alpha" "1"          "2019-03-29"
 [4,] "Alpha" "1"          "2019-04-28"
 [5,] "Alpha" "1"          "2019-05-28"
 [6,] "Alpha" "1"          "2019-06-27"
 [7,] "Alpha" "1"          "2019-07-19"
 [8,] "Beta"  "5"          "2017-07-16"
 [9,] "Beta"  "5"          "2018-01-12"
[10,] "Beta"  "5"          "2018-07-11"
[11,] "Beta"  "5"          "2019-01-07"
[12,] "Beta"  "5"          "2019-07-06"
[13,] "Beta"  "5"          "2020-01-02"
[14,] "Beta"  "5"          "2020-06-30"
[15,] "Beta"  "5"          "2020-07-14"

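One option is to first convert the date columns to class Date, then use pmap to build a seq of dates from first_date to last_date at the interval given in the interval_calendar_days column, append last_date to each sequence, and unnest the resulting list column: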

library(tidyverse)
library(lubridate)
sample.data %>%
     mutate_at(vars(matches("date")), ymd) %>% 
     transmute(name, numerical_ID, date = pmap(select(., 
           first_date, last_date, interval_calendar_days), ~ 
                  c(seq(..1, ..2, by = ..3), ..2))) %>%
     unnest(date)
# A tibble: 15 x 3
#   name  numerical_ID date      
#   <fct>        <dbl> <date>    
# 1 Alpha            1 2019-01-28
# 2 Alpha            1 2019-02-27
# 3 Alpha            1 2019-03-29
# 4 Alpha            1 2019-04-28
# 5 Alpha            1 2019-05-28
# 6 Alpha            1 2019-06-27
# 7 Alpha            1 2019-07-19
# 8 Beta             5 2017-07-16
# 9 Beta             5 2018-01-12
#10 Beta             5 2018-07-11
#11 Beta             5 2019-01-07
#12 Beta             5 2019-07-06
#13 Beta             5 2020-01-02
#14 Beta             5 2020-06-30
#15 Beta             5 2020-07-14
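
To see what happens for a single row, here is a minimal base R sketch of the same idea. The helper name schedule_dates is hypothetical (it is not part of the answer above), and the unique() call is an added assumption: it covers the case where last_date already falls exactly on an interval step, which would otherwise appear twice.

```r
# Hypothetical helper, for illustration only
schedule_dates <- function(first, last, every_days) {
  first <- as.Date(first)
  last  <- as.Date(last)
  # Step forward from `first` in whole days, never going past `last`
  steps <- seq(first, last, by = every_days)
  # Append the closing date; unique() drops it if it is already the final step
  unique(c(steps, last))
}

# Dates for the "Alpha" row of the sample data
alpha <- schedule_dates("2019-01-28", "2019-07-19", 30)

# Edge case: the end date lands exactly on a 30-day step
exact <- schedule_dates("2019-01-01", "2019-01-31", 30)
```

In the pipeline above, the same effect could be had by wrapping the c(seq(...), ..2) expression in unique(), if duplicated end dates are a concern for your data.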

This can also be done in base R with Map:

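# Build one Date vector per row: the interval sequence plus the end date
lst1 <- do.call(Map, c(f = function(x, y, z) 
     c(seq(as.Date(x), as.Date(y), by = z),
        as.Date(y)), unname(sample.data[-(1:2)])))
# Repeat name and numerical_ID once for every generated date
out <-  sample.data[1:2][rep(seq_len(nrow(sample.data)), lengths(lst1)),]
# Assumed final step (the snippet as posted stops before attaching the dates)
out$date <- do.call(c, unname(lst1))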
