R - How to convert wide data to long format while keeping the dates as dates



I have S&P101 data structured like this:

Symbol   Name    X2020.01.02 X2020.01.03 X2020.01.06 X2020.01.07 X2020.01.08 X2020.01.09 
1   AAPL  Apple     75.0875     74.3575       74.95     74.5975     75.7975     77.4075     
2   ABBV AbbVie     89.5500     88.7000       89.40     88.8900     89.5200     90.2100 

Now I convert it to long format, because I am fitting a mixed model:

library(dplyr)   # provides %>%
library(tidyr)   # provides gather()

# convert data to long format for the mixed model
c_data <- ncol(data_2020)
# replace the date column names with day numbers 1, 2, 3, ...
names(data_2020)[3:c_data] <- 1:(c_data - 2)
tempDataLong <- data_2020 %>%
  gather(key = day, value = close, 3:c_data, factor_key = TRUE)
# convert the day factor to numeric for analysis
tempDataLong$day <- as.numeric(tempDataLong$day)

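When I try to use as.Date to convert the day column to dates, it is not accepted because the column is now a factor, and when I first convert it to numeric, the result is irrelevant dates (i.e. in 1970).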
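A minimal sketch of why the 1970 dates appear, using only base R. The `day` factor below is a made-up stand-in for the renamed column, and the three column names are copied from the sample data; nothing else is assumed about the real data set.

```r
# Hypothetical factor resembling `day` after the columns were renamed to 1, 2, 3
day <- factor(c("1", "2", "3"))

# as.numeric() on a factor returns its integer level codes, not dates
day_num <- as.numeric(day)

# A bare number is interpreted as "days since the origin", hence dates in 1970
wrong <- as.Date(day_num, origin = "1970-01-01")
print(wrong)

# The original column names still carry the dates and can be parsed directly;
# the literal "X" in the format string matches the prefix R added to the names
right <- as.Date(c("X2020.01.02", "X2020.01.03", "X2020.01.06"),
                 format = "X%Y.%m.%d")
print(right)
```

The day numbers carry no calendar information, so the dates have to come from the original column names.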

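Note that the dates are not consecutive, because the stock market is not open every day of the week, but for the purposes of my analysis I am allowed to use them this way.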

My question is: how can I convert the day values in the long format back to the dates from the wide format?

Here is what my long format looks like right now:
Symbol                         Name      day   close
1     AAPL                        Apple   1   75.08750
2     ABBV                       AbbVie   1   89.55000
3      ABT          Abbott Laboratories   1   86.95000
4      ACN                    Accenture   1  210.14999
5     ADBE                        Adobe   1  334.42999
6      AIG American International Group   1   51.76000
7     AMGN                        Amgen   1  240.10001
8      AMT               American Tower   1  228.50000

If you change the column names, the date information is lost.
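If the numeric day index has to stay, one possible workaround is to save the dates before the column names are overwritten and look them up afterwards. This is a base R sketch rather than the answer's method; `day_lookup` and `long` are hypothetical names, and the data frame is a three-day subset of the sample data.

```r
# Three-day subset of data_2020 (values copied from the question)
data_2020 <- data.frame(
  Symbol = c("AAPL", "ABBV"),
  Name = c("Apple", "AbbVie"),
  X2020.01.02 = c(75.0875, 89.55),
  X2020.01.03 = c(74.3575, 88.7),
  X2020.01.06 = c(74.95, 89.4)
)

# Save the dates BEFORE the column names are replaced with 1, 2, 3, ...
date_cols <- names(data_2020)[-(1:2)]
day_lookup <- as.Date(date_cols, format = "X%Y.%m.%d")

# Hypothetical long data with a numeric day index, shaped like the question's
long <- data.frame(
  Symbol = rep(data_2020$Symbol, times = length(date_cols)),
  day = rep(seq_along(date_cols), each = nrow(data_2020)),
  close = unlist(data_2020[date_cols], use.names = FALSE)
)

# Element i of day_lookup is the calendar date of day i
long$date <- day_lookup[long$day]
print(long)
```

This keeps `day` numeric for the model while `date` holds the real calendar date for each row.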

Since gather has been superseded, try pivot_longer instead:

library(dplyr)
library(tidyr)
tempDataLong <- data_2020 %>%
  pivot_longer(cols = starts_with('X'),
               names_to = 'day',
               names_pattern = 'X(.*)') %>%
  mutate(day = lubridate::ymd(day))
tempDataLong
#   Symbol Name   day        value
#   <chr>  <chr>  <date>     <dbl>
# 1 AAPL   Apple  2020-01-02  75.1
# 2 AAPL   Apple  2020-01-03  74.4
# 3 AAPL   Apple  2020-01-06  75.0
# 4 AAPL   Apple  2020-01-07  74.6
# 5 AAPL   Apple  2020-01-08  75.8
# 6 AAPL   Apple  2020-01-09  77.4
# 7 ABBV   AbbVie 2020-01-02  89.6
# 8 ABBV   AbbVie 2020-01-03  88.7
# 9 ABBV   AbbVie 2020-01-06  89.4
#10 ABBV   AbbVie 2020-01-07  88.9
#11 ABBV   AbbVie 2020-01-08  89.5
#12 ABBV   AbbVie 2020-01-09  90.2

data_2020 <- structure(list(Symbol = c("AAPL", "ABBV"), Name = c("Apple", 
"AbbVie"), X2020.01.02 = c(75.0875, 89.55), X2020.01.03 = c(74.3575, 
88.7), X2020.01.06 = c(74.95, 89.4), X2020.01.07 = c(74.5975, 
88.89), X2020.01.08 = c(75.7975, 89.52), X2020.01.09 = c(77.4075, 
90.21)), class = "data.frame", row.names = c("1", "2"))
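Because the mixed model needs a numeric day and the question also asks about getting back to the wide layout, the sketch below keeps both a Date column and a day number, then reverses the reshape with pivot_wider. It is one possible approach under stated assumptions: `day_index`, `long` and `wide` are illustrative names, the value column is named `close` to match the question, and the data frame is a three-day subset of the sample data.

```r
library(dplyr)
library(tidyr)

# Three-day subset of the sample data above
data_2020 <- data.frame(
  Symbol = c("AAPL", "ABBV"),
  Name = c("Apple", "AbbVie"),
  X2020.01.02 = c(75.0875, 89.55),
  X2020.01.03 = c(74.3575, 88.7),
  X2020.01.06 = c(74.95, 89.4)
)

long <- data_2020 %>%
  pivot_longer(cols = starts_with("X"),
               names_to = "day",
               names_prefix = "^X",
               values_to = "close") %>%
  mutate(day = as.Date(day, format = "%Y.%m.%d"),
         # day_index: 1 for the first trading day, 2 for the next, and so on
         day_index = dense_rank(day))

# Back to wide format; the new column names are the dates as text
wide <- long %>%
  select(-day_index) %>%
  pivot_wider(names_from = day, values_from = close)

print(long)
print(wide)
```

`day_index` is dropped before widening because pivot_wider would otherwise treat it as an identifying column and give each day its own row.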
