我用这种方式构建了S&P101的数据:
Symbol Name X2020.01.02 X2020.01.03 X2020.01.06 X2020.01.07 X2020.01.08 X2020.01.09
1 AAPL Apple 75.0875 74.3575 74.95 74.5975 75.7975 77.4075
2 ABBV AbbVie 89.5500 88.7000 89.40 88.8900 89.5200 90.2100
现在,我把它变成长格式,因为我使用混合模型:
#convert data to long format for mixed model
c_data = ncol(data_2020)
#convert the dates into numbers
names(data_2020)[3:c_data]<- 1:(c_data-2)
tempDataLong <- data_2020 %>% gather( key = day, value= close, 3:c_data, factor_key = TRUE )
#convert data to numeric for analysis
tempDataLong$day<- as.numeric(tempDataLong$day)
当我尝试使用函数as时。日期将数据转换为日期,它不接受它,因为它现在是一个因子,当我先将其改为数字,然后将其更改为无关日期(即1970)
请注意,这些日期现在不是连续的,因为股票市场不是一周工作三天,但为了我的分析目的,我被允许这样使用它们。
我的问题是——如何将长格式的数据转换回宽格式的日期?
Here is how my long format looks like right now:
Symbol Name day close
1 AAPL Apple 1 75.08750
2 ABBV AbbVie 1 89.55000
3 ABT Abbott Laboratories 1 86.95000
4 ACN Accenture 1 210.14999
5 ADBE Adobe 1 334.42999
6 AIG American International Group 1 51.76000
7 AMGN Amgen 1 240.10001
8 AMT American Tower 1 228.50000
如果更改列名,将丢失日期信息。
在gather
已被取代的情况下,尝试使用pivot_longer
。
library(dplyr)
library(tidyr)
tempDataLong <- data_2020 %>%
pivot_longer(cols = starts_with('X'),
names_to = 'day',
names_pattern = 'X(.*)') %>%
mutate(day = lubridate::ymd(day))
tempDataLong
# Symbol Name day value
# <chr> <chr> <date> <dbl>
# 1 AAPL Apple 2020-01-02 75.1
# 2 AAPL Apple 2020-01-03 74.4
# 3 AAPL Apple 2020-01-06 75.0
# 4 AAPL Apple 2020-01-07 74.6
# 5 AAPL Apple 2020-01-08 75.8
# 6 AAPL Apple 2020-01-09 77.4
# 7 ABBV AbbVie 2020-01-02 89.6
# 8 ABBV AbbVie 2020-01-03 88.7
# 9 ABBV AbbVie 2020-01-06 89.4
#10 ABBV AbbVie 2020-01-07 88.9
#11 ABBV AbbVie 2020-01-08 89.5
#12 ABBV AbbVie 2020-01-09 90.2
data_2020 <- structure(list(Symbol = c("AAPL", "ABBV"), Name = c("Apple",
"AbbVie"), X2020.01.02 = c(75.0875, 89.55), X2020.01.03 = c(74.3575,
88.7), X2020.01.06 = c(74.95, 89.4), X2020.01.07 = c(74.5975,
88.89), X2020.01.08 = c(75.7975, 89.52), X2020.01.09 = c(77.4075,
90.21)), class = "data.frame", row.names = c("1", "2"))