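I am trying to run several analyses in R: one on my full data sample and one on each of two subsamples.
My data covers 1975-12 to 2019-12.
Is there a way to run the following code only on dates from 1975-12 to 1995-12?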
OBS_return_Equal <- FF5_class %>%
group_by(date, Hold) %>%
summarize(Ret_PF = mean(ret,na.rm = TRUE)) %>%
spread(Hold, Ret_PF)
A snippet of my data might look like this:
Date Return Hold
1975-12 4% Big.Value
1976-01 10% Big.Neutral
1976-02 7% Big.Value
1976-03 2% Small.Growth
1976-04 5% Small.Value
1976-05 0% Small.Neutral
1976-06 4% Small.Value
1976-07 2% Small.Growth
1976-08 4% Small.Neutral
1976-09 9% Small.Growth
1976-10 6% Big.Neutral
1976-11 1% Big.Growth
1976-12 0% Big.Neutral
1977-01 5% Big.Value
1977-02 0% Small.Neutral
1977-03 0% Small.Growth
1977-04 6% Small.Neutral
1977-05 2% Small.Value
1977-06 5% Small.Value
1977-07 3% Big.Growth
1977-08 7% Small.Neutral
1977-09 10% Big.Growth
1977-10 10% Big.Growth
1977-11 9% Small.Value
1977-12 2% Small.Growth
1978-01 8% Small.Growth
1978-02 0% Small.Growth
1978-03 0% Big.Growth
1978-04 8% Big.Growth
1978-05 10% Small.Value
1978-06 4% Big.Value
1978-07 9% Small.Value
1978-08 3% Big.Growth
1978-09 6% Big.Neutral
1978-10 0% Big.Value
1978-11 9% Small.Value
1978-12 3% Small.Neutral
1979-01 7% Small.Neutral
1979-02 9% Small.Neutral
1979-03 10% Big.Neutral
1979-04 9% Small.Growth
Convert the dates to class `Date` and filter the data to only the dates you are interested in.
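library(dplyr)
OBS_return_Equal <- FF5_class %>%
  # "1975-12" becomes the Date 1975-12-01 (first day of the month)
  mutate(Date = as.Date(paste(Date, 1, sep = '-'))) %>%
  # bounds are also the first of the month, so 1975-12 itself is kept
  filter(between(Date, as.Date('1975-12-01'), as.Date('1995-12-01'))) %>%
  group_by(Date, Hold) %>%
  # parse_number() strips the "%" sign before averaging
  summarize(Ret_PF = mean(readr::parse_number(Return), na.rm = TRUE)) %>%
  tidyr::pivot_wider(names_from = Hold, values_from = Ret_PF) %>%
  ungroup()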
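Since the question asks for the full sample plus two subsamples, the same pipeline could be wrapped in a small function and called once per date range. This is only a sketch: `summarise_period` is a hypothetical helper name, `toy` is made-up illustrative data with the same columns as the sample, and the second subsample's range (1996-01 to 2019-12) is an assumption, because the question only spells out the first one.

```r
library(dplyr)
library(tidyr)

# Hypothetical helper (not part of the original answer): mean return per
# Date and Hold for all months between `from` and `to`, both inclusive.
summarise_period <- function(data, from, to) {
  data %>%
    mutate(Date = as.Date(paste(Date, 1, sep = "-"))) %>%
    filter(between(Date, as.Date(from), as.Date(to))) %>%
    group_by(Date, Hold) %>%
    summarize(Ret_PF = mean(readr::parse_number(Return), na.rm = TRUE),
              .groups = "drop") %>%
    pivot_wider(names_from = Hold, values_from = Ret_PF)
}

# Made-up rows, only to show the calls; use FF5_class in practice
toy <- data.frame(
  Date   = c("1975-12", "1995-12", "1996-01", "2019-12"),
  Return = c("4%", "10%", "7%", "2%"),
  Hold   = c("Big.Value", "Big.Neutral", "Big.Value", "Small.Growth")
)

full_sample <- summarise_period(toy, "1975-12-01", "2019-12-01")
sub_early   <- summarise_period(toy, "1975-12-01", "1995-12-01")
# Assumed range for the second subsample
sub_late    <- summarise_period(toy, "1996-01-01", "2019-12-01")
```

Passing the bounds as first-of-month dates keeps them consistent with how the `Date` column is built inside the function.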
Data used in the example:
FF5_class <- structure(list(Date = c("1975-12", "1976-01", "1976-02", "1976-03",
"1976-04", "1976-05", "1976-06", "1976-07", "1976-08", "1976-09",
"1976-10", "1976-11", "1976-12", "1977-01", "1977-02", "1977-03",
"1977-04", "1977-05", "1977-06", "1977-07", "1977-08", "1977-09",
"1977-10", "1977-11", "1977-12", "1978-01", "1978-02", "1978-03",
"1978-04", "1978-05", "1978-06", "1978-07", "1978-08", "1978-09",
"1978-10", "1978-11", "1978-12", "1979-01", "1979-02", "1979-03",
"1979-04"), Return = c("4%", "10%", "7%", "2%", "5%", "0%", "4%",
"2%", "4%", "9%", "6%", "1%", "0%", "5%", "0%", "0%", "6%", "2%",
"5%", "3%", "7%", "10%", "10%", "9%", "2%", "8%", "0%", "0%",
"8%", "10%", "4%", "9%", "3%", "6%", "0%", "9%", "3%", "7%",
"9%", "10%", "9%"), Hold = c("Big.Value", "Big.Neutral", "Big.Value",
"Small.Growth", "Small.Value", "Small.Neutral", "Small.Value",
"Small.Growth", "Small.Neutral", "Small.Growth", "Big.Neutral",
"Big.Growth", "Big.Neutral", "Big.Value", "Small.Neutral", "Small.Growth",
"Small.Neutral", "Small.Value", "Small.Value", "Big.Growth",
"Small.Neutral", "Big.Growth", "Big.Growth", "Small.Value", "Small.Growth",
"Small.Growth", "Small.Growth", "Big.Growth", "Big.Growth", "Small.Value",
"Big.Value", "Small.Value", "Big.Growth", "Big.Neutral", "Big.Value",
"Small.Value", "Small.Neutral", "Small.Neutral", "Small.Neutral",
"Big.Neutral", "Small.Growth")), class = "data.frame", row.names = c(NA, -41L))