对于每一天,我希望看到当天发生的所有事务的最终结果(使用As_Of_Dt作为当天的指标)。例如,在第一天(1/1/22),只有一个交易,因此期望看到As_Of_Dt的一行。第二天(2022年1月2日),有两个交易,所以预计会看到As_Of_Dt的三个交易(来自1/2/22的两个交易,加上来自1/1/22的一个交易),等等,在未来的所有日子里。
下面是一个数据集的例子,我能够使用for循环来做到这一点,但是我的真实数据集要大得多,我正在寻找一个更快的解决方案(可能使用apply
家族?)
df <- data.frame(TradeDate = c("2022-01-01","2022-01-02","2022-01-02", "2022-01-03","2022-01-04"),
Ticker = c("AAPL", "AMZN", "META", "AAPL","TSLA"),
Type = c("Buy", "Buy", "Buy", "Buy","Buy"),
Shares = c(10, 12, 5, 15, 20))
df1 <- df[0, ]
dtStart <- as_date("2022-01-01")
dtEnd <- as_date("2022-01-04")
dtRng <- interval(dtStart, dtEnd) %/% days(1) + 1
for(i in 1:dtRng){
df2 <- df %>%
filter(TradeDate <= (dtStart + i - 1)) %>%
mutate(As_Of_Dt = as_date(dtStart) + i - 1)
df1 <- bind_rows(df1, df2)
}
<表类>TradeDate 股票 型股票As_Of_Dt tbody><<tr>2022-01-01 apple 买 10 2022-01-01 2022-01-01 apple 买 10 2022-01-02 2022-01-02 amazon 买 12 2022-01-02 2022-01-02 元 买 5 2022-01-02 2022-01-01 apple 买 10 2022-01-03 2022-01-02 amazon 买 12 2022-01-03 2022-01-02 元 买 5 2022-01-03 2022-01-03 apple 买 15 2022-01-03 2022-01-01 apple 买 10 2022-01-04 2022-01-02 amazon 买 12 2022-01-04 2022-01-02 元 买 5 2022-01-04 2022-01-03 apple 买 15 2022-01-04 2022-01-04 厂商 买 20 2022-01-04 表类>
我希望这比您的循环更有效,但不如非相等连接有效。
library(dplyr)
df %>%
distinct(as_of = TradeDate) %>%
left_join(df, by = character()) %>%
filter(TradeDate <= as_of)
或者使用dplyr的开发版本,该版本引入了非对等连接
devtools::install_github("tidyverse/dplyr")
df %>%
distinct(as_of = TradeDate) %>%
left_join(df, join_by(as_of >= TradeDate))
结果
as_of TradeDate Ticker Type Shares
1 2022-01-01 2022-01-01 AAPL Buy 10
2 2022-01-02 2022-01-01 AAPL Buy 10
3 2022-01-02 2022-01-02 AMZN Buy 12
4 2022-01-02 2022-01-02 META Buy 5
5 2022-01-03 2022-01-01 AAPL Buy 10
6 2022-01-03 2022-01-02 AMZN Buy 12
7 2022-01-03 2022-01-02 META Buy 5
8 2022-01-03 2022-01-03 AAPL Buy 15
9 2022-01-04 2022-01-01 AAPL Buy 10
10 2022-01-04 2022-01-02 AMZN Buy 12
11 2022-01-04 2022-01-02 META Buy 5
12 2022-01-04 2022-01-03 AAPL Buy 15
13 2022-01-04 2022-01-04 TSLA Buy 20