计算R中每个ID在活动日期前后的买入和持有回报



我昨天写了一个类似的问题,但随着时间的推移,问题变得更加复杂,所以我想再问一次。

在以下情况下,我有一个关于计算回报的问题。对于每个ID,如果Date=EventDate,我希望计算"事件日期";买入并持有回报;从活动日期前5天到活动日期后5天。

为了更具体地使用下表,我想计算每个ID在事件日期(-5~+5天(前后的回报率,其中回报率为,例如,ID=1时的(9/10-1(、(12/10-1(、(14/10-1(,(17/10-1(和(16/10-1((也就是说,事件日期前六天的价格是回报率计算中的分母。(

+----+------------+-------+------------+
| ID |    Date    | Price | EventDate  |
+----+------------+-------+------------+
|  1 | 2011-03-06 |    10 | NA         |
|  1 | 2011-03-07 |     9 | NA         |
|  1 | 2011-03-08 |    12 | NA         |
|  1 | 2011-03-09 |    14 | NA         |
|  1 | 2011-03-10 |    15 | NA         |
|  1 | 2011-03-11 |    17 | NA         |
|  1 | 2011-03-12 |    12 | 2011-03-12 |
|  1 | 2011-03-13 |    14 | NA         |
|  1 | 2011-03-14 |    17 | NA         |
|  1 | 2011-03-15 |    14 | NA         |
|  1 | 2011-03-16 |    17 | NA         |
|  1 | 2011-03-17 |    16 | NA         |
|  1 | 2011-03-18 |    15 | NA         |
|  1 | 2011-03-19 |    16 | NA         |
|  1 | 2011-03-20 |    17 | NA         |
|  1 | 2011-03-21 |    18 | NA         |
|  1 | 2011-03-22 |    11 | NA         |
|  1 | 2011-03-23 |    15 | NA         |
|  1 | 2011-03-24 |    12 | 2011-03-24 |
|  1 | 2011-03-25 |    13 | NA         |
|  1 | 2011-03-26 |    15 | NA         |
|  2 | 2011-06-11 |    48 | NA         |
|  2 | 2011-06-12 |    49 | NA         |
|  2 | 2011-06-13 |    50 | NA         |
|  2 | 2011-06-14 |    57 | NA         |
|  2 | 2011-06-15 |    60 | NA         |
|  2 | 2011-06-16 |    49 | NA         |
|  2 | 2011-06-17 |    64 | NA         |
|  2 | 2011-06-18 |    63 | NA         |
|  2 | 2011-06-19 |    67 | 2011-06-19 |
|  2 | 2011-06-20 |    70 | NA         |
|  2 | 2011-06-21 |    58 | NA         |
|  2 | 2011-06-22 |    65 | NA         |
|  2 | 2011-06-23 |    57 | NA         |
|  2 | 2011-06-24 |    55 | NA         |
|  2 | 2011-06-25 |    57 | NA         |
|  2 | 2011-06-26 |    60 | NA         |
+----+------------+-------+------------+

最后,我希望在下表中添加一个新的专栏。

+----+------------+-------+------------+---------------+
| ID |    Date    | Price | EventDate  | BuyHoldReturn |
+----+------------+-------+------------+---------------+
|  1 | 2011-03-06 |    10 | NA         | NA            |
|  1 | 2011-03-07 |     9 | NA         | -0.1          |
|  1 | 2011-03-08 |    12 | NA         | 0.2           |
|  1 | 2011-03-09 |    14 | NA         | 0.4           |
|  1 | 2011-03-10 |    15 | NA         | 0.5           |
|  1 | 2011-03-11 |    17 | NA         | 0.7           |
|  1 | 2011-03-12 |    12 | 2011-03-12 | 0.2           |
|  1 | 2011-03-13 |    14 | NA         | 0.4           |
|  1 | 2011-03-14 |    17 | NA         | 0.7           |
|  1 | 2011-03-15 |    14 | NA         | 0.4           |
|  1 | 2011-03-16 |    17 | NA         | 0.7           |
|  1 | 2011-03-17 |    16 | NA         | 0.6           |
|  1 | 2011-03-18 |    15 | NA         | NA            |
|  1 | 2011-03-19 |    16 | NA         | 0.066666667   |
|  1 | 2011-03-20 |    17 | NA         | 0.133333333   |
|  1 | 2011-03-21 |    18 | NA         | 0.2           |
|  1 | 2011-03-22 |    11 | NA         | -0.266666667  |
|  1 | 2011-03-23 |    15 | NA         | 0             |
|  1 | 2011-03-24 |    12 | 2011-03-24 | -0.2          |
|  1 | 2011-03-25 |    13 | NA         | -0.133333333  |
|  1 | 2011-03-26 |    15 | NA         | 0             |
|  2 | 2011-06-11 |    48 | NA         | NA            |
|  2 | 2011-06-12 |    49 | NA         | NA            |
|  2 | 2011-06-13 |    50 | NA         | NA            |
|  2 | 2011-06-14 |    57 | NA         | 0.14          |
|  2 | 2011-06-15 |    60 | NA         | 0.2           |
|  2 | 2011-06-16 |    49 | NA         | -0.02         |
|  2 | 2011-06-17 |    64 | NA         | 0.28          |
|  2 | 2011-06-18 |    63 | NA         | 0.26          |
|  2 | 2011-06-19 |    67 | 2011-06-19 | 0.34          |
|  2 | 2011-06-20 |    70 | NA         | 0.4           |
|  2 | 2011-06-21 |    58 | NA         | 0.16          |
|  2 | 2011-06-22 |    65 | NA         | 0.3           |
|  2 | 2011-06-23 |    57 | NA         | 0.14          |
|  2 | 2011-06-24 |    55 | NA         | 0.1           |
|  2 | 2011-06-25 |    57 | NA         | NA            |
|  2 | 2011-06-26 |    60 | NA         | NA            |
+----+------------+-------+------------+---------------+

我有一个使用下面代码的想法,但不知道如何计算活动日期前后的11次买入和持有回报。

data<-data%>%
group_by(ID)%>%
mutate(BuyHoldReturn=ifelse(Date==EventDate, ....

提前感谢!

您可以按如下方式使用dplyr

# ensure Date and EventDate are Date columns
df <- df %>% mutate(across(c(Date,EventDate), ~as.Date(.x)))

left_join(
select(df,ID, Date), 
inner_join(df %>% select(-EventDate),filter(df,!is.na(EventDate)) %>% distinct(ID, EventDate), by="ID") %>%
filter(abs(EventDate-Date)<=6) %>% 
group_by(ID, EventDate) %>% 
mutate(BuyHoldReturn = c(NA,Price[-1]/Price[1]-1)),
by=c("ID", "Date")
)

输出:

ID       Date Price  EventDate BuyHoldReturn
1   1 2011-03-06    10 2011-03-12            NA
2   1 2011-03-07     9 2011-03-12   -0.10000000
3   1 2011-03-08    12 2011-03-12    0.20000000
4   1 2011-03-09    14 2011-03-12    0.40000000
5   1 2011-03-10    15 2011-03-12    0.50000000
6   1 2011-03-11    17 2011-03-12    0.70000000
7   1 2011-03-12    12 2011-03-12    0.20000000
8   1 2011-03-13    14 2011-03-12    0.40000000
9   1 2011-03-14    17 2011-03-12    0.70000000
10  1 2011-03-15    14 2011-03-12    0.40000000
11  1 2011-03-16    17 2011-03-12    0.70000000
12  1 2011-03-17    16 2011-03-12    0.60000000
13  1 2011-03-18    15 2011-03-12    0.50000000
14  1 2011-03-18    15 2011-03-24            NA
15  1 2011-03-19    16 2011-03-24    0.06666667
16  1 2011-03-20    17 2011-03-24    0.13333333
17  1 2011-03-21    18 2011-03-24    0.20000000
18  1 2011-03-22    11 2011-03-24   -0.26666667
19  1 2011-03-23    15 2011-03-24    0.00000000
20  1 2011-03-24    12 2011-03-24   -0.20000000
21  1 2011-03-25    13 2011-03-24   -0.13333333
22  1 2011-03-26    15 2011-03-24    0.00000000
23  2 2011-06-11    NA       <NA>            NA
24  2 2011-06-12    NA       <NA>            NA
25  2 2011-06-13    50 2011-06-19            NA
26  2 2011-06-14    57 2011-06-19    0.14000000
27  2 2011-06-15    60 2011-06-19    0.20000000
28  2 2011-06-16    49 2011-06-19   -0.02000000
29  2 2011-06-17    64 2011-06-19    0.28000000
30  2 2011-06-18    63 2011-06-19    0.26000000
31  2 2011-06-19    67 2011-06-19    0.34000000
32  2 2011-06-20    70 2011-06-19    0.40000000
33  2 2011-06-21    58 2011-06-19    0.16000000
34  2 2011-06-22    65 2011-06-19    0.30000000
35  2 2011-06-23    57 2011-06-19    0.14000000
36  2 2011-06-24    55 2011-06-19    0.10000000
37  2 2011-06-25    57 2011-06-19    0.14000000
38  2 2011-06-26    NA       <NA>            NA

这可以使用data.table进行类似操作

library(data.table)
setDT(df)
df[,(c("Date", "EventDate")):=lapply(.SD, as.Date), .SDcols=c("Date", "EventDate")]
df[,!c("EventDate")][unique(df[!is.na(EventDate), .(ID, EventDate)]), on="ID", allow.cartesian=T][
abs(EventDate-Date)<=6][,BuyHoldReturn:=c(NA,Price[-1]/Price[1]-1), .(ID,EventDate)][
df[,.(ID,Date)], on=.(ID,Date)
]

以上两种方法都使用笛卡尔连接,然后进行过滤。如果你有一个非常大的表,并且你想提高速度,你也可以使用data.table非等连接来隔离每个IDEventDate(s(之前和之后的行

  1. 加载库并将数据设置为data.table
library(data.table)
setDT(dt)
  1. 创建一个事件帧,它只有ID和events
events = unique(df[!is.na(EventDate),.(ID,EventDate)])
ID  EventDate
<int>     <IDat>
1:     1 2011-03-12
2:     1 2011-03-24
3:     2 2011-06-19
  1. 向框架中添加一些辅助列(这有助于保留非等联接中的列
events[, eDate:=EventDate]
df[, `:=`(s=Date-6, e=Date+6)]
  1. 使用非等联接:要求ID匹配,EventDate介于上面创建的下限和上限之间(即se(
bhr = events[df, on=.(ID, EventDate>=s, EventDate<=e), nomatch=0]
  1. 通过IDEventDate生成BuyHoldReturn
bhr = bhr[, .(Date, BuyHoldReturn=c(NA, Price[-1]/Price[1] -1)), by = .(ID,eDate)]
  1. 合并回原始帧以保留具有NA的行,并选择具有所需名称和排列的列
bhr[df,on=.(ID,Date),.(ID,Date,Price,EventDate=i.EventDate,BuyHoldReturn)]

输入:

structure(list(ID = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), Date = c("2011-03-06", 
"2011-03-07", "2011-03-08", "2011-03-09", "2011-03-10", "2011-03-11", 
"2011-03-12", "2011-03-13", "2011-03-14", "2011-03-15", "2011-03-16", 
"2011-03-17", "2011-03-18", "2011-03-19", "2011-03-20", "2011-03-21", 
"2011-03-22", "2011-03-23", "2011-03-24", "2011-03-25", "2011-03-26", 
"2011-06-11", "2011-06-12", "2011-06-13", "2011-06-14", "2011-06-15", 
"2011-06-16", "2011-06-17", "2011-06-18", "2011-06-19", "2011-06-20", 
"2011-06-21", "2011-06-22", "2011-06-23", "2011-06-24", "2011-06-25", 
"2011-06-26"), Price = c(10L, 9L, 12L, 14L, 15L, 17L, 12L, 14L, 
17L, 14L, 17L, 16L, 15L, 16L, 17L, 18L, 11L, 15L, 12L, 13L, 15L, 
48L, 49L, 50L, 57L, 60L, 49L, 64L, 63L, 67L, 70L, 58L, 65L, 57L, 
55L, 57L, 60L), EventDate = c(NA, NA, NA, NA, NA, NA, "2011-03-12", 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "2011-03-24", NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, "2011-06-19", NA, NA, NA, 
NA, NA, NA, NA)), class = "data.frame", row.names = c(NA, -37L
))

相关内容

  • 没有找到相关文章

最新更新