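I have been trying to compute the product of consecutive items in a data frame or list over intervals of varying length. Essentially, I want to compute nQx from a list of Qx values, given irregular interval sizes.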
dComp <- data.frame(AGE = seq(0,74), MORTALITY=c(869,58,40,37,36,35,32,28,29,23,24,22,24,28,
33,52,57,77,93,103,103,109,105,114,108,112,119,125,117,127,125,134,134,131,152,179,173,
182,199,203,232,245,296,315,335,356,405,438,445,535,594,623,693,749,816,915,994,1128,1172,
1294,1473,1544,1721,1967,2129,2331,2559,2901,3203,3470,3782,4348,4714,5245,5646)/100000)
x <- c(0,1,5,10,15,20,25,30,35,40,45,50,55,60,65,70)
n <- c(diff(x),999)
n
[1] 1 4 5 5 5 5 5 5 5 5 5 5 5 5 5 999
For a single interval, I am able to find the value.
First, compute Px:
Px <- sapply(dComp$MORTALITY, function(Qx) (1 - Qx))
Then, for the interval x = [1,4] (ages 1 to 4, which are elements 2 to 5 because AGE starts at 0):
1 - prod(Px[2:5])
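How can I do this across the whole list of intervals? In VBA I would use a for loop, but I understand that in R the apply family of functions is used instead. PS: Can anyone recommend a good R manual?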
You can combine tapply and cut:
## no need for sapply in your Px calculation
Px <- 1 - dComp$MORTALITY
## define intervals
breaks <- c(0,1,5,10,15,20,25,30,35,40,45,50,55,60,65,70, 999)
## using tapply to run the function for each interval (use cut for grouping by AGE)
tapply(X=Px, INDEX=cut(dComp$AGE, breaks=breaks, right=FALSE), FUN=function(x)1-prod(x))
Output:
      [0,1)       [1,5)      [5,10)     [10,15)
0.008690000 0.001708920 0.001469140 0.001309318
    [15,20)     [20,25)     [25,30)     [30,35)
0.003814265 0.005378395 0.005985625 0.006741766
    [35,40)     [40,45)     [45,50)     [50,55)
0.009325056 0.014149626 0.021601755 0.034271934
    [55,60)     [60,65)     [65,70)    [70,999)
0.053836246 0.085287751 0.136549522 0.215953304
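Since the question mentions a VBA-style for loop, here is a minimal sketch of the equivalent explicit loop, mainly as a cross-check on the tapply/cut result. The names `last_age`, `nQx_loop` and `nQx_tapply` are mine and do not appear in the question. Capping the open-ended last interval at the oldest age in the data is an assumption on my part.

```r
# Data from the question: ages 0..74 and one-year death probabilities (Qx)
dComp <- data.frame(AGE = seq(0,74), MORTALITY=c(869,58,40,37,36,35,32,28,29,23,24,22,24,28,
33,52,57,77,93,103,103,109,105,114,108,112,119,125,117,127,125,134,134,131,152,179,173,
182,199,203,232,245,296,315,335,356,405,438,445,535,594,623,693,749,816,915,994,1128,1172,
1294,1473,1544,1721,1967,2129,2331,2559,2901,3203,3470,3782,4348,4714,5245,5646)/100000)

# One-year survival probabilities
Px <- 1 - dComp$MORTALITY

# Interval start ages, as in the question
x <- c(0,1,5,10,15,20,25,30,35,40,45,50,55,60,65,70)

# Inclusive end age of each interval (hypothetical helper vector);
# assumption: the open-ended last interval stops at the oldest age in the data
last_age <- c(x[-1] - 1, max(dComp$AGE))

# Explicit loop: 1 - product of Px over the ages in each interval
nQx_loop <- numeric(length(x))
for (i in seq_along(x)) {
  in_interval <- dComp$AGE >= x[i] & dComp$AGE <= last_age[i]
  nQx_loop[i] <- 1 - prod(Px[in_interval])
}

# Same calculation with tapply and cut, for comparison
breaks <- c(x, 999)
nQx_tapply <- tapply(Px, cut(dComp$AGE, breaks = breaks, right = FALSE),
                     function(p) 1 - prod(p))

# TRUE if both approaches agree up to floating-point tolerance
all.equal(nQx_loop, as.vector(nQx_tapply))
```

The loop is easier to step through when debugging, while the tapply version is shorter and keeps the interval labels as names on the result.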