R 数据帧:日期范围操作 - 子集属于特定日期范围的行



如果你看到我的个人资料,我所有的问题都在数据帧上,这是另一个!

我有一个特定的数据帧,它是借方和贷方交易之间合并的结果

>head(allTxns)
   Cust_no    CreditDate   Credit     DebitDate   Debit
 1  12345     2014-10-01    200      2014-10-03    400
 2  12345     2014-10-01    200      2014-10-04    150
 3  12345     2014-10-01    200      2014-10-15    800     
 4  33344     2014-10-03    500      2014-10-04    50
 5  33344     2014-10-03    500      2014-10-05    504
 6  33344     2014-10-03    500      2014-10-06    332
 7  33344     2014-10-03    500      2014-10-08    56
 8  66554     2014-10-10    660      2014-10-04    150     
 9  66554     2014-10-10    660      2014-10-05    800
10  66554     2014-10-10    660      2014-10-11    400
11  66554     2014-10-10    660      2014-10-12    150
12  66554     2014-10-10    660      2014-10-13    800

我的目标是获取这些行,其中 DebitDate 位于 CreditDate 的 5 天之间,因此我尝试对数据进行子集化,我使用 : 运算符放置日期范围

FiveDays <- allTxns$CreditDate+5 #Results in a vector which has date + 5 days
allTxns <- cbind(allTxns[1:2],FiveDays,allTxns[4:6]) #Adding the vector as a column of dataframe
newDf <- allTxns[allTxns$DebitDate %in% allTxns$CreditDate:allTxns$FiveDays]

在上面的代码中,我收到以下逻辑错误,其中仅使用第一个元素

Warning messages:
  1: In mer32$DepositDate:mer32$FiveDays2 :
      numerical expression has 3994 elements: only the first used
  2: In mer32$DepositDate:mer32$FiveDays2 :
      numerical expression has 3994 elements: only the first used

因此,我所需的输出仅限于前Cust_no (12345(,而不适用于其他行。如何确保范围条件应用于所有行?

输出不正确

 >head(newDf)
 row.names  Cust_no    CreditDate   Credit     DebitDate   Debit
     1      12345     2014-10-01    200      2014-10-03    400
     2      12345     2014-10-01    200      2014-10-04    150
     4      33344     2014-10-03    500      2014-10-04    50
     5      33344     2014-10-03    500      2014-10-05    504
     6      33344     2014-10-03    500      2014-10-06    332
     7      33344     2014-10-03    500      2014-10-08    56
     8      66554     2014-10-10    660      2014-10-04    150     
     9      66554     2014-10-10    660      2014-10-05    800
    10      66554     2014-10-10    660      2014-10-11    400
    11      66554     2014-10-10    660      2014-10-12    150
    12      66554     2014-10-10    660      2014-10-13    800

正确的输出

 >head(newDf)
 row.names  Cust_no    CreditDate   Credit     DebitDate   Debit
     1      12345     2014-10-01    200      2014-10-03    400
     2      12345     2014-10-01    200      2014-10-04    150
     4      33344     2014-10-03    500      2014-10-04    50
     5      33344     2014-10-03    500      2014-10-05    504
     6      33344     2014-10-03    500      2014-10-06    332
     7      33344     2014-10-03    500      2014-10-08    56         
    10      66554     2014-10-10    660      2014-10-11    400
    11      66554     2014-10-10    660      2014-10-12    150
    12      66554     2014-10-10    660      2014-10-13    800

尝试

 allTxns[with(allTxns , CreditDate < DebitDate & DebitDate <=FiveDays),]
 #    Cust_no CreditDate   FiveDays Credit  DebitDate Debit
 #1    12345 2014-10-01 2014-10-06    200 2014-10-03   400
 #2    12345 2014-10-01 2014-10-06    200 2014-10-04   150
 #4    33344 2014-10-03 2014-10-08    500 2014-10-04    50
 #5    33344 2014-10-03 2014-10-08    500 2014-10-05   504
 #6    33344 2014-10-03 2014-10-08    500 2014-10-06   332
 #7    33344 2014-10-03 2014-10-08    500 2014-10-08    56
 #10   66554 2014-10-10 2014-10-15    660 2014-10-11   400
 #11   66554 2014-10-10 2014-10-15    660 2014-10-12   150
 #12   66554 2014-10-10 2014-10-15    660 2014-10-13   800

这个老问题已经有了公认的答案。但是,我注意到问题和答案可以简化,因为没有必要创建额外的FiveDays列:

allTxns[with(allTxns, CreditDate <= DebitDate & DebitDate <= CreditDate + 5L), ]
   Cust_no CreditDate Credit  DebitDate Debit
1    12345 2014-10-01    200 2014-10-03   400
2    12345 2014-10-01    200 2014-10-04   150
4    33344 2014-10-03    500 2014-10-04    50
5    33344 2014-10-03    500 2014-10-05   504
6    33344 2014-10-03    500 2014-10-06   332
7    33344 2014-10-03    500 2014-10-08    56
10   66554 2014-10-10    660 2014-10-11   400
11   66554 2014-10-10    660 2014-10-12   150
12   66554 2014-10-10    660 2014-10-13   800

最新更新