Combining multiple rules across multiple columns does not give the expected result



I have a dataframe like the one below:

PayeeId transactionId   createdAt                          Amount   Max_90D
100AA       60a23a1     2021-07-24 15:02:25.428000+05:30    5000      12000
100AA       60a23b1d    2021-07-24 20:37:04.351000+05:30    6650      12000 
100AA       60b4b69     2021-07-24 15:02:25.428000+05:30    3334      12000
100AA       6098eb81    2021-07-24 23:30:25.428000+05:30    1000      12000

I am checking the following conditions:

1. If any transaction amount ('Amount') is less than 5000, then df['Rule_No'] = 0
2. If condition 1 is not satisfied (as in this case), then
2.a - Check which transaction times ('createdAt') are later than 23:00 hrs or earlier than 08:00 hrs. If any are found, then df['Rule_No'] = 6
2.b - If 2.a is not satisfied, check whether any transaction amount is > 1.5 times Max_90D.
2.c - If yes, then df['Rule_No'] = 6, else df['Rule_No'] = 0

So the final dataframe should look like this:

PayeeId transactionId   createdAt                          Amount   Max_90D   Rule_No
100AA       60a23a1     2021-07-24 15:02:25.428000+05:30    5000      12000     0
100AA       60a23b1d    2021-07-24 20:37:04.351000+05:30    6650      12000     0
100AA       60b4b69     2021-07-24 15:02:25.428000+05:30    3334      12000     0
100AA       6098eb81    2021-07-24 23:30:25.428000+05:30    1000      12000     6

To do this, I am using the following code:

if df['Amount'].any() < 50:
    df['Rule_No'] = 0
else:
    df['Rule_No'] = np.where((df['createdAt'].dt.strftime('%H:%M')<'08:00')|
                             (df['createdAt'].dt.strftime('%H:%M')>'23:00')
                             |(df['Amount'] > 1.5 * df['90D_Max']),6,0)

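But it does not behave as expected. Instead, I get the original dataframe df back with Rule_No set to 0 for all 4 rows. In other words, it never enters the else block.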
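A minimal sketch of why the else block is never reached, using only the Amount values from the sample data. The variable names (`amounts`, `collapsed`, `always_true`, `any_below_50`) are illustrative and not part of the original code. `.any()` is evaluated first and reduces the whole column to a single boolean, so the comparison `< 50` is made against `True` or `False`, not against the amounts.

```python
import pandas as pd

# Amount column from the sample dataframe in the question
amounts = pd.Series([5000, 6650, 3334, 1000], name="Amount")

# .any() runs first and collapses the whole column to one boolean:
# True here, because at least one value is non-zero.
collapsed = amounts.any()

# True < 50 is evaluated as 1 < 50, so the `if` branch is always taken.
always_true = bool(collapsed < 50)

# Comparing element-wise first and reducing afterwards asks the intended
# question: "is any single amount below 50?"
any_below_50 = bool((amounts < 50).any())

print(always_true, any_below_50)
```

Because a boolean is at most 1, `df['Amount'].any() < 50` holds whatever the amounts are, which matches the observed result of Rule_No being 0 on every row.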

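# Compare element-wise first, then reduce with .any();
# `.any() < 0` would compare a single boolean with 0.
if (df['Amount'] < 0).any():
    df['Rule_No'] = 0
else:
    if (df['createdAt'].dt.strftime('%H:%M') < '08:00').any():
        df['Rule_No'] = 6
    else:
        res_df = df
        # axis=1 applies the lambda to each row rather than to each column
        res_df['condition'] = df.apply(lambda x: 1 if x['Amount'] >= 1.5 * x['Max_90D'] else 0, axis=1)
        if (res_df['condition'] == 1).any():
            df['Rule_No'] = 6
        else:
            df['Rule_No'] = 0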

Sorry about the previous answer, I hope this one helps you ;)
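For comparison, here is a hedged sketch of a row-wise variant that reproduces the expected output shown in the question. It is not the answerer's method. It assumes that the first rule's threshold is 50, as in the question's code (the written rule says 5000, which would not produce the expected output for this sample), that createdAt is already a datetime column, and that "night" means from 23:00 up to but not including 08:00. `MIN_AMOUNT`, `night` and `large` are illustrative names.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "PayeeId": ["100AA"] * 4,
    "transactionId": ["60a23a1", "60a23b1d", "60b4b69", "6098eb81"],
    "createdAt": pd.to_datetime([
        "2021-07-24 15:02:25.428000+05:30",
        "2021-07-24 20:37:04.351000+05:30",
        "2021-07-24 15:02:25.428000+05:30",
        "2021-07-24 23:30:25.428000+05:30",
    ]),
    "Amount": [5000, 6650, 3334, 1000],
    "Max_90D": [12000] * 4,
})

MIN_AMOUNT = 50  # assumption: threshold taken from the question's code

if (df["Amount"] < MIN_AMOUNT).any():
    # Rule 1: at least one amount is below the threshold
    df["Rule_No"] = 0
else:
    hour = df["createdAt"].dt.hour            # local hour in the column's own UTC offset
    night = (hour >= 23) | (hour < 8)         # rule 2.a, evaluated per row
    large = df["Amount"] > 1.5 * df["Max_90D"]  # rule 2.b, evaluated per row
    df["Rule_No"] = np.where(night | large, 6, 0)

print(df["Rule_No"].tolist())
```

Unlike the nested if/else above, which assigns a single value to the whole column, `np.where` flags each row independently, so only the 23:30 transaction receives Rule_No 6.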
