我有数据帧,我想计算最近三个月的交易计数和每个组Id的总和



输入数据集

<表类> Id 日期 TransAmt tbody><<tr>2022-01-02102022-01-02202022-02-04302022-02-05202022-04-083002022-04-111002022-05-132002022-06-12202022-06-153002022-08-16100

示例

data = [['A', '2022-01-02', 10], ['A', '2022-01-02', 20], ['A', '2022-02-04', 30], 
['A', '2022-02-05', 20], ['A', '2022-04-08', 300], ['A', '2022-04-11', 100],
['A', '2022-05-13', 200], ['A', '2022-06-12', 20], ['A', '2022-06-15', 300],
['A', '2022-08-16', 100], ['B', '2022-01-02', 10], ['B', '2022-01-02', 20],
['B', '2022-02-04', 30], ['B', '2022-02-05', 20], ['B', '2022-04-08', 300],
['B', '2022-04-11', 100], ['B', '2022-05-13', 200], ['B', '2022-06-12', 20],
['B', '2022-06-15', 300], ['B', '2022-08-16', 100]]
df1 = pd.DataFrame(data, columns=['Id', 'Date', 'TransAmt'])

df1

Id  Date    TransAmt
0   A   2022-01-02  10
1   A   2022-01-02  20
2   A   2022-02-04  30
3   A   2022-02-05  20
4   A   2022-04-08  300
5   A   2022-04-11  100
6   A   2022-05-13  200
7   A   2022-06-12  20
8   A   2022-06-15  300
9   A   2022-08-16  100
10  B   2022-01-02  10
11  B   2022-01-02  20
12  B   2022-02-04  30
13  B   2022-02-05  20
14  B   2022-04-08  300
15  B   2022-04-11  100
16  B   2022-05-13  200
17  B   2022-06-12  20
18  B   2022-06-15  300
19  B   2022-08-16  100
代码

s = df1['Date']
df1['Date'] = df1['Date'].astype('Period[M]') 
df2 = df1.groupby(['Id', 'Date'])['TransAmt'].agg(['count', sum])
idx1 = pd.period_range(df1['Date'].min(), df1['Date'].max(), freq='M')
idx2 = pd.MultiIndex.from_product([df1['Id'].unique(), idx1])
cols = ['Id', 'Date', 'CountThreeMonth', 'AmountofThreeMonth']
n = 3
df3 = df2.reindex(idx2, fill_value=0).groupby(level=0).rolling(n, min_periods=1).sum().droplevel(0).reset_index().set_axis(cols, axis=1)
df1.merge(df3, how='left').assign(Date=s)

result(df1.merge(df3, how='left').assign(Date=s))

Id  Date    TransAmt    CountThreeMonth AmountofThreeMonth
0   A   2022-01-02  10      2.0             30.0
1   A   2022-01-02  20      2.0             30.0
2   A   2022-02-04  30      4.0             80.0
3   A   2022-02-05  20      4.0             80.0
4   A   2022-04-08  300     4.0             450.0
5   A   2022-04-11  100     4.0             450.0
6   A   2022-05-13  200     3.0             600.0
7   A   2022-06-12  20      5.0             920.0
8   A   2022-06-15  300     5.0             920.0
9   A   2022-08-16  100     3.0             420.0
10  B   2022-01-02  10      2.0             30.0
11  B   2022-01-02  20      2.0             30.0
12  B   2022-02-04  30      4.0             80.0
13  B   2022-02-05  20      4.0             80.0
14  B   2022-04-08  300     4.0             450.0
15  B   2022-04-11  100     4.0             450.0
16  B   2022-05-13  200     3.0             600.0
17  B   2022-06-12  20      5.0             920.0
18  B   2022-06-15  300     5.0             920.0
19  B   2022-08-16  100     3.0             420.0

对不起,这很难解释

相关内容

  • 没有找到相关文章

最新更新