I have a DataFrame of transactions, and I want to group the data.
buy_date   category     subcategory  product        actual_price  sell_price
1/1/2021   Cloth        women        shirt style A  5             4
1/1/2021   Cloth        men          skirt style A  7             6.5
1/1/2021   Accessories  ear          sky wing       2             1
2/1/2021   Automotive   wheel        small          21            18
2/1/2021   Automotive   wheel        big            34            30
1/14/2021  Accessories  ring         queen couple   3             3
1/17/2021  Cloth        women        shirt style B  7             7
1/17/2021  Cloth        men          skirt style A  7             6.5
4/2/2021   Cloth        men          skirt style A  10            9
5/2/2021   Accessories  ring         queen couple   3             2.5
7/2/2021   Cloth        women        shirt style B  16            12
7/2/2021   Automotive   wheel        big            40            35
2/26/2021  Accessories  ring         queen couple   4             4
2/26/2021  Cloth        women        shirt style B  9             5
2/26/2021  Cloth        men          skirt style A  7             9
2/28/2021  Accessories  ear          sky wing       2             1
1/3/2021   Automotive   wheel        big            38            35
1/3/2021   Accessories  ring         queen couple   4             4
7/3/2021   Automotive   wheel        big            39            37
3/31/2021  Accessories  ring         queen couple   4             4
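I want to get the monthly average of the sell price and the actual price for each category and subcategory. I have tried many approaches, but none of them quite worked.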
Use pd.Grouper and aggregate with mean:
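df['buy_date'] = pd.to_datetime(df['buy_date'])
# freq='M' groups by calendar month (spelled 'ME' in pandas 2.2 and later); only the price columns are averaged
df.groupby([pd.Grouper(freq='M', key='buy_date'), 'category', 'subcategory'])[['actual_price', 'sell_price']].mean()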
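As a self-contained sketch of the Grouper approach, the snippet below rebuilds a few rows from the question and computes the monthly means. It assumes the dates are written month/day/year, and it swaps in the MS (month start) alias instead of M, only because MS is accepted unchanged across pandas versions; each group is then labelled with the first day of its month. The variable name monthly is illustrative.

```python
import pandas as pd

# A few rows copied from the question; dates are ASSUMED to be month/day/year.
df = pd.DataFrame(
    {
        "buy_date": ["1/1/2021", "1/1/2021", "1/17/2021", "1/17/2021",
                     "2/1/2021", "2/1/2021", "2/26/2021"],
        "category": ["Cloth", "Cloth", "Cloth", "Cloth",
                     "Automotive", "Automotive", "Cloth"],
        "subcategory": ["women", "men", "women", "men",
                        "wheel", "wheel", "women"],
        "product": ["shirt style A", "skirt style A", "shirt style B",
                    "skirt style A", "small", "big", "shirt style B"],
        "actual_price": [5, 7, 7, 7, 21, 34, 9],
        "sell_price": [4, 6.5, 7, 6.5, 18, 30, 5],
    }
)
df["buy_date"] = pd.to_datetime(df["buy_date"], format="%m/%d/%Y")

# Bucket rows by calendar month, then by category and subcategory,
# and average only the two numeric price columns.
monthly = (
    df.groupby([pd.Grouper(key="buy_date", freq="MS"), "category", "subcategory"])
    [["actual_price", "sell_price"]]
    .mean()
)
print(monthly)
```

Selecting the two price columns before calling mean keeps the text column product out of the aggregation, which recent pandas versions would otherwise reject.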
Just do:
>>> df.groupby(['category', 'subcategory'])[['actual_price', 'sell_price']].mean()
                         actual_price  sell_price
category    subcategory
Accessories ear                  2.00        1.00
            ring                 3.60        3.50
Automotive  wheel               34.40       31.00
Cloth       men                  7.75        7.75
            women                9.25        7.00
You can extract the month and then group by the three required columns. Try this:
df['month'] = pd.to_datetime(df['buy_date']).dt.month
df.groupby(['month', 'category', 'subcategory'])[['actual_price', 'sell_price']].mean()
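A minimal runnable sketch of the month-column approach follows, using the four "ring" rows from the question that fall in different months, again assuming month/day/year dates. The names by_month and by_period are illustrative. The second grouping is a variation, not part of the answer above: dt.month keeps only the month number, so January 2021 and January 2022 would share a group, whereas dt.to_period("M") keeps the year as well.

```python
import pandas as pd

# "ring" rows copied from the question; dates are ASSUMED to be month/day/year.
df = pd.DataFrame(
    {
        "buy_date": ["1/14/2021", "1/3/2021", "2/26/2021", "3/31/2021"],
        "category": ["Accessories"] * 4,
        "subcategory": ["ring"] * 4,
        "actual_price": [3, 4, 4, 4],
        "sell_price": [3, 4, 4, 4],
    }
)
dates = pd.to_datetime(df["buy_date"], format="%m/%d/%Y")

# Month number only (1-12), as in the answer above.
df["month"] = dates.dt.month
by_month = df.groupby(["month", "category", "subcategory"], as_index=False)[
    ["actual_price", "sell_price"]
].mean()

# Year-aware variation: Period values such as 2021-01.
df["period"] = dates.dt.to_period("M")
by_period = df.groupby(["period", "category", "subcategory"])[
    ["actual_price", "sell_price"]
].mean()

print(by_month)
print(by_period)
```

For data confined to a single year, as in the question, both groupings give the same averages; the difference only matters once the data spans several years.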