How do I group a DataFrame with groupby in Python?



I have a DataFrame of transactions, and I want to group the data.

buy_date            category    subcategory product         actual_price    sell_price
1/1/2021            Cloth       women       shirt style A   5               4
1/1/2021            Cloth       men         skirt style A   7               6.5
1/1/2021            Accessories ear         sky wing        2               1
2/1/2021            Automotive  wheel       small           21              18
2/1/2021            Automotive  wheel       big             34              30
1/14/2021           Accessories ring        queen couple    3               3
1/17/2021           Cloth       women       shirt style B   7               7
1/17/2021           Cloth       men         skirt style A   7               6.5
4/2/2021            Cloth       men         skirt style A   10              9
5/2/2021            Accessories ring        queen couple    3               2.5
7/2/2021            Cloth       women       shirt style B   16              12
7/2/2021            Automotive  wheel       big             40              35
2/26/2021           Accessories ring        queen couple    4               4
2/26/2021           Cloth       women       shirt style B   9               5
2/26/2021           Cloth       men         skirt style A   7               9
2/28/2021           Accessories ear         sky wing        2               1
1/3/2021            Automotive  wheel       big             38              35
1/3/2021            Accessories ring        queen couple    4               4
7/3/2021            Automotive  wheel       big             39              37
3/31/2021           Accessories ring        queen couple    4               4

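I want to get the monthly average sell price and actual price for each category and subcategory. I have tried several approaches, but none of them worked properly. Thanks.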

Use Grouper and aggregate with mean:

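import pandas as pd
df['buy_date'] = pd.to_datetime(df['buy_date'])
# Average only the numeric price columns; 'product' is text and cannot be averaged
df.groupby([pd.Grouper(freq='M', key='buy_date'), 'category', 'subcategory'])[['actual_price', 'sell_price']].mean()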
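The Grouper approach above can be sketched end to end as follows. This is a minimal example using four rows taken from the question's table, and it assumes the dates are month-first (`%m/%d/%Y`), which the question does not confirm. The month-end offset object is passed instead of the `'M'` string because newer pandas releases renamed that alias to `'ME'`; the result name `monthly` is only illustrative.

```python
import pandas as pd

# Four rows from the question's table.
# Assumption: dates are month/day/year.
df = pd.DataFrame({
    "buy_date": ["1/1/2021", "1/17/2021", "2/26/2021", "2/26/2021"],
    "category": ["Cloth", "Cloth", "Cloth", "Accessories"],
    "subcategory": ["women", "women", "women", "ring"],
    "product": ["shirt style A", "shirt style B", "shirt style B", "queen couple"],
    "actual_price": [5, 7, 9, 4],
    "sell_price": [4, 7, 5, 4],
})
df["buy_date"] = pd.to_datetime(df["buy_date"], format="%m/%d/%Y")

# Group by calendar month (labelled with the month-end date), category and
# subcategory, then average only the two price columns.
monthly = (
    df.groupby([pd.Grouper(key="buy_date", freq=pd.offsets.MonthEnd()),
                "category", "subcategory"])[["actual_price", "sell_price"]]
      .mean()
)
print(monthly)
```

Each row of `monthly` is indexed by (month end, category, subcategory), so a single group can be looked up with `monthly.loc[(pd.Timestamp("2021-01-31"), "Cloth", "women")]`.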

Just do:

>>> df.groupby(['category', 'subcategory']).mean(numeric_only=True)
                         actual_price  sell_price
category    subcategory                          
Accessories ear                  2.00        1.00
            ring                 3.60        3.50
Automotive  wheel               34.40       31.00
Cloth       men                  7.75        7.75
            women                9.25        7.00
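Note that this answer averages over all dates rather than per month. As a small sketch of the same idea, the snippet below uses only the Accessories rows from the question and passes `as_index=False` so that the group keys come back as ordinary columns; the variable name `overall` is illustrative.

```python
import pandas as pd

# The Accessories rows from the question's table (dates omitted because
# this grouping ignores them).
df = pd.DataFrame({
    "category": ["Accessories"] * 7,
    "subcategory": ["ear", "ring", "ring", "ring", "ear", "ring", "ring"],
    "product": ["sky wing", "queen couple", "queen couple", "queen couple",
                "sky wing", "queen couple", "queen couple"],
    "actual_price": [2, 3, 3, 4, 2, 4, 4],
    "sell_price": [1, 3, 2.5, 4, 1, 4, 4],
})

# as_index=False keeps category/subcategory as columns instead of a MultiIndex;
# selecting the price columns avoids trying to average the text column.
overall = (
    df.groupby(["category", "subcategory"], as_index=False)[
        ["actual_price", "sell_price"]
    ].mean()
)
print(overall)
```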

You can extract the month and then group by the three required columns.

Try this:

df['month'] = pd.to_datetime(df['buy_date']).dt.month
# numeric_only=True skips text columns such as 'product'
df.groupby(['month', 'category', 'subcategory']).mean(numeric_only=True)
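One caveat with `.dt.month` is that it keeps only the month number, so the same month in different years falls into one group. The sketch below contrasts it with `.dt.to_period("M")`, which keeps the year. The 2022 row is hypothetical and was added only to show the difference; it is not part of the question's data.

```python
import pandas as pd

df = pd.DataFrame({
    # The last date is a hypothetical row from a different year.
    "buy_date": pd.to_datetime(["2021-01-01", "2021-01-17", "2022-01-05"]),
    "category": ["Cloth", "Cloth", "Cloth"],
    "subcategory": ["men", "men", "men"],
    "actual_price": [7, 7, 10],
    "sell_price": [6.5, 6.5, 9.0],
})
prices = ["actual_price", "sell_price"]

# Month number only: January 2021 and January 2022 are merged.
by_month = df.groupby(
    [df["buy_date"].dt.month.rename("month"), "category", "subcategory"]
)[prices].mean()

# Year-month period: each January stays in its own group.
by_period = df.groupby(
    [df["buy_date"].dt.to_period("M").rename("month"), "category", "subcategory"]
)[prices].mean()

print(by_month)
print(by_period)
```

For data that covers a single year, as in the question, both groupings give the same groups.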
