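In the example below I want to group bookings by month and by year. I want to control two things: 1. when I print the variable rawpm, only the month and year should be shown; 2. when I print the variable rawpj, only the year should be shown.
My example.csv file: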
Bookingsnumber;Saleprice;Area;Place;Purchase;Bookingsdate
C12015024;1000;Wildkogel Ski Arena;Bramberg am Wildkogel;800;1-1-2019
C12015250;1000;Les Quatre Vallées;La Tzoumaz;800;3-3-2019
C12025062;1000;Les Quatre Vallées;La Tzoumaz;800;7-1-2020
C12025085;1000;Paradiski - Les Arcs;Vallandry;800;9-1-2020
C12025085;1000;Paradiski - Les Arcs;Vallandry;800;9-3-2020
C12025085;1000;Paradiski - Les Arcs;Vallandry;800;12-3-2020
My example code:
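import pandas as pd
# example.csv is semicolon-separated text, so it is read with read_csv and sep=';'
# (read_excel expects a spreadsheet file). dayfirst=True because the dates are day-month-year.
# My attempt to pass format='%Y' / format='%m/%Y' here fails: the reader has no format= argument.
df = pd.read_csv(r'example.csv', sep=';', parse_dates=["Bookingsdate"], dayfirst=True, index_col="Bookingsdate")
#Revenue all websites per year
rawpj = df.Saleprice.resample('Y').sum()
print(rawpj)
ndf = pd.read_csv(r'example.csv', sep=';', parse_dates=["Bookingsdate"], dayfirst=True, index_col="Bookingsdate")
#Revenue all websites per month
rawpm = ndf.Saleprice.resample('M').sum()
print(rawpm)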
Desired result:
Bookingsdate
2019 2000
2020 4000
Freq: A-DEC, Name: Saleprice, dtype: int64
Bookingsdate
2019-01 1000
2019-02 0
2019-03 1000
2019-04 0
2019-05 0
2019-06 0
2019-07 0
2019-08 0
2019-09 0
2019-10 0
2019-11 0
2019-12 0
2020-01 2000
2020-02 0
2020-03 2000
Freq: M, Name: Saleprice, dtype: int64
Thanks in advance, Jeroen
Solved:
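import pandas as pd
# The file is semicolon-separated text, so read_csv is used rather than read_excel
df = pd.read_csv(r'example.csv', sep=';')
# Dates look like 1-1-2019 (day-month-year, hyphen-separated)
df['Bookingsdate'] = pd.to_datetime(df['Bookingsdate'], format='%d-%m-%Y')
#Revenue per Year
df['year'] = df['Bookingsdate'].dt.strftime('%Y')
# Sum only the numeric columns; text and date columns cannot be summed meaningfully
grouped_df = df.groupby('year')[['Saleprice', 'Purchase']].sum()
print(grouped_df)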
And:
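ndf = pd.read_csv(r'example.csv', sep=';')
# An explicit day-month-year format avoids 7-1-2020 being read as July 1st
ndf['Bookingsdate'] = pd.to_datetime(ndf['Bookingsdate'], format='%d-%m-%Y')
#Revenue per Year/Month
# '%Y-%m' keeps the groups in chronological order; '%m/%Y' would sort by month first
ndf['ym'] = ndf['Bookingsdate'].dt.strftime('%Y-%m')
grouped_ndf = ndf.groupby('ym')[['Saleprice', 'Purchase']].sum()
print(grouped_ndf)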
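For anyone who also wants the empty months shown as 0, as in the desired result above, here is a minimal sketch of one possible alternative. It is not the approach I used above: it groups on pandas periods instead of strings. The CSV_TEXT constant and the all_months name are made up for this illustration so that the snippet runs without the file on disk.

```python
import io

import pandas as pd

# Inline copy of example.csv so the sketch is self-contained (illustration only).
CSV_TEXT = """Bookingsnumber;Saleprice;Area;Place;Purchase;Bookingsdate
C12015024;1000;Wildkogel Ski Arena;Bramberg am Wildkogel;800;1-1-2019
C12015250;1000;Les Quatre Vallées;La Tzoumaz;800;3-3-2019
C12025062;1000;Les Quatre Vallées;La Tzoumaz;800;7-1-2020
C12025085;1000;Paradiski - Les Arcs;Vallandry;800;9-1-2020
C12025085;1000;Paradiski - Les Arcs;Vallandry;800;9-3-2020
C12025085;1000;Paradiski - Les Arcs;Vallandry;800;12-3-2020
"""

df = pd.read_csv(io.StringIO(CSV_TEXT), sep=";")
# Dates are day-month-year, separated by hyphens.
df["Bookingsdate"] = pd.to_datetime(df["Bookingsdate"], format="%d-%m-%Y")

# Revenue per year: a yearly period prints as just the year, e.g. 2019.
rawpj = df.groupby(df["Bookingsdate"].dt.to_period("Y"))["Saleprice"].sum()

# Revenue per month: a monthly period prints as year-month, e.g. 2019-01.
rawpm = df.groupby(df["Bookingsdate"].dt.to_period("M"))["Saleprice"].sum()

# groupby only returns months that have bookings, so reindex over the full
# month range and fill the missing months with 0.
all_months = pd.period_range(rawpm.index.min(), rawpm.index.max(), freq="M")
rawpm = rawpm.reindex(all_months, fill_value=0).rename_axis("Bookingsdate")

print(rawpj)
print(rawpm)
```

With the sample data this gives two yearly rows and fifteen monthly rows (2019-01 through 2020-03). If you would rather keep resample, converting the resulting index with to_period should give a similar display, but I have not verified that on every pandas version.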