How to create an "aging" dashboard with pandas



I have the following DataFrame of items with expiry dates:

Item          Expiry Date   Stock
Voucher 1     1-Mar-2022     3
Voucher 2     31-Apr-2022    2
Voucher 3     1-Feb-2022     1

I want to create an aging dashboard that shows how much stock I have expiring in each month:

           Jan  Feb  Mar  Apr
Voucher 1                3
Voucher 2                     2
Voucher 3           1

Any ideas or guidance on how to do this? I have searched a lot of resources and could not find anything. I am new to building dashboards. Thanks.

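You can extract the month name from the 'Expiry Date' column (NB: one of your dates is invalid, since 31 April does not exist) and then pivot. If needed, reindex the columns with a list of month names: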

from calendar import month_abbr
cols = month_abbr[1:] # first item is empty string
(df.assign(month=df['Expiry Date'].str.extract(r'-(\D+)-', expand=False))
   .pivot(index='Item', columns='month', values='Stock')
   .reindex(columns=cols)
)

If you expect duplicate items, use pivot_table with sum as the aggregation function.

Output:

month      Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
Item                                                                 
Voucher 1  NaN  NaN  3.0  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
Voucher 2  NaN  NaN  NaN  2.0  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
Voucher 3  NaN  1.0  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
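
For the duplicate-items case mentioned above, a minimal sketch of the pivot_table variant might look like this. The data is hypothetical (not from the question): 'Voucher 1' is listed twice in March, and the April date is changed to a valid one.

```python
from calendar import month_abbr

import pandas as pd

# Hypothetical data for illustration: 'Voucher 1' appears twice in March.
df = pd.DataFrame({
    'Item': ['Voucher 1', 'Voucher 1', 'Voucher 2', 'Voucher 3'],
    'Expiry Date': ['1-Mar-2022', '15-Mar-2022', '30-Apr-2022', '1-Feb-2022'],
    'Stock': [3, 4, 2, 1],
})

cols = month_abbr[1:]  # 'Jan' .. 'Dec'; the first item is an empty string

aging = (
    df.assign(month=df['Expiry Date'].str.extract(r'-(\D+)-', expand=False))
      # pivot would raise on the duplicated (Item, month) pair;
      # pivot_table aggregates the duplicates instead.
      .pivot_table(index='Item', columns='month', values='Stock', aggfunc='sum')
      .reindex(columns=cols)
)
print(aging)
```

Here the two March rows for 'Voucher 1' are summed into a single cell, whereas plain pivot would fail on the repeated index/column pair.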

You can try it like this:

import pandas as pd
# Item          Expiry Date   Stock
# Voucher 1     1-Mar-2022     3
# Voucher 2     31-Apr-2022    2
# Voucher 3     1-Feb-2022     1
data = {'Item': ['Voucher 1', 'Voucher 2', 'Voucher 3'],
'Expiry Date': ['1-Mar-2022', '31-Apr-2022', '1-Feb-2022'],
'Stock': [3, 2, 1]}
df = pd.DataFrame(data)
# Using pandas apply method, get the month from each row using axis=1 and store it in new column 'Month'
# https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.apply.html
df['Month'] = df.apply(lambda x: x['Expiry Date'].split('-')[1], axis=1)
# Using pandas pivot method, set 'Item' column as index,
# https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.pivot.html
# set unique values in 'Month' column as separate columns
# set values in 'Stock' column as values for respective month columns
# and using 'rename_axis' method, remove the row name 'Month'
# https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.rename_axis.html
new_df = df.pivot(index='Item', columns='Month', values='Stock').rename_axis(None, axis=1)
# Sort the month column names by converting each one to a pandas Timestamp
# and using that as the key in sorted() over all columns
new_df = new_df[sorted(new_df.columns, key=lambda x: pd.to_datetime(x, format='%b'))]
print(new_df)

This is the output I get:

           Feb  Mar  Apr
Item                    
Voucher 1  NaN  3.0  NaN
Voucher 2  NaN  NaN  2.0
Voucher 3  1.0  NaN  NaN
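
Both approaches read the month straight out of the string, so the impossible date 31-Apr-2022 noted earlier goes through unnoticed. If you want to catch such rows first, one option (a sketch, assuming the dates always follow the day-month-year pattern shown in the question) is to parse them with pd.to_datetime and errors='coerce':

```python
import pandas as pd

df = pd.DataFrame({
    'Item': ['Voucher 1', 'Voucher 2', 'Voucher 3'],
    'Expiry Date': ['1-Mar-2022', '31-Apr-2022', '1-Feb-2022'],
    'Stock': [3, 2, 1],
})

# errors='coerce' turns dates that cannot be parsed (or do not exist) into NaT
# instead of raising an exception.
parsed = pd.to_datetime(df['Expiry Date'], format='%d-%b-%Y', errors='coerce')

# Rows whose expiry date could not be parsed.
invalid = df[parsed.isna()]
print(invalid)
```

With the question's data this flags only the 'Voucher 2' row, which you can then correct or drop before pivoting.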
