Pandas fillna with a custom lambda function - output not displayed correctly

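I have two DataFrames. The first, grouper, contains the average monthly sales quantity per product. Most products have values for all 12 months because they have been on sale for more than a year. Products that have been on sale for less than a year, however, do not have a value for every month. Example: grouper[grouper['Product'] == 'IT984359570']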

Product       Month   Sales Quantity [QTY]
4190    IT984359570   4       35.0
4191    IT984359570   5       208.0
4192    IT984359570   6       208.0
4193    IT984359570   7       233.0
4194    IT984359570   8       191.0

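The second DataFrame, pivot_table, is a pivot table that shows the cumulative sum of sales per product. It also takes new orders into account (hence the positive numbers in some cells). pivot_table[pivot_table['Product'] == 'IT984359570'] returns: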

Date    Product     2022-05-01  2022-06-01  2022-07-01  2022-08-01  2022-09-01  2022-10-01  2022-11-01
412     IT984359570 -208.0     -416.0       -649.0      -840.0      2019.0      NaN         NaN

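I want to avoid dropping all rows that contain NaN values. Instead, I want to fill every NaN with the mean of all grouper entries for the corresponding product. For product IT984359570: fill all NaN values in row 412 with 175, since (35+208+208+233+191)/5 = 175.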
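As a quick check of the arithmetic above, here is a minimal sketch that rebuilds the sample rows of grouper by hand (the real frame has more products) and computes the per-product mean with groupby. The name product_means is only illustrative.

```python
import pandas as pd

# Sample rows copied from the grouper excerpt above; the real frame is larger.
grouper = pd.DataFrame({
    "Product": ["IT984359570"] * 5,
    "Month": [4, 5, 6, 7, 8],
    "Sales Quantity [QTY]": [35.0, 208.0, 208.0, 233.0, 191.0],
})

# One mean per product, indexed by the product code.
product_means = grouper.groupby("Product")["Sales Quantity [QTY]"].mean()
print(product_means["IT984359570"])
```

A Series indexed by product like this can later be looked up or mapped onto the Product column of the pivot table.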

I tried to do this with the following code:

pivot_table = pivot_table.fillna(lambda row: grouper.loc[grouper['Product'] == row['Product'], 'Sales Quantity [QTY]'].mean())

However, I do not get the desired output. My output:

Date    Product      2022-05-01  2022-06-01  2022-07-01  2022-08-01  2022-09-01  2022-10-01   2022-11-01    
412     IT984359570  -208.0      -416.0      -649.0      -840.0      2019.0      <function <lambda> at 0x0000023221232320>   <function <lambda> at 0x0000023221232320>

What am I doing wrong?

Edit:

pivot_table uses .cumsum(), so the desired output looks like this:

Date    Product     2022-05-01  2022-06-01  2022-07-01  2022-08-01  2022-09-01  2022-10-01  2022-11-01
412     IT984359570 -208.0     -416.0       -649.0      -840.0      2019.0      1844.0      1669.0
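One way to read these numbers, sketched under assumptions: the monthly changes below are inferred from the cumulative row (the September value 2859 is back-calculated and assumed to include a new order), and the two missing months are assumed to enter as -175, because the desired output drops by 175 per month.

```python
import pandas as pd

# Monthly changes inferred from the cumulative row above (assumption);
# the last two entries are the negated mean used as the fill value.
monthly = pd.Series(
    [-208.0, -208.0, -233.0, -191.0, 2859.0, -175.0, -175.0],
    index=["2022-05-01", "2022-06-01", "2022-07-01", "2022-08-01",
           "2022-09-01", "2022-10-01", "2022-11-01"],
)

# The running total reproduces the desired row, including the filled months.
cumulative = monthly.cumsum()
print(cumulative)
```

This suggests the fill has to happen on the monthly values before the cumulative sum is taken, not on the already cumulated table.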

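Building on the contributions above, I think adding the axis parameter will complete the code. fillna does not call a function passed to it; it inserts the function object itself as the fill value, which is what your output shows. Using apply with axis=1 evaluates the lambda once per row instead. Hope this helps.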

pivot_table.apply(lambda x: x.fillna(grouper.loc[grouper['Product'] == x['Product'], 'Sales Quantity [QTY]'].mean()), axis=1)

To get the desired output:

Date    Product     2022-05-01  2022-06-01  2022-07-01  2022-08-01  2022-09-01  2022-10-01  2022-11-01
412     IT984359570 -208.0     -416.0       -649.0      -840.0      2019.0      1844.0      1669.0

I used:

pivot_table = (
    final_output.groupby(['Product', 'Date'])['Quantity'].sum().reset_index()
    .pivot_table(index=['Product'], columns='Date', values='Quantity').reset_index()
)

pivot_table = pivot_table.apply(lambda x: x.fillna(grouper.loc[grouper['Product'] == x['Product'], 'Sales Quantity [QTY]'].mean()), axis=1)
pivot_table.loc[:, ~pivot_table.columns.isin(['Product'])] = pivot_table.loc[:, ~pivot_table.columns.isin(['Product'])].cumsum(axis=1)

Although it works, I am not sure whether this is the most Pythonic way. If you know how to achieve this with less code, please suggest it.
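A possible alternative that avoids the row-wise lambda, offered as a sketch rather than a definitive answer: compute the mean per product once, map it onto the Product column, fill the date columns, then take the cumulative sum. The frame monthly below is a hypothetical stand-in for the pivot before .cumsum(), with values inferred from the example row. Negating the mean is my assumption, based on the desired output dropping by 175 per month; drop the minus sign if your data is stored differently.

```python
import pandas as pd

# Sample rows copied from the grouper excerpt in the question.
grouper = pd.DataFrame({
    "Product": ["IT984359570"] * 5,
    "Month": [4, 5, 6, 7, 8],
    "Sales Quantity [QTY]": [35.0, 208.0, 208.0, 233.0, 191.0],
})

# Hypothetical pivot *before* cumsum (values inferred from the example row).
dates = ["2022-05-01", "2022-06-01", "2022-07-01", "2022-08-01",
         "2022-09-01", "2022-10-01", "2022-11-01"]
nan = float("nan")
monthly = pd.DataFrame(
    [["IT984359570", -208.0, -208.0, -233.0, -191.0, 2859.0, nan, nan]],
    columns=["Product"] + dates,
    index=[412],
)

# Mean per product, then one fill value per pivot row (aligned on the row index).
means = grouper.groupby("Product")["Sales Quantity [QTY]"].mean()
fill = -monthly["Product"].map(means)  # sign flip is an assumption, see above

# Fill each date column from the row-aligned Series, then accumulate across dates.
date_cols = [c for c in monthly.columns if c != "Product"]
filled = monthly[date_cols].apply(lambda col: col.fillna(fill))
result = filled.cumsum(axis=1)
result.insert(0, "Product", monthly["Product"])
print(result)
```

The groupby runs once for the whole table instead of filtering grouper for every row, which should matter more as the number of products grows.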
