I have a pandas DataFrame, df. It contains information about various products and whether each product had sales in a given month.
product 2021-01-01 00:00:00 2021-02-01 00:00:00 2021-03-01 00:00:00 2021-04-01 00:00:00 2021-05-01 00:00:00 2021-06-01 00:00:00 2021-07-01 00:00:00 2021-08-01 00:00:00 2021-09-01 00:00:00 2021-10-01 00:00:00 2021-11-01 00:00:00 2021-12-01 00:00:00 2022-01-01 00:00:00 2022-02-01 00:00:00 2022-03-01 00:00:00 2022-04-01 00:00:00 2022-05-01 00:00:00 2022-06-01 00:00:00 2022-07-01 00:00:00 2022-08-01 00:00:00
2 C 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0
3 D 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0
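I want to return a list of the last sale date for each product. For example, for product "C" it would be '2021-10-01', and for product "D" it would be '2022-02-01'.
I ran a for loop, but it returns every date where val > 0 (as expected). How can I adjust the loop so that it returns only the last date?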
for col in df.iloc[:, 1:].columns:
    for val in df[col]:
        if val > 0:
            print(col)
Don't loop. Reshape with stack and use groupby.max to get the latest date on which each product was sold:
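# Index by product and name the column axis so the stacked level is called 'date'
tmp = df.set_index('product').rename_axis(columns='date')
# Mask everything that is not 1, reshape to long form, then take the max date per product
out = (tmp[tmp.eq(1)].stack().reset_index('date')
       .groupby('product')['date'].max()
      )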
Output:
product
C 2021-09-01 00:00:00
D 2022-01-01 00:00:00
Name: date, dtype: object
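For anyone who wants to try this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question (the `months` variable and the construction of `df` are my assumptions about how the data was created) and applies the same stack plus groupby.max idea. The explicit `dropna()` is an addition, not part of the snippet above: it is there so the result does not depend on whether `stack()` drops the masked cells by default in your pandas version.

```python
import pandas as pd

# Assumed reconstruction of the question's data: 20 month-start columns.
months = pd.date_range("2021-01-01", periods=20, freq="MS")
df = pd.DataFrame(
    [["C"] + [1] * 9 + [0] * 11,
     ["D"] + [1] * 13 + [0] * 7],
    columns=["product", *months],
    index=[2, 3],
)

tmp = df.set_index("product").rename_axis(columns="date")

# Keep only the cells equal to 1 (everything else becomes NaN),
# reshape to long form, and drop the NaN cells explicitly.
sold = tmp.where(tmp.eq(1)).stack().dropna()

# Latest remaining date per product.
last_sale = sold.reset_index("date").groupby("product")["date"].max()
print(last_sale)
```

Note that a product with no sales at all would have every cell masked and so would simply be absent from `last_sale`.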