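I'm new to Python and am trying to calculate the percentage difference between two numbers in the Units column for two different dates, and save the result as the value in a new column (My_Calculation_Result). This value should appear only on the rows with the latest date.
((Units[Date 2020-02-01] - Units[Date 2020-01-25]) / Units[Date 2020-01-25]) * 100%
My initial CSV file structure: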
Date, ID, Name, Units,
2020-02-01, 123, Guitar, 200,
2020-02-01, 456, Drums, 150,
2020-02-01, 789, Piano, 340,
2020-01-25, 123, Guitar, 980,
2020-01-25, 456, Drums, 3,
2020-01-25, 789, Piano, 300,
Desired CSV output: in the output file, I only need the calculation result added to the rows with the latest date.
Date, ID, Name, Units, My_Calculation_Result
2020-02-01, 123, Guitar, 200, -79.59%
2020-02-01, 456, Drums, 150, 49.00%
2020-02-01, 789, Piano, 340, 11.76%
2020-01-25, 123, Guitar, 980,
2020-01-25, 456, Drums, 3,
2020-01-25, 789, Piano, 300,
Thanks in advance for any help with this!
IIUC:
df['My_Cal_Result'] = df.groupby(['ID']).Units.pct_change(-1)
Output:
Date ID Name Units My_Cal_Result
0 2020-02-01 123 Guitar 200 -0.795918
1 2020-02-01 456 Drums 150 49.000000
2 2020-02-01 789 Piano 340 0.133333
3 2020-01-25 123 Guitar 980 NaN
4 2020-01-25 456 Drums 3 NaN
5 2020-01-25 789 Piano 300 NaN
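Note that pct_change returns fractions (-0.795918 means -79.59%), not formatted percentages. A fuller sketch of the same idea, from reading the data to producing the percentage strings, might look like the following. It assumes the trailing commas are dropped from the CSV, loads the sample rows from an in-memory string instead of a file, and uses the column name My_Calculation_Result from the question. Applying the formula from the question literally, Drums comes out at 4900.00% and Piano at 13.33%, which differs from the 49.00% and 11.76% listed in the desired output.

```python
import io

import pandas as pd

# Sample rows from the question (trailing commas removed; this is an assumption
# about how the real file would be cleaned up before loading).
csv_text = """Date,ID,Name,Units
2020-02-01,123,Guitar,200
2020-02-01,456,Drums,150
2020-02-01,789,Piano,340
2020-01-25,123,Guitar,980
2020-01-25,456,Drums,3
2020-01-25,789,Piano,300
"""

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

# Newest date first, so that within each ID the next row is the older one.
df = df.sort_values(["Date", "ID"], ascending=[False, True]).reset_index(drop=True)

# pct_change(-1) compares each row with the following row of the same ID:
# (Units_new - Units_old) / Units_old. Multiply by 100 to get percent.
pct = df.groupby("ID")["Units"].pct_change(-1) * 100

# Keep the result only on rows carrying the latest date, then format it.
is_latest = df["Date"] == df["Date"].max()
df["My_Calculation_Result"] = pct.where(is_latest).map(
    lambda v: "" if pd.isna(v) else f"{v:.2f}%"
)

# to_csv without a path returns the CSV text; pass a file name to write a file.
print(df.to_csv(index=False))
```

The explicit sort is there because pct_change(-1) depends on row order; with the file already ordered newest first, as in the question, it changes nothing.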