Given the DataFrame:
import pandas as pd

# The comments give the formula needed to calculate each column
df = pd.DataFrame({
    'item_tolerance': [230, 115, 155],
    'item_intake': [250, 100, 100],
    'open_items_previous_day': 0,  # df.item_intake.shift() + df.open_items_previous_day.shift() - df.items_shipped.shift() + df.items_over_under_sla.shift()
    'total_items_to_process': 0,  # df.item_intake + df.open_items_previous_day
    'sla_relevant': 0,  # df.item_tolerance if df.open_items_previous_day + df.item_intake > df.item_tolerance else df.open_items_previous_day + df.item_intake
    'items_shipped': [230, 115, 50],
    'items_over_under_sla': 0  # df.items_shipped - df.sla_relevant
})
If each column is calculated with a different operation, I suggest computing them one at a time:
import numpy as np
# open_items_previous_day and items_over_under_sla are still all zeros at this point, so their shifted terms add nothing yet
df['open_items_previous_day'] = (df['item_intake'].shift(fill_value=0) + df['open_items_previous_day'].shift(fill_value=0)
                                 - df['items_shipped'].shift(fill_value=0) + df['items_over_under_sla'].shift(fill_value=0))
df['total_items_to_process'] = df['item_intake'] + df['open_items_previous_day']
# sla_relevant is the workload capped at item_tolerance
df = df.assign(sla_relevant=np.where(df['open_items_previous_day'] + df['item_intake'] > df['item_tolerance'], df['item_tolerance'], df['open_items_previous_day'] + df['item_intake']))
df['items_over_under_sla'] = df['items_shipped'] - df['sla_relevant']
df
Out[1]:
   item_tolerance  item_intake  open_items_previous_day  total_items_to_process  sla_relevant  items_shipped  items_over_under_sla
0             230          250                        0                     250           230            230                     0
1             115          100                       20                     120           115            115                     0
2             155          100                      -15                      85            85             50                   -35
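The problem you are facing is not that you need values from the previous row (you are already handling that with shift). The real problem is that all of the columns you are trying to compute (except total_items_to_process) depend on each other, so you cannot get any of them without already having one of the others (or assuming it is initially zero).
That is why you get different results depending on which column you calculate first.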
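One way to make the calculation order explicit is to resolve the dependency row by row, so that each row uses the already finished values of the row above it. The following is only a minimal sketch, not necessarily what the question intends: it assumes that open_items_previous_day is 0 for the first row, and the names `rows`, `prev` and `result` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'item_tolerance': [230, 115, 155],
    'item_intake': [250, 100, 100],
    'items_shipped': [230, 115, 50],
})

rows = []    # hypothetical accumulator for the finished rows
prev = None  # the previous row, fully calculated

for row in df.itertuples(index=False):
    if prev is None:
        # Assumption: nothing is open before the first day
        open_prev = 0
    else:
        # Same formula as in the question, applied to the finished previous row
        open_prev = (prev['item_intake'] + prev['open_items_previous_day']
                     - prev['items_shipped'] + prev['items_over_under_sla'])
    total = row.item_intake + open_prev
    # sla_relevant is the workload capped at the tolerance
    sla_relevant = min(total, row.item_tolerance)
    prev = {
        'item_tolerance': row.item_tolerance,
        'item_intake': row.item_intake,
        'open_items_previous_day': open_prev,
        'total_items_to_process': total,
        'sla_relevant': sla_relevant,
        'items_shipped': row.items_shipped,
        'items_over_under_sla': row.items_shipped - sla_relevant,
    }
    rows.append(prev)

result = pd.DataFrame(rows)
print(result)
```

With these assumptions the third row differs from the vectorized version above: open_items_previous_day comes out as 5 instead of -15 and items_over_under_sla as -55 instead of -35, because the loop feeds the previous row's open_items_previous_day into the formula, while the single vectorized pass still sees the initial zeros.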