I have a pandas DataFrame:
df = code mapper version country range month value
1 ABC321 ABC Alpha USA High 2021-10 8.0
6 ABC321 ABC Alpha USA High 2021-11 1.0
2 ABC321 PQS Beta IND LOW 2021-10 0.0
3 ABC321 TRR Delta MEX LOW 2021-10 1.0
4 ABC321 TRR Delta MEX LOW 2021-11 3.0
I mask rows based on a condition and set their value to zero:
mask = (
(df.mapper == 'ABC')
& (df.version == 'Alpha')
& (df.country == 'USA')
& (df.range == 'High')
& (df.month == '2021-10')
)
df.value = df.mask(mask, 0.0).value
so that df becomes:
code mapper version country range month value
1 ABC321 ABC Alpha USA High 2021-10 0.0
6 ABC321 ABC Alpha USA High 2021-11 1.0
2 ABC321 PQS Beta IND LOW 2021-10 0.0
3 ABC321 TRR Delta MEX LOW 2021-10 1.0
4 ABC321 TRR Delta MEX LOW 2021-11 3.0
Now I want to add the masked value 8 (the one that was overwritten with 0) to the next month's row:
expected_output = code mapper version country range month value
1 ABC321 ABC Alpha USA High 2021-10 0.0
6 ABC321 ABC Alpha USA High 2021-11 9.0
2 ABC321 PQS Beta IND LOW 2021-10 0.0
3 ABC321 TRR Delta MEX LOW 2021-10 1.0
4 ABC321 TRR Delta MEX LOW 2021-11 3.0
EDIT
There won't be duplicate rows
One idea is to convert the month column to monthly periods, so that matching the previous or next month only requires + 1 or - 1:
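import pandas as pd

# Monthly periods support integer arithmetic: period + 1 is the following month
df['month'] = pd.to_datetime(df['month']).dt.to_period('M')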
mask = (
(df.mapper == 'ABC')
& (df.version == 'Alpha')
& (df.country == 'USA')
& (df.range == 'High')
& (df['month'] == '2021-10')
)
mask1 = (
(df.mapper == 'ABC')
& (df.version == 'Alpha')
& (df.country == 'USA')
& (df.range == 'High')
& (df['month'] - 1 == '2021-10')
)
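# Add the masked value (or 0 if no row matched) to the next month's row first...
df.loc[mask1, 'value'] += next(iter(df.loc[mask, 'value']), 0)
# ...then zero out the masked row
df.loc[mask, 'value'] = 0
print(df)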
code mapper version country range month value
1 ABC321 ABC Alpha USA High 2021-10 0.0
6 ABC321 ABC Alpha USA High 2021-11 9.0
2 ABC321 PQS Beta IND LOW 2021-10 0.0
3 ABC321 TRR Delta MEX LOW 2021-10 1.0
4 ABC321 TRR Delta MEX LOW 2021-11 3.0
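
For repeated use, the same steps could be wrapped in a small function. The following is only a sketch under a few assumptions: the helper name carry_to_next_month and its keys argument are hypothetical (they are not part of pandas), the sample frame is rebuilt by hand from the question, and .sum() stands in for next(iter(...), 0), which gives the same result only because there are no duplicate rows.

```python
import pandas as pd

# Sample data rebuilt by hand from the question (index values included)
df = pd.DataFrame(
    {
        "code": ["ABC321"] * 5,
        "mapper": ["ABC", "ABC", "PQS", "TRR", "TRR"],
        "version": ["Alpha", "Alpha", "Beta", "Delta", "Delta"],
        "country": ["USA", "USA", "IND", "MEX", "MEX"],
        "range": ["High", "High", "LOW", "LOW", "LOW"],
        "month": ["2021-10", "2021-11", "2021-10", "2021-10", "2021-11"],
        "value": [8.0, 1.0, 0.0, 1.0, 3.0],
    },
    index=[1, 6, 2, 3, 4],
)
df["month"] = pd.to_datetime(df["month"]).dt.to_period("M")


def carry_to_next_month(frame, month, keys):
    """Hypothetical helper: zero the row matching `keys` in `month`
    and add its value to the row with the same keys one month later."""
    out = frame.copy()  # leave the caller's frame untouched

    # Rows that match every key column (mapper, version, country, range)
    same_keys = pd.Series(True, index=out.index)
    for column, wanted in keys.items():
        same_keys &= out[column] == wanted

    period = pd.Period(month, freq="M")
    source = same_keys & (out["month"] == period)
    target = same_keys & (out["month"] == period + 1)

    # With no duplicate rows there is at most one source row; sum() is 0.0 if none
    carried = out.loc[source, "value"].sum()
    out.loc[target, "value"] += carried
    out.loc[source, "value"] = 0.0
    return out


result = carry_to_next_month(
    df,
    "2021-10",
    {"mapper": "ABC", "version": "Alpha", "country": "USA", "range": "High"},
)
print(result)
```

Note that if no row exists for the following month, the carried value is simply dropped in this sketch; whether that is acceptable depends on the data.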