Multiplying columns and replacing the original columns in a pandas DataFrame



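I am trying to find a way to write newly computed columns back into a DataFrame after a multiplication. I want the results to replace the values under the original columns such as '2018', '2019' and '2020'. Is there a way to do this together with the multiplication, especially when I have a long list of columns to multiply?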

import pandas as pd

df1 = pd.DataFrame({
    'ID':   ['a1', 'b1', 'c1'],
    '2018': [1, 5, 9],
    '2019': [2, 6, 10],
    '2020': [3, 7, 11]})
df2 = pd.DataFrame({
    'ID':         ['a1', 'b1'],
    'percentage': [0.6, 0.4]})

# Current attempt: returns the products but does not write them back into df1
df1.filter(regex='2018|2019|2020').multiply(df2["percentage"], axis="index")

Expected:
   ID  2018  2019  2020
0  a1   0.6   1.2   1.8
1  b1   2.0   2.4   2.8
2  c1   NaN   NaN   NaN

You can make the rows align on ID by converting both ID columns to the index, and then multiply all of the remaining columns:

df = df1.set_index('ID').multiply(df2.set_index('ID')["percentage"], axis="index")
print(df)
    2018  2019  2020
ID
a1   0.6   1.2   1.8
b1   2.0   2.4   2.8
c1   NaN   NaN   NaN

# Same operation with a df2 that skips b1, to show that rows are matched by ID
df2 = pd.DataFrame({
    'ID':         ['a1', 'c1'],
    'percentage': [0.6, 0.4]})
df = df1.set_index('ID').multiply(df2.set_index('ID')["percentage"], axis="index")
print(df)
    2018  2019  2020
ID
a1   0.6   1.2   1.8
b1   NaN   NaN   NaN
c1   3.6   4.0   4.4
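
The Expected table in the question keeps ID as an ordinary column rather than as the index. One way to get that layout, sketched here with the question's own data, is to chain `reset_index()` after the multiplication; the variable name `result` is only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    'ID':   ['a1', 'b1', 'c1'],
    '2018': [1, 5, 9],
    '2019': [2, 6, 10],
    '2020': [3, 7, 11]})
df2 = pd.DataFrame({
    'ID':         ['a1', 'b1'],
    'percentage': [0.6, 0.4]})

# Align on ID, multiply every remaining column, then turn the ID index
# back into a regular column.
result = (df1.set_index('ID')
             .multiply(df2.set_index('ID')['percentage'], axis='index')
             .reset_index())
print(result)
```

With this data the printed frame should have the same columns and values as the Expected table in the question, with NaN in the c1 row because c1 has no percentage in df2.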

If you only need to multiply some of the columns:

cols = ['2018', '2019']
df1 = df1.set_index('ID')
# Assign the products back so they replace the original columns in df1
df1[cols] = df1[cols].multiply(df2.set_index('ID')["percentage"], axis="index")
print(df1)
    2018  2019  2020
ID
a1   0.6   1.2     3
b1   NaN   NaN     7
c1   3.6   4.0    11
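
If you would rather leave ID as a column and not touch the index at all, a possible alternative (not part of the approach above) is to look up each row's percentage with `Series.map` and multiply by that. The sketch below assumes the IDs in df2 are unique and that every column to multiply has a four-digit name, which is what the regex selects; `factor` is just an illustrative name.

```python
import pandas as pd

df1 = pd.DataFrame({
    'ID':   ['a1', 'b1', 'c1'],
    '2018': [1, 5, 9],
    '2019': [2, 6, 10],
    '2020': [3, 7, 11]})
df2 = pd.DataFrame({
    'ID':         ['a1', 'c1'],
    'percentage': [0.6, 0.4]})

# Look up each row's percentage by its ID; IDs missing from df2 become NaN.
# Assumption: IDs in df2 are unique.
factor = df1['ID'].map(df2.set_index('ID')['percentage'])

# Pick the year columns by pattern instead of listing them by hand.
# Assumption: every column to multiply has a four-digit name.
cols = df1.filter(regex=r'^\d{4}$').columns

# factor shares df1's index, so the multiplication is row by row.
df1[cols] = df1[cols].mul(factor, axis=0)
print(df1)
```

Because `factor` is built from df1 itself, it lines up with df1 row by row, so the b1 row ends up as NaN and the c1 row gets 0.4, while ID stays in place as the first column.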

Hi, why do you need to set the index in the last part of the answer?

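Because without setting the index before multiplying, pandas aligns the rows on the default integer index instead of on ID, which gives incorrect output: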

# df1 here is the original frame with its default integer index (ID is still a column)
df2 = pd.DataFrame({
    'ID':         ['a1', 'c1'],
    'percentage': [0.6, 0.4]})
cols = ['2018', '2019', '2020']
df1[cols] = df1[cols].mul(df2["percentage"], axis=0)
print(df1)
   ID  2018  2019  2020
0  a1   0.6   1.2   1.8
1  b1   2.0   2.4   2.8   <- incorrect: paired with row label 1 of df2 (c1's 0.4), not with ID b1
2  c1   NaN   NaN   NaN   <- incorrect: df2 has no row label 2, so c1 gets NaN
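
To see the difference side by side, here is a small sketch that computes both versions from the same inputs without modifying df1. The names `by_position` and `by_id` are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    'ID':   ['a1', 'b1', 'c1'],
    '2018': [1, 5, 9],
    '2019': [2, 6, 10],
    '2020': [3, 7, 11]})
df2 = pd.DataFrame({
    'ID':         ['a1', 'c1'],
    'percentage': [0.6, 0.4]})
cols = ['2018', '2019', '2020']

# Default integer index on both frames: rows are paired by labels 0, 1, 2,
# so b1 (label 1) receives the percentage that belongs to c1.
by_position = df1[cols].mul(df2['percentage'], axis=0)

# ID as the index on both sides: rows are paired by ID.
by_id = df1.set_index('ID')[cols].mul(df2.set_index('ID')['percentage'], axis=0)

print(by_position)
print(by_id)
```

In `by_position` the b1 row is multiplied by 0.4 and the c1 row is NaN; in `by_id` the b1 row is NaN and the c1 row is multiplied by 0.4, which is the intended result.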
