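I'm trying to find a way to bring the newly computed columns back into the DataFrame after a multiplication. However, I want them to replace the values under the original columns such as "2018", "2019" and "2020". Is there a way to do this together with the multiplication, especially if I have a long list of columns to multiply?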
import pandas as pd
df1 = pd.DataFrame({
'ID': ['a1', 'b1', 'c1'],
'2018': [1, 5, 9],
'2019': [2, 6, 10],
'2020': [3, 7, 11]})
df2 = pd.DataFrame({
'ID': ['a1', 'b1'],
'percentage': [0.6, 0.4]})
df1.filter(regex='2018|2019|2020').multiply(df2["percentage"], axis="index")
Expected:
ID 2018 2019 2020
0 a1 0.6 1.2 1.8
1 b1 2.0 2.4 2.8
2 c1 NaN NaN NaN
You can align the two DataFrames by converting both ID columns to the index and then operating on all of the remaining columns:
df = df1.set_index('ID').multiply(df2.set_index('ID')["percentage"], axis="index")
print(df)
2018 2019 2020
ID
a1 0.6 1.2 1.8
b1 2.0 2.4 2.8
c1 NaN NaN NaN
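The expected output in the question keeps ID as an ordinary column. A minimal sketch of one way to get that shape, assuming the same sample data as above, is to chain `reset_index()` after the multiplication (the variable name `result` is only illustrative):

```python
import pandas as pd

df1 = pd.DataFrame({
    'ID': ['a1', 'b1', 'c1'],
    '2018': [1, 5, 9],
    '2019': [2, 6, 10],
    '2020': [3, 7, 11]})
df2 = pd.DataFrame({
    'ID': ['a1', 'b1'],
    'percentage': [0.6, 0.4]})

# Align on ID, multiply row-wise, then move ID back into a regular column.
result = (df1.set_index('ID')
             .multiply(df2.set_index('ID')['percentage'], axis='index')
             .reset_index())
print(result)
```

Rows whose ID has no match in df2 (here c1) stay in the result with NaN values.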
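The alignment is by ID rather than by position, so it still works when df2 covers a different set of IDs:
df2 = pd.DataFrame({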
'ID': ['a1', 'c1'],
'percentage': [0.6, 0.4]})
df = df1.set_index('ID').multiply(df2.set_index('ID')["percentage"], axis="index")
print(df)
2018 2019 2020
ID
a1 0.6 1.2 1.8
b1 NaN NaN NaN
c1 3.6 4.0 4.4
If you only need to multiply some of the columns:
cols = ['2018','2019']
df1 = df1.set_index('ID')
df1[cols] = df1[cols].multiply(df2.set_index('ID')["percentage"], axis="index")
print(df1)
2018 2019 2020
ID
a1 0.6 1.2 3
b1 NaN NaN 7
c1 3.6 4.0 11
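For a long list of columns, typing `cols` by hand gets tedious. As a rough sketch, assuming every column to multiply is named as a four-digit year (an assumption about the naming, not something the question states), the list can be built with `filter` and assigned back the same way:

```python
import pandas as pd

df1 = pd.DataFrame({
    'ID': ['a1', 'b1', 'c1'],
    '2018': [1, 5, 9],
    '2019': [2, 6, 10],
    '2020': [3, 7, 11]}).set_index('ID')
df2 = pd.DataFrame({
    'ID': ['a1', 'c1'],
    'percentage': [0.6, 0.4]})

# Assumed convention: the columns to multiply are exactly the four-digit names.
cols = df1.filter(regex=r'^\d{4}$').columns

# Overwrite the selected columns in place with the ID-aligned product.
df1[cols] = df1[cols].multiply(df2.set_index('ID')['percentage'], axis='index')
print(df1)
```

Any column that does not match the pattern is left untouched.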
Hi, why do you set the index in the last part of the answer?
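Because multiplying without setting the index first aligns the rows by position instead of by ID, which produces incorrect output: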
df2 = pd.DataFrame({
'ID': ['a1', 'c1'],
'percentage': [0.6, 0.4]})
cols = ['2018', '2019', '2020']
df1[cols] = df1[cols].mul(df2["percentage"], axis=0)
print(df1)
ID 2018 2019 2020
0 a1 0.6 1.2 1.8
1 b1 2.0 2.4 2.8 <- incorrect result (aligned on index 1 not b1)
2 c1 NaN NaN NaN <- incorrect result (aligned on index 2 not c1)
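If you would rather keep ID as a column and not touch the index at all, one alternative (a different technique from the `set_index` approach above, shown only as a sketch) is to look up each row's percentage with `Series.map`. The name `factor` is illustrative:

```python
import pandas as pd

df1 = pd.DataFrame({
    'ID': ['a1', 'b1', 'c1'],
    '2018': [1, 5, 9],
    '2019': [2, 6, 10],
    '2020': [3, 7, 11]})
df2 = pd.DataFrame({
    'ID': ['a1', 'c1'],
    'percentage': [0.6, 0.4]})

cols = ['2018', '2019', '2020']

# Look up each row's percentage by its ID; IDs missing from df2 map to NaN.
factor = df1['ID'].map(df2.set_index('ID')['percentage'])

# factor shares df1's default index, so positional alignment is now correct.
df1[cols] = df1[cols].mul(factor, axis=0)
print(df1)
```

Here b1 ends up as NaN and c1 is multiplied by 0.4, matching the result of the index-based version. This assumes the IDs in df2 are unique.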