下面是输入数据
Name A B C D E F G Total
Ray 1 2 2 0 0 0 0 5
Tom 0 0 0 2 1 0 0 3
Sam 0 0 0 0 0 3 1 4
以下是预期输出
Name A B C D E F G Total A:B:C D:E F:G
Ray 1 2 2 0 0 0 0 5 20:40:40 0:0 0:0
Tom 0 0 0 2 1 0 0 3 0:0:0 67:33 0:0
Sam 0 0 0 0 0 3 1 4 0:0:0 0:0 75:25
想法是在列表cols
中创建列组,然后在循环中将所选列除以sum
,替换NaN
s,四舍五入并转换为整数,最后一个连接字符串:
#check columns names
print (df.columns.tolist())
['A', 'B', 'C', 'D', 'E', 'F', 'G', 'Total']
cols = [('A','B','C'), ('D','E'), ('F','G')]
for c in cols:
#for test
print (c)
print(df.loc[:, c])
df[f'{":".join(c)}'] = (df.loc[:, c]
.div(df.loc[:, c].sum(axis=1), axis=0)
.fillna(0)
.mul(100)
.round()
.astype(int)
.astype(str)
.agg(':'.join, axis=1))
print (df)
A B C D E F G Total A:B:C D:E F:G
0 1 2 2 0 0 0 0 5 20:40:40 0:0 0:0
1 0 0 0 2 1 0 0 3 0:0:0 67:33 0:0
2 0 0 0 0 0 3 1 4 0:0:0 0:0 75:25