单元格占行总数python的百分比



hi我有一个数据集,看起来很像下面的数据框架:

#Table1 :
print("Table1: Current Table")
data = [['ALFA', 35, 47, 67, 44, 193],  
['Bravo', 51, 52, 16, 8, 127], 
['Charlie', 59, 75, 2, 14, 150],  
['Delta', 59, 75, 2, 34, 170],
['Echo', 59, 75, 2, 14, 150],
['Foxtrot', 40, 43, 26, 27, 136], 
['Golf', 35, 31, 22, 13, 101], 
['Hotel', 89, 58, 24, 34, 205]]
df = pd.DataFrame(data, columns= ['Objects', 'Column1', 'Column2', 'Column3', 'Column4', 'Total'])
#df.loc[:,'Total'] = df.sum(axis=1)
print(df)

我想得到所有单元格相对于其行总数的百分比(在"总计"列中计算(,使其看起来如下:

#Table2 :
print('Table2: Expected Outcome')
data2 = [['ALFA',18.1, 24.4, 34.7, 22.8, 193], 
['Bravo',40.2, 40.9, 12.6, 6.3, 127], 
['Charlie',39.3, 50.0, 1.3, 9.3, 150], 
['Delta',34.7, 44.1, 1.2, 20.0, 170],
['Echo',39.3, 50.0, 1.3, 9.3, 150],
['Foxtrot',29.4, 31.6, 19.1, 19.9, 136],
['Hotel',34.7, 30.7, 21.8, 12.9, 101], 
['Golf',43.4, 28.3, 11.7, 16.6, 205]]
df2 = pd.DataFrame(data2, columns= ['Objects', 'Column1', 'Column2', 'Column3', 'Column4', 'Total']) #.round(decimals=1)
#df.loc[:,'Total'] = df.sum(axis=1)
print(df2)

如果总列确实发生了更改、重新计算或在此过程中必须删除,我并不真正感兴趣;但为了完整起见,最好有一个"总计"列和单元格的百分比

通过DataFrame.div:使用不带Objects/Total的所有列的快速矢量化除法

c = df.columns.difference(['Objects','Total'])
df[c] = df[c].div(df['Total'], axis=0).mul(100)
print(df)
Objects    Column1    Column2    Column3    Column4  Total
0     ALFA  18.134715  24.352332  34.715026  22.797927    193
1    Bravo  40.157480  40.944882  12.598425   6.299213    127
2  Charlie  39.333333  50.000000   1.333333   9.333333    150
3    Delta  34.705882  44.117647   1.176471  20.000000    170
4     Echo  39.333333  50.000000   1.333333   9.333333    150
5  Foxtrot  29.411765  31.617647  19.117647  19.852941    136
6     Golf  34.653465  30.693069  21.782178  12.871287    101
7    Hotel  43.414634  28.292683  11.707317  16.585366    205

您可以尝试使用apply:

df[['Column1', 'Column2', 'Column3', 'Column4']] = df[['Column1', 'Column2', 'Column3', 'Column4']].apply(lambda x: x/x.sum(), axis=1)

输出:

Table1: Current Table
Objects   Column1   Column2   Column3   Column4  Total
0     ALFA  0.181347  0.243523  0.347150  0.227979    193
1    Bravo  0.401575  0.409449  0.125984  0.062992    127
2  Charlie  0.393333  0.500000  0.013333  0.093333    150
3    Delta  0.347059  0.441176  0.011765  0.200000    170
4     Echo  0.393333  0.500000  0.013333  0.093333    150
5  Foxtrot  0.294118  0.316176  0.191176  0.198529    136
6     Golf  0.346535  0.306931  0.217822  0.128713    101
7    Hotel  0.434146  0.282927  0.117073  0.165854    205

使用相同的数据创建一个新的数据帧。使用df.column[1:-1]循环遍历数据帧中除最后一列(即Total(之外的所有列,并计算百分比。

df1=pd.DataFrame(data, columns= ['Objects', 'Column1', 'Column2', 'Column3', 'Column4', 'Total'])
for col in df.columns[1:-1]:
df1[col]=(df[col]*100/df.Total)
df1

最新更新