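I want to add another column that holds, for each native-country, the sum of the counts over its salary groups, repeated on every native-country & salary row.
This is what I tried: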
a = df.groupby(['native-country','salary'])[['salary']].count()
a.columns = ['number']
a['total'] = a.sum(level=0, axis=0)
Which returns:
                        number  total
native-country salary
?              <=50K       437    NaN
               >50K        146    NaN
Cambodia       <=50K        12    NaN
               >50K          7    NaN
Canada         <=50K        82    NaN
...                        ...    ...
United-States  >50K       7171    NaN
Vietnam        <=50K        62    NaN
               >50K          5    NaN
Yugoslavia     <=50K        10    NaN
               >50K          6    NaN

[82 rows x 2 columns]
Based on your comment, I think this is what you want. Your question is still unclear, though, so my assumption may be wrong.
a['total'] = a.groupby(level=0)['number'].transform('sum')
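A minimal sketch of broadcasting a per-country total back onto every row with `groupby(level=0).transform('sum')`. The sample frame is hypothetical (it is not the asker's data), and the counts are built with `size()` rather than the asker's `count()` call; the point is only that `transform` returns a result with the same MultiIndex as `a`, so the assignment aligns row by row.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's df
df = pd.DataFrame({
    'native-country': ['Canada', 'Canada', 'Canada', 'Vietnam', 'Vietnam'],
    'salary': ['<=50K', '<=50K', '>50K', '<=50K', '>50K'],
})

# Number of rows per (country, salary) pair
a = df.groupby(['native-country', 'salary']).size().to_frame('number')

# transform keeps the original MultiIndex, so each country's sum
# is repeated on every salary row of that country
a['total'] = a.groupby(level=0)['number'].transform('sum')
print(a)
```

With this sample, both Canada rows get a total of 3 and both Vietnam rows get a total of 2.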
Try this:
a = df.groupby(['native-country','salary'])[['salary']].count()
a.join(a.groupby(level=0).sum().rename(columns={'salary':'Total'}))
Here is an MVCE:
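import seaborn as sns
import pandas as pd
df_tips = sns.load_dataset('tips')
# Sum of tips per (sex, smoker) pair
df_out = df_tips.groupby(['sex','smoker'])[['tip']].sum()
# DataFrame.sum(level=...) was removed in pandas 2.0; group on the index level instead
df_out.join(df_out.groupby(level=0).sum().rename(columns={'tip':'total'}))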
Output:
                  tip   total
sex    smoker
Male   Yes     183.07  485.07
       No      302.00  485.07
Female Yes      96.74  246.51
       No      149.77  246.51
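Since `sns.load_dataset` fetches the data over the network, the same join can also be sketched on a small hand-built frame. The values below are made up for illustration and the names `totals` and `result` are hypothetical; the sketch assumes the totals frame's index is named after one level of the MultiIndex, which is what lets `join` line the two up.

```python
import pandas as pd

# Hypothetical stand-in for the tips data (values are made up)
df = pd.DataFrame({
    'sex': ['Male', 'Male', 'Female', 'Female', 'Male'],
    'smoker': ['Yes', 'No', 'Yes', 'No', 'No'],
    'tip': [1.0, 2.0, 3.0, 4.0, 5.0],
})

# Sum of tips per (sex, smoker) pair
df_out = df.groupby(['sex', 'smoker'])[['tip']].sum()

# Per-sex totals, indexed by the single level 'sex'
totals = df_out.groupby(level='sex').sum().rename(columns={'tip': 'total'})

# join matches the index name 'sex' against the MultiIndex level of the same name
result = df_out.join(totals)
print(result)
```

Here every Male row carries a total of 8.0 and every Female row a total of 7.0.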