我正在尝试构建一个pivot_table
,它包含两个数据帧(df1
和df2
(的值差异以及agg平均值和标准偏差。
包含df1 =
gender year statistics s_values
male 1999 cigarette use 100
male 1999 cellphone use 310
male 1999 internet use 101
male 1999 alcohol use 100
包含df2 =
gender year statistics s_values
female 1999 cigarette use 156
female 1999 cellphone use 198
female 1999 internet use 232
female 1999 alcohol use 243
所需输出为my_pivot_table =
year statistics male female difference mean std
1999 cigarette use 100.0 156.0 56.0 128.0 28.0
1999 cellphone use 310.0 198.0 112.0 254.0 56.0
1999 internet use 101.0 232.0 78.0 166.5 65.5
1999 alcohol use 100.0 243.0 143.0 171.5 50.0
我将df1
和df2
合并到一个名为merged_df
的数据帧中,我不确定这是否是构建所需数据透视表的正确步骤。。。我不知道从这里到哪里male
列包含df1
的s_values
,阴性列包含df2
的s_values
包含merged_df =
gender year statistics s_values
male 1999 cigarette use 100
male 1999 cellphone use 310
male 1999 internet use 101
male 1999 alcohol use 100
female 1999 cigarette use 156
female 1999 cellphone use 198
female 1999 internet use 232
female 1999 alcohol use 243
尝试:
x = pd.concat([df1, df2]).pivot(
index=["year", "statistics"], columns="gender", values="s_values"
)
x["difference"] = (x["male"] - x["female"]).abs()
x["mean"] = x[["male", "female"]].mean(axis=1)
x["std"] = x[["female", "male"]].std(axis=1, ddof=0)
print(x.reset_index().rename_axis("", axis=1))
打印:
year statistics female male difference mean std
0 1999 alcohol use 243 100 143 171.5 71.5
1 1999 cellphone use 198 310 112 254.0 56.0
2 1999 cigarette use 156 100 56 128.0 28.0
3 1999 internet use 232 101 131 166.5 65.5