如何在两个数据帧之间取平均值

我有两个数据帧:

{'id': {4: 1548638, 6: 1953603, 7: 1956216, 8: 1962245, 9: 1981386, 10: 1981773, 11: 2004787, 13: 2017418, 14: 2020989, 15: 2045043}, 'total': {4: 17, 6: 38, 7: 59, 8: 40, 9: 40, 10: 40, 11: 80, 13: 44, 14: 51, 15: 46}}
{'id': {4: 1548638, 6: 1953603, 7: 1956216, 8: 1962245, 9: 1981386, 10: 1981773, 11: 2004787, 13: 2017418, 14: 2020989, 15: 2045043}, 'total': {4: 17, 6: 38, 7: 59, 8: 40, 9: 40, 10: 40, 11: 80, 13: 44, 14: 51, 15: 46}}

对于存在于两个数据框中的每个'id'，我想在'total'中计算它们值的平均值，并将其放在一个新的数据框中。

我试着:

pd.merge(df1, df2, on="id")

希望我能做到:

merged_df[['total']].mean(axis=1)

但是它根本不起作用。

你怎么能这么做?

您可以使用:

df1.merge(df2, on='id').set_index('id').mean(axis=1).reset_index(name='total')

或者，如果您有许多列，使用更通用的方法:

(df1.merge(df2, on='id', suffixes=(None, '_other')).set_index('id')
.rename(columns=lambda x: x.removesuffix('_other')) # requires python 3.9+
.groupby(axis=1, level=0)
.mean().reset_index()
)

输出:

id  total
0  1548638   17.0
1  1953603   38.0
2  1956216   59.0
3  1962245   40.0
4  1981386   40.0
5  1981773   40.0
6  2004787   80.0
7  2017418   44.0
8  2020989   51.0
9  2045043   46.0

你可以这样做:

df1 = pd.DataFrame({'id': {4: 1548638, 6: 1953603, 7: 1956216, 8: 1962245, 9: 1981386, 10: 1981773, 11: 2004787, 13: 2017418, 14: 2020989, 15: 2045043}, 'total': {4: 17, 6: 38, 7: 59, 8: 40, 9: 40, 10: 40, 11: 80, 13: 44, 14: 51, 15: 46}})
df2 = pd.DataFrame({'id': {4: 1548638, 6: 1953603, 7: 1956216, 8: 1962245, 9: 1981386, 10: 1981773, 11: 2004787, 13: 2017418, 14: 2020989, 15: 2045043}, 'total': {4: 17, 6: 38, 7: 59, 8: 40, 9: 40, 10: 40, 11: 80, 13: 44, 14: 51, 15: 46}})
merged_df = df1.merge(df2, on='id')
merged_df['total_mean'] = merged_df.filter(regex='total').mean(axis=1)
print(merged_df)

输出:

id  total_x  total_y  total_mean
0  1548638       17       17        17.0
1  1953603       38       38        38.0
2  1956216       59       59        59.0
3  1962245       40       40        40.0
4  1981386       40       40        40.0
5  1981773       40       40        40.0
6  2004787       80       80        80.0
7  2017418       44       44        44.0
8  2020989       51       51        51.0
9  2045043       46       46        46.0

相关内容

最新更新

热门标签：