How to say in pandas: "if elements in the same column are the same, then calculate the average of corresponding valu


     origin_destination_country  average_delay_mins
0                       ALBANIA                0.00
1                       ALBANIA               13.68
2                       ALBANIA                0.00
3                       ALBANIA                0.00
4                       ALBANIA               79.50
...                         ...                 ...
6273                        USA                0.00
6274                 UZBEKISTAN               27.32
6275                     ZAMBIA               16.08
6276                   ZIMBABWE             1165.00
6277                   ZIMBABWE              102.97

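How can I calculate the mean of average_delay_mins for each country? My idea is to average the values that share the same origin_destination_country name and store the results in another list that has no duplicate country names.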

Try this code and let me know if it works.

import pandas as pd

df = pd.DataFrame({
    'origin_destination_country': ['ALBANIA', 'ALBANIA', 'ALBANIA', 'USA', 'ZIMBABWE', 'ZIMBABWE'],
    'average_delay_mins': [0.00, 13.68, 0.00, 0.00, 1165.00, 102.97]
})

# get the unique country names
list_of_countries = df['origin_destination_country'].unique()

res = []
for i in range(len(list_of_countries)):
    # get the delay values of all rows with this country name
    get_series = df[df['origin_destination_country'] == list_of_countries[i]]['average_delay_mins'].tolist()
    res.append(sum(get_series) / len(get_series))
print(res)
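
The loop above returns the means in order of first appearance, but without the country labels the question asks to keep. A minimal sketch on the same sample data that stores each country next to its mean; the name `means_by_country` is only illustrative and not from the original answer:

```python
import pandas as pd

df = pd.DataFrame({
    'origin_destination_country': ['ALBANIA', 'ALBANIA', 'ALBANIA', 'USA', 'ZIMBABWE', 'ZIMBABWE'],
    'average_delay_mins': [0.00, 13.68, 0.00, 0.00, 1165.00, 102.97],
})

# means_by_country is a hypothetical name: maps each unique country to its mean delay
means_by_country = {}
for country in df['origin_destination_country'].unique():
    # select the delay values of the rows that belong to this country
    delays = df.loc[df['origin_destination_country'] == country, 'average_delay_mins']
    means_by_country[country] = delays.mean()

print(means_by_country)
```

Because a dict cannot hold duplicate keys, each country appears exactly once.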

Thanks to Naufal_Hilmiaji and Code_Difference, I managed to find a solution. It ended up like this:

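import pandas as pd

# '...' marks rows omitted here for brevity; use the full data when running this
df = pd.DataFrame({
    'origin_destination_country': ['ALBANIA', 'ALBANIA', 'ALBANIA', 'USA', ..., 'ZIMBABWE', 'ZIMBABWE'],
    'average_delay_mins': [0.00, 13.68, 0.00, 0.00, ..., 1165.00, 102.97]
})
# group the rows by country and take the mean of each group
data = df.groupby('origin_destination_country').mean()
print(data)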
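
Since the snippet above elides most rows with `...`, here is a runnable sketch of the same groupby approach on just the six sample rows from the earlier answer. Selecting the delay column explicitly and passing `as_index=False` are my own choices, not part of the original solution, and `result` is an illustrative name:

```python
import pandas as pd

df = pd.DataFrame({
    'origin_destination_country': ['ALBANIA', 'ALBANIA', 'ALBANIA', 'USA', 'ZIMBABWE', 'ZIMBABWE'],
    'average_delay_mins': [0.00, 13.68, 0.00, 0.00, 1165.00, 102.97],
})

# as_index=False keeps the country as a regular column instead of the index;
# selecting 'average_delay_mins' limits the mean to that one column
result = (
    df.groupby('origin_destination_country', as_index=False)['average_delay_mins']
      .mean()
)
print(result)
```

The result has one row per country, so no duplicate names remain, and `result['average_delay_mins'].tolist()` gives the plain list of means if that is what is needed.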
