How do I compute the mean of list elements in a dictionary in Python?



Hi, I have a Python dictionary that looks like this:

{{'NN3-001': {'diffe_1':[1,2,3,4],'mas_1':[10,20,30,40],'diffe_2':[5,6,7,8],'mas_2':[50,60,70,80]}},
{'NN3-002': {'diffe_1':[14,15,16,17],'mas_1':[100,200,300,400],'diffe_2':[18,19,20,21],'mas_2':[500,600,700,800]}}}

where NN3-X is the time series id, diffe and mas are model names, and the number after the _ is the model run number.

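I want the mean of each i-th element of a list with the i-th element of the other list that has the same model name. For example, 1 from diffe_1 plus 5 from diffe_2 gives a mean of 3. The final result would look like this: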

{{'NN3-001': {'diffe':[3,4,5,6], 'mas':[30,40,50,60]}},
{'NN3-002': {'diffe':[16,17,18,19], 'mas':[300,400,500,600]}}}

Thanks.

First: your example is not a valid dictionary. The {} are misplaced in a few places.

It should be

{
'NN3-001': {'diffe_1':[1,2,3,4],'mas_1':[10,20,30,40],'diffe_2':[5,6,7,8],'mas_2':[50,60,70,80]},
'NN3-002': {'diffe_1':[14,15,16,17],'mas_1':[100,200,300,400],'diffe_2':[18,19,20,21],'mas_2':[500,600,700,800]}
}
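
As a quick check of why the original literal fails (my own illustration, with the data shortened): the extra outer braces make Python read it as a set whose elements are dictionaries, and dictionaries are not hashable, so a TypeError is raised at runtime.

```python
# Shortened copy of the question's literal; the doubled braces build a set of dicts.
try:
    broken = {{'NN3-001': {'diffe_1': [1, 2, 3, 4]}}}
    is_valid = True
except TypeError as error:
    # dicts are unhashable, so they cannot be set elements
    is_valid = False
    print('invalid literal:', error)

# Without the extra braces it is an ordinary nested dictionary.
fixed = {'NN3-001': {'diffe_1': [1, 2, 3, 4]}}
print(fixed['NN3-001']['diffe_1'])
```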

You can get a single list with

values = data['NN3-001']['diffe_1'] 

and compute its mean with

mean = sum(values)/len(values)
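
If you prefer not to write the division yourself, the standard library's statistics.mean should give the same value. A minimal sketch, assuming the corrected dictionary is stored in a variable called data (shortened here to one list):

```python
from statistics import mean

# Shortened version of the corrected dictionary, assumed to be named `data`.
data = {'NN3-001': {'diffe_1': [1, 2, 3, 4]}}

values = data['NN3-001']['diffe_1']
manual_mean = sum(values) / len(values)   # same formula as above
library_mean = mean(values)               # standard-library equivalent

print(manual_mean, library_mean)
```

Note that statistics.mean raises StatisticsError on an empty list, while the manual formula raises ZeroDivisionError.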

For all lists, you need a for loop with dict.items()

dictionary = {
'NN3-001': {'diffe_1':[1,2,3,4],'mas_1':[10,20,30,40],'diffe_2':[5,6,7,8],'mas_2':[50,60,70,80]},
'NN3-002': {'diffe_1':[14,15,16,17],'mas_1':[100,200,300,400],'diffe_2':[18,19,20,21],'mas_2':[500,600,700,800]}
}
for name, values in dictionary.items():
    print('=== time serie:', name, '===')
    for key, data in values.items():
        print('  key:', key)
        print(' data:', data)
        print(' mean:', sum(data)/len(data))
        print('---')

Result:

=== time serie: NN3-001 ===
  key: diffe_1
 data: [1, 2, 3, 4]
 mean: 2.5
---
  key: mas_1
 data: [10, 20, 30, 40]
 mean: 25.0
---
  key: diffe_2
 data: [5, 6, 7, 8]
 mean: 6.5
---
  key: mas_2
 data: [50, 60, 70, 80]
 mean: 65.0
---
=== time serie: NN3-002 ===
  key: diffe_1
 data: [14, 15, 16, 17]
 mean: 15.5
---
  key: mas_1
 data: [100, 200, 300, 400]
 mean: 250.0
---
  key: diffe_2
 data: [18, 19, 20, 21]
 mean: 19.5
---
  key: mas_2
 data: [500, 600, 700, 800]
 mean: 650.0
---

EDIT:

After the change to the question, I see you need zip(diffe_1, diffe_2) to create the pairs.

dictionary = {
'NN3-001': {'diffe_1':[1,2,3,4],'mas_1':[10,20,30,40],'diffe_2':[5,6,7,8],'mas_2':[50,60,70,80]},
'NN3-002': {'diffe_1':[14,15,16,17],'mas_1':[100,200,300,400],'diffe_2':[18,19,20,21],'mas_2':[500,600,700,800]}
}
result = {}
for name, values in dictionary.items():
    print('=== time serie:', name, '===')

    result[name] = {'diff':[], 'mas':[]}

    print('--- diffe_1, diffe_2 ---')
    for a, b in zip(values['diffe_1'],values['diffe_2']):
        mean = int( (a+b)/2 )
        print(a, '&', b, '=>', mean)
        result[name]['diff'].append(mean)

    print('--- mas_1, mas_2 ---')
    for a, b in zip(values['mas_1'],values['mas_2']):
        mean = int( (a+b)/2 )
        print(a, '&', b, '=>', mean)
        result[name]['mas'].append(mean)

print(result)

=== time serie: NN3-001 ===
--- diffe_1, diffe_2 ---
1 & 5 => 3
2 & 6 => 4
3 & 7 => 5
4 & 8 => 6
--- mas_1, mas_2 ---
10 & 50 => 30
20 & 60 => 40
30 & 70 => 50
40 & 80 => 60
=== time serie: NN3-002 ===
--- diffe_1, diffe_2 ---
14 & 18 => 16
15 & 19 => 17
16 & 20 => 18
17 & 21 => 19
--- mas_1, mas_2 ---
100 & 500 => 300
200 & 600 => 400
300 & 700 => 500
400 & 800 => 600

{
'NN3-001': {'diff': [3, 4, 5, 6], 'mas': [30, 40, 50, 60]},  
'NN3-002': {'diff': [16, 17, 18, 19], 'mas': [300, 400, 500, 600]}
}
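
One thing to keep in mind about int( (a+b)/2 ): it truncates, so it only returns the exact mean when a+b is even, as it is for every pair in this data. A small illustration with a made-up pair (1 and 2 are not from the question):

```python
a, b = 1, 2  # hypothetical pair with an odd sum

truncated = int((a + b) / 2)   # fractional part dropped -> 1
exact = (a + b) / 2            # kept as float -> 1.5

print(a, '&', b, '=>', truncated, 'vs', exact)
```

If fractional means matter for your models, drop the int() call and store the float instead.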

You can also use a loop, for prefix in ['diffe', 'mas']:, to shorten the code.

dictionary = {
'NN3-001': {'diffe_1':[1,2,3,4],'mas_1':[10,20,30,40],'diffe_2':[5,6,7,8],'mas_2':[50,60,70,80]},
'NN3-002': {'diffe_1':[14,15,16,17],'mas_1':[100,200,300,400],'diffe_2':[18,19,20,21],'mas_2':[500,600,700,800]}
}
result = {}
for name, values in dictionary.items():
    print('=== time serie:', name, '===')

    result[name] = {}

    for prefix in ['diffe', 'mas']:
        print('--- prefix:', prefix, '---')

        result[name][prefix] = []
        for a, b in zip(values[prefix+'_1'],values[prefix+'_2']):
            mean = int( (a+b)/2 )
            print(a, '&', b, '=>', mean)
            result[name][prefix].append(mean)

print(result)
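
If the model names or the number of runs are not known in advance, the prefixes can be taken from the keys instead of being hard-coded. This is only a sketch under my own assumptions: every key ends in _<run number>, all lists for one model have the same length, and the function name mean_by_model is made up for the example. It returns float means rather than ints.

```python
from collections import defaultdict

def mean_by_model(dictionary):
    """Element-wise mean of all lists that share a model name (hypothetical helper)."""
    result = {}
    for name, values in dictionary.items():
        # Group the lists by the part of the key before the last '_'
        groups = defaultdict(list)
        for key, data in values.items():
            prefix = key.rsplit('_', 1)[0]
            groups[prefix].append(data)
        # zip(*lists) yields the i-th elements of every run together
        result[name] = {
            prefix: [sum(column) / len(column) for column in zip(*lists)]
            for prefix, lists in groups.items()
        }
    return result

dictionary = {
    'NN3-001': {'diffe_1': [1, 2, 3, 4], 'mas_1': [10, 20, 30, 40],
                'diffe_2': [5, 6, 7, 8], 'mas_2': [50, 60, 70, 80]},
    'NN3-002': {'diffe_1': [14, 15, 16, 17], 'mas_1': [100, 200, 300, 400],
                'diffe_2': [18, 19, 20, 21], 'mas_2': [500, 600, 700, 800]},
}
means = mean_by_model(dictionary)
print(means)
```

Because zip stops at the shortest input, lists of unequal length would be silently cut to the shortest one; check the lengths first if that can happen in your data.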

First, your dictionary is invalid, but that may be because you only wrote two rows. You probably meant to write:

dictionary = {
    'NN3-001': {
        'diffe_1':[1,2,3,4],
        'mas_1':[10,20,30,40],
        'diffe_2':[5,6,7,8],
        'mas_2':[50,60,70,80],
    },
    'NN3-002': {
        'diffe_1':[14,15,16,17],
        'mas_1':[100,200,300,400],
        'diffe_2':[18,19,20,21],
        'mas_2':[500,600,700,800],
    },
}

For the function that computes the mean:

import numpy as np

def compute_mean(dictionary):
    new_dictionary = {}
    # Loop on 'NN3-' level
    for key, sub_dictionary in dictionary.items():
        new_sub_dictionary, accumulated_arrays = {}, {}
        # Loop on 'diffe_' level (named `values` so the built-in `list` is not shadowed)
        for sub_key, values in sub_dictionary.items():
            # Extract the sub_key without the _n
            sub_key = sub_key.split('_')[0]
            # If we already encountered this sub_key, add to the running sum
            if sub_key in new_sub_dictionary:
                new_sub_dictionary[sub_key] += np.array(values)
                accumulated_arrays[sub_key] += 1
            # If we haven't encountered this sub_key, start the sum with a count of 1
            else:
                new_sub_dictionary[sub_key] = np.array(values, dtype=float)
                accumulated_arrays[sub_key] = 1
        # Compute mean and convert back to list
        for sub_key, array in new_sub_dictionary.items():
            new_sub_dictionary[sub_key] = (array / accumulated_arrays[sub_key]).tolist()
        # Add to the main dictionary
        new_dictionary[key] = new_sub_dictionary
    return new_dictionary
