The records are stored in the database in this form:

clu  name    a   b
q    one     7   6
q    two     6   9
q    five    6  10
e    three   8   7
e    four   10   4
I don't know how your original data is stored (I'm not familiar with "applications.objects.values"), but here is code that computes those averages from a simple list of lists:
data = [
['q','one',7,6],
['q','two',7,6],
['q','five',7,6],
['e','three',7,6],
['e','four',7,6]
]
e_dict = {
'one': {'u_mean': 4.25, 'c_mean': 4.25},
'three': {'u_mean': 4.5, 'c_mean': 4.5},
'two': {'u_mean': 4.583333333333334, 'c_mean': 4.583333333333334},
'four': {'u_mean': 4.5625, 'c_mean': 4.5625},
'five': {'u_mean': 4.65, 'c_mean': 4.65}
}
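def group_names():
    sums = {}    # running totals per group: {clu: {"u_mean": ..., "c_mean": ...}}
    counts = {}  # number of rows seen per group
    for h in data:
        # h[0] is the group key (clu), h[1] is the name looked up in e_dict
        if h[0] not in sums:
            sums[h[0]] = { "u_mean": 0, "c_mean": 0 }
            counts[h[0]] = 0
        for k,v in e_dict[h[1]].items():
            sums[h[0]][k] += v
        counts[h[0]] += 1
    # turn the totals into averages
    for k,v in sums.items():
        sums[k]['u_mean'] /= counts[k]
        sums[k]['c_mean'] /= counts[k]
    return sums

print(group_names())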
Output:
{'q': {'u_mean': 4.4944444444444445, 'c_mean': 4.4944444444444445}, 'e': {'u_mean': 4.53125, 'c_mean': 4.53125}}
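As a variation on the loop above, here is a minimal sketch that collects the lookup entries per group first and then averages them with the standard library. The function name group_means and its parameters are hypothetical, not part of the original code. It assumes every row has the shape [clu, name, ...] and that every name is a key in e_dict.

```python
from collections import defaultdict
from statistics import mean

data = [
    ['q', 'one', 7, 6],
    ['q', 'two', 7, 6],
    ['q', 'five', 7, 6],
    ['e', 'three', 7, 6],
    ['e', 'four', 7, 6],
]

e_dict = {
    'one': {'u_mean': 4.25, 'c_mean': 4.25},
    'three': {'u_mean': 4.5, 'c_mean': 4.5},
    'two': {'u_mean': 4.583333333333334, 'c_mean': 4.583333333333334},
    'four': {'u_mean': 4.5625, 'c_mean': 4.5625},
    'five': {'u_mean': 4.65, 'c_mean': 4.65},
}


def group_means(rows, lookup):
    """Average every metric in `lookup` over the rows sharing a group key.

    Hypothetical helper: assumes row[0] is the group and row[1] is the name.
    """
    grouped = defaultdict(list)
    for group, name, *_ in rows:
        # A name missing from `lookup` raises KeyError here.
        grouped[group].append(lookup[name])
    return {
        group: {key: mean(entry[key] for entry in entries) for key in entries[0]}
        for group, entries in grouped.items()
    }


result = group_means(data, e_dict)
print(result)
```

Passing the rows and the lookup table as arguments, instead of reading module-level globals, makes the function easier to reuse with data from another source. The last digits of the floats may differ slightly from the loop version because statistics.mean sums differently.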
You can use pandas:
Input:
data = {'clu': {0: 'q', 1: 'q', 2: 'q', 3: 'e', 4: 'e'}, 'name': {0: 'one', 1: 'two', 2: 'five', 3: 'three', 4: 'four'}, 'a': {0: 7, 1: 6, 2: 6, 3: 8, 4: 10}, 'b': {0: 6, 1: 9, 2: 10, 3: 7, 4: 4}}
e_dict = {'one': {'u_mean': 4.25, 'c_mean': 4.25}, 'three': {'u_mean': 4.5, 'c_mean': 4.5}, 'two': {'u_mean': 4.583333333333334, 'c_mean': 4.583333333333334}, 'four': {'u_mean': 4.5625, 'c_mean': 4.5625}, 'five': {'u_mean': 4.65, 'c_mean': 4.65}}
Code:
import pandas as pd
df = pd.DataFrame(data)
df_e = pd.DataFrame(e_dict).T.rename_axis('name').reset_index()
df = df.merge(df_e, on=['name'])
df = df.groupby(['clu']).agg({'u_mean':'mean', 'c_mean':'mean'})
df.to_dict(orient='index')
Output:
{'e': {'u_mean': 4.53125, 'c_mean': 4.53125},
'q': {'u_mean': 4.4944444444444445, 'c_mean': 4.4944444444444445}}
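Explanation:
pandas lets you work with data in tabular form, much like the data you showed in your example. The first table (a DataFrame) is df. The second table holds the lookup values (e_dict) after a little preprocessing: it is transposed, and its index is renamed to name so that it becomes a regular column.
We then merge the two tables on their name values, so that each row of the first table gets its corresponding u_mean and c_mean.
Next we group by the clu values and aggregate with mean.
Finally, the table is returned as a dictionary.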
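The steps described above can also be wrapped in one function. This is a minimal sketch, not a definitive implementation: the helper name cluster_means is hypothetical, and the left join with an explicit check for missing names is an addition of mine. In the original code, a name absent from e_dict would be dropped silently by the default inner merge.

```python
import pandas as pd

data = {
    'clu': {0: 'q', 1: 'q', 2: 'q', 3: 'e', 4: 'e'},
    'name': {0: 'one', 1: 'two', 2: 'five', 3: 'three', 4: 'four'},
    'a': {0: 7, 1: 6, 2: 6, 3: 8, 4: 10},
    'b': {0: 6, 1: 9, 2: 10, 3: 7, 4: 4},
}

e_dict = {
    'one': {'u_mean': 4.25, 'c_mean': 4.25},
    'three': {'u_mean': 4.5, 'c_mean': 4.5},
    'two': {'u_mean': 4.583333333333334, 'c_mean': 4.583333333333334},
    'four': {'u_mean': 4.5625, 'c_mean': 4.5625},
    'five': {'u_mean': 4.65, 'c_mean': 4.65},
}


def cluster_means(data, e_dict):
    """Hypothetical helper: mean of u_mean and c_mean per clu value."""
    df = pd.DataFrame(data)

    # Step 1: one row per name, with u_mean and c_mean as columns.
    df_e = pd.DataFrame(e_dict).T.rename_axis('name').reset_index()

    # Step 2: attach the lookup values to each record.
    # how='left' keeps records whose name has no lookup entry (as NaN);
    # validate='many_to_one' fails if a name occurs twice in df_e.
    merged = df.merge(df_e, on='name', how='left', validate='many_to_one')
    missing = merged.loc[merged['u_mean'].isna(), 'name'].tolist()
    if missing:
        raise KeyError(f"names without lookup values: {missing}")

    # Step 3: group by clu and average the two lookup columns.
    means = merged.groupby('clu')[['u_mean', 'c_mean']].mean()

    # Step 4: return {clu: {'u_mean': ..., 'c_mean': ...}}.
    return means.to_dict(orient='index')


result = cluster_means(data, e_dict)
print(result)
```

Printing merged inside the function is a quick way to see the intermediate table, with a, b, u_mean and c_mean side by side for each name.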