Forming a nested list and computing the sum of fields



The records are stored in the database in this form:

clu  name   a   b
q    one    7   6
q    two    6   9
q    five   6   10
e    three  8   7
e    four   10  4

I don't know how your raw data is stored (I don't recognize "applications.objects.values"), but the code below computes these averages from a simple list of lists:


# Each row is [clu, name, a, b]; only clu and name are used below.
data = [
    ['q', 'one', 7, 6],
    ['q', 'two', 7, 6],
    ['q', 'five', 7, 6],
    ['e', 'three', 7, 6],
    ['e', 'four', 7, 6]
]

# Lookup table: per-name means to be averaged within each clu group.
e_dict = {
    'one': {'u_mean': 4.25, 'c_mean': 4.25},
    'three': {'u_mean': 4.5, 'c_mean': 4.5},
    'two': {'u_mean': 4.583333333333334, 'c_mean': 4.583333333333334},
    'four': {'u_mean': 4.5625, 'c_mean': 4.5625},
    'five': {'u_mean': 4.65, 'c_mean': 4.65}
}

def group_names():
    sums = {}
    counts = {}
    for h in data:
        # First time we see this group: start its running totals at zero.
        if h[0] not in sums:
            sums[h[0]] = {"u_mean": 0, "c_mean": 0}
            counts[h[0]] = 0
        # Add this name's values to the totals of its group.
        for k, v in e_dict[h[1]].items():
            sums[h[0]][k] += v
        counts[h[0]] += 1
    # Turn the totals into means by dividing by the group size.
    for k in sums:
        sums[k]['u_mean'] /= counts[k]
        sums[k]['c_mean'] /= counts[k]
    return sums

print(group_names())

Output:

{'q': {'u_mean': 4.4944444444444445, 'c_mean': 4.4944444444444445}, 'e': {'u_mean': 4.53125, 'c_mean': 4.53125}}
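
As a variation on the function above, the inputs can be passed as arguments rather than read from module-level globals. This is only a sketch of one possible restructuring, not part of the original answer: the name `group_means` and its parameters `rows` and `lookup` are hypothetical, and it assumes every name in `rows` is present in `lookup`.

```python
from collections import defaultdict
from statistics import mean

def group_means(rows, lookup):
    """Average every lookup field per group (the first column of each row)."""
    grouped = defaultdict(list)
    for row in rows:
        group, name = row[0], row[1]
        # Collect the lookup entry of each name under its group.
        grouped[group].append(lookup[name])
    return {
        group: {field: mean(entry[field] for entry in entries)
                for field in entries[0]}
        for group, entries in grouped.items()
    }

rows = [['q', 'one'], ['q', 'two'], ['q', 'five'], ['e', 'three'], ['e', 'four']]
lookup = {
    'one': {'u_mean': 4.25, 'c_mean': 4.25},
    'three': {'u_mean': 4.5, 'c_mean': 4.5},
    'two': {'u_mean': 4.583333333333334, 'c_mean': 4.583333333333334},
    'four': {'u_mean': 4.5625, 'c_mean': 4.5625},
    'five': {'u_mean': 4.65, 'c_mean': 4.65},
}

result = group_means(rows, lookup)
print(result)
```

Because the fields are taken from the lookup entries themselves, this version would also work if the lookup held fields other than `u_mean` and `c_mean`.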

You can use pandas:

Input:

data = {'clu': {0: 'q', 1: 'q', 2: 'q', 3: 'e', 4: 'e'}, 'name': {0: 'one', 1: 'two', 2: 'five', 3: 'three', 4: 'four'}, 'a': {0: 7, 1: 6, 2: 6, 3: 8, 4: 10}, 'b': {0: 6, 1: 9, 2: 10, 3: 7, 4: 4}}
e_dict = {'one': {'u_mean': 4.25, 'c_mean': 4.25}, 'three': {'u_mean': 4.5, 'c_mean': 4.5}, 'two': {'u_mean': 4.583333333333334, 'c_mean': 4.583333333333334}, 'four': {'u_mean': 4.5625, 'c_mean': 4.5625}, 'five': {'u_mean': 4.65, 'c_mean': 4.65}}

Code:

import pandas as pd
df = pd.DataFrame(data)
df_e = pd.DataFrame(e_dict).T.rename_axis('name').reset_index()
df = df.merge(df_e, on=['name'])
df = df.groupby(['clu']).agg({'u_mean':'mean', 'c_mean':'mean'})
df.to_dict(orient='index')

Output:

{'e': {'u_mean': 4.53125, 'c_mean': 4.53125},
'q': {'u_mean': 4.4944444444444445, 'c_mean': 4.4944444444444445}}

Explanation:

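Pandas lets you work with data in tabular form, much like the data you showed in your example. The first table (DataFrame) is df. The second table holds the lookup values (e_dict) after a little preprocessing: it is transposed, and its index is turned into a regular name column.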
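
To make the preprocessing step concrete, here is a small sketch that uses only two of the lookup entries and applies the transpose and index reset one step at a time. The shortened `e_dict` is an assumption made for brevity.

```python
import pandas as pd

# Shortened lookup table (assumption: two entries are enough to show the shape).
e_dict = {
    'one': {'u_mean': 4.25, 'c_mean': 4.25},
    'three': {'u_mean': 4.5, 'c_mean': 4.5},
}

# A dict of dicts puts the outer keys in the columns,
# so transpose to get one row per name.
df_e = pd.DataFrame(e_dict).T

# Label the index 'name', then move it into a regular column for merging.
df_e = df_e.rename_axis('name').reset_index()

print(df_e.columns.tolist())
print(df_e['name'].tolist())
```

`pd.DataFrame.from_dict(e_dict, orient='index')` should give the same one-row-per-name layout without the explicit transpose.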

Next, we merge the two tables on their name values, so that each row of the first table gets its matching u_mean and c_mean values.
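
One thing to keep in mind is that `merge` performs an inner join by default, so a row whose name has no entry in the lookup table is dropped without warning. The sketch below shows one way to detect that, using `how='left'` together with `indicator=True`. The name 'six' is a made-up value added purely to trigger the mismatch.

```python
import pandas as pd

# 'six' is a hypothetical name with no lookup entry.
df = pd.DataFrame({'clu': ['q', 'q', 'e'], 'name': ['one', 'two', 'six']})
df_e = pd.DataFrame({
    'name': ['one', 'two'],
    'u_mean': [4.25, 4.583333333333334],
    'c_mean': [4.25, 4.583333333333334],
})

# Default inner join: the unmatched row disappears.
inner = df.merge(df_e, on='name')

# Left join with an indicator column keeps every row of df
# and records whether a match was found.
checked = df.merge(df_e, on='name', how='left', indicator=True)
missing = checked.loc[checked['_merge'] == 'left_only', 'name'].tolist()

print(len(inner), missing)
```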

Now we group the rows by their clu value and aggregate each group by taking the mean.
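
The grouping step can also be written by selecting the columns and calling `mean()` directly, which may read more naturally when every column uses the same aggregation. The sketch below compares both spellings on a hand-built version of the merged table. The frame `merged` is an assumption standing in for the result of the merge.

```python
import pandas as pd

# Stand-in for the merged table (values taken from e_dict above).
merged = pd.DataFrame({
    'clu': ['q', 'q', 'q', 'e', 'e'],
    'u_mean': [4.25, 4.583333333333334, 4.65, 4.5, 4.5625],
    'c_mean': [4.25, 4.583333333333334, 4.65, 4.5, 4.5625],
})

# Dictionary form: one aggregation named per column.
by_agg = merged.groupby('clu').agg({'u_mean': 'mean', 'c_mean': 'mean'})

# Column-selection form: the same aggregation applied to all chosen columns.
by_mean = merged.groupby('clu')[['u_mean', 'c_mean']].mean()

print(by_agg)
print(by_mean)
```

The dictionary form remains the better fit when different columns need different aggregations.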

Finally, we return the table as a dictionary.
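
The `orient` argument decides how the dictionary is nested. As a brief sketch, the grouped result is rebuilt by hand below (an assumption, to keep the example self-contained) and converted both ways.

```python
import pandas as pd

# Hand-built copy of the grouped result shown in the output above.
means = pd.DataFrame(
    {'u_mean': [4.53125, 4.4944444444444445],
     'c_mean': [4.53125, 4.4944444444444445]},
    index=pd.Index(['e', 'q'], name='clu'),
)

# orient='index': outer keys are the row labels (the clu groups).
by_row = means.to_dict(orient='index')

# Default orient: outer keys are the column names.
by_col = means.to_dict()

print(by_row)
print(by_col)
```

With `orient='index'` the result is keyed by group first, which matches the shape produced by the plain-Python version.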
