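I'm doing a fairly simple computation: I find entries with identical key/value pairs in a list of dictionaries and combine them by summing. Suppose the data is:
Edit: name and id are arbitrary names used for the example; my real dictionaries are much larger and I match on multiple keys.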
Input:
quantities = [
    {
        "name": "first",
        "id": "1234",
        "quantity": 10
    },
    {
        "name": "first",
        "id": "1234",
        "quantity": 30
    },
    {
        "name": "another",
        "id": "0000",
        "quantity": 10
    },
    {
        "name": "first",
        "id": "1234",
        "quantity": 40
    },
    {
        "name": "another",
        "id": "0000",
        "quantity": 10
    }
]
I'm curious how to do this the "Pythonic" way, avoiding nested loops as much as possible.
Here is what I have now, which I'm not happy with:
for entry in quantities:
    for compare in quantities:
        if id(entry) != id(compare):
            if (entry["name"] == compare["name"]) and (entry["id"] == compare["id"]):
                entry["quantity"] = entry["quantity"] + compare["quantity"]
                quantities.remove(compare)
Any tips or suggestions are appreciated, thanks!
Use another dictionary and group on the keys, by which I mean "name" and "id" (although, isn't "id" enough on its own?).
Something like:
grouper = {}
for q in quantities:
    key = q['name'], q['id']
    if key in grouper:
        grouper[key]['quantity'] += q['quantity']
    else:
        grouper[key] = q.copy()
quantities = list(grouper.values())
In the REPL:
In [1]: quantities = [
   ...:     {
   ...:         "name":"first",
   ...:         "id":"1234",
   ...:         "quantity":10
   ...:     },
   ...:     {
   ...:         "name":"first",
   ...:         "id":"1234",
   ...:         "quantity":30
   ...:     },
   ...:     {
   ...:         "name":"another",
   ...:         "id":"0000",
   ...:         "quantity":10
   ...:     }
   ...: ]

In [2]: grouper = {}

In [3]: for q in quantities:
   ...:     key = q['name'], q['id']
   ...:     if key in grouper:
   ...:         grouper[key]['quantity'] += q['quantity']
   ...:     else:
   ...:         grouper[key] = q.copy()
   ...:

In [4]: grouper
Out[4]:
{('first', '1234'): {'name': 'first', 'id': '1234', 'quantity': 40},
 ('another', '0000'): {'name': 'another', 'id': '0000', 'quantity': 10}}
You can then get your new list directly from the values:
In [5]: list(grouper.values())
Out[5]:
[{'name': 'first', 'id': '1234', 'quantity': 40},
{'name': 'another', 'id': '0000', 'quantity': 10}]
This approach takes linear time and linear space.
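Since the question mentions matching on several keys, the same single-pass idea can be wrapped in a small helper. This is only a sketch: the function name `merge_by_keys` and its parameters `key_fields` and `sum_field` are made up for illustration and are not part of the answer above.

```python
def merge_by_keys(records, key_fields, sum_field):
    """Combine dicts that share the same values for key_fields by summing sum_field.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    grouper = {}
    for record in records:
        # Build a hashable tuple key from the chosen fields.
        key = tuple(record[field] for field in key_fields)
        if key in grouper:
            grouper[key][sum_field] += record[sum_field]
        else:
            # Copy so the caller's dictionaries are not modified.
            grouper[key] = record.copy()
    return list(grouper.values())


quantities = [
    {"name": "first", "id": "1234", "quantity": 10},
    {"name": "first", "id": "1234", "quantity": 30},
    {"name": "another", "id": "0000", "quantity": 10},
]

merged = merge_by_keys(quantities, ("name", "id"), "quantity")
print(merged)
```

Each record is visited once and each dictionary lookup is constant time on average, which is where the linear running time comes from.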
Note that q.copy() creates a shallow copy, which is fine here, but may not be if your dictionaries contain mutable values.
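To make the shallow-copy caveat concrete, here is a small sketch. The "tags" field is hypothetical (the question's dictionaries hold only strings and integers); it stands in for any mutable value such as a list or nested dict.

```python
import copy

# "tags" is an invented field used only to show the effect of a mutable value.
original = {"name": "first", "id": "1234", "tags": ["a"]}

shallow = original.copy()       # new dict, but the "tags" list is shared
deep = copy.deepcopy(original)  # new dict with its own copy of "tags"

shallow["tags"].append("b")     # also visible through original["tags"]

print(original["tags"])  # ['a', 'b']
print(deep["tags"])      # ['a']
```

If the dictionaries can contain such values, `copy.deepcopy` is one way to keep the merged result independent of the input.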
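Also note that you may want to reconsider your data structure. Do you really want a list? If you have a unique key and want to be able to find objects quickly by that key, you probably want some kind of dictionary.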
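As one possible shape for such a dictionary, the totals could be stored directly under a (name, id) tuple key. This is a sketch that assumes "quantity" is the only value worth keeping besides the key fields; the variable name `totals` is made up for illustration.

```python
from collections import defaultdict

quantities = [
    {"name": "first", "id": "1234", "quantity": 10},
    {"name": "first", "id": "1234", "quantity": 30},
    {"name": "another", "id": "0000", "quantity": 10},
]

# Missing keys start at 0, so no "if key in ..." check is needed.
totals = defaultdict(int)
for q in quantities:
    totals[(q["name"], q["id"])] += q["quantity"]

# Lookup by key is a single dictionary access.
print(totals[("first", "1234")])  # 40
```

With this layout there is no list to scan when you need the total for a particular name and id.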
Method 1: Using groupby and reduce
from itertools import groupby
from functools import reduce

grouper = ("name", "id")  # keys to group by

def merge(d1, d2):
    """Merge two dictionaries by summing the values of keys not in grouper."""
    return {k: v if k in grouper else v + d2.get(k, 0) for k, v in d1.items()}

# lst is assumed to be the list of dictionaries to merge.
# Sort the list in place by the grouper keys (in place to save space).
lst.sort(key=lambda d: [d[key] for key in grouper])

# Merge the dicts in each group using the merge function.
outputlist = [reduce(merge, g) for _, g in groupby(lst, lambda d: [d[key] for key in grouper])]
outputlist:
[{'name': 'another', 'id': '0000', 'quantity': 10},
 {'name': 'first', 'id': '1234', 'quantity': 40}]
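The sort in Method 1 matters because `itertools.groupby` only groups consecutive items with equal keys. The sketch below shows the difference on a deliberately shuffled list; using `operator.itemgetter` as the key function is a substitution for the lambda above, chosen here for brevity.

```python
from itertools import groupby
from operator import itemgetter

lst = [
    {"name": "first", "id": "1234", "quantity": 10},
    {"name": "another", "id": "0000", "quantity": 10},
    {"name": "first", "id": "1234", "quantity": 30},
]

key = itemgetter("name", "id")  # returns a (name, id) tuple

# Without sorting, the two "first" entries are not adjacent,
# so they end up in separate groups.
unsorted_groups = [k for k, _ in groupby(lst, key)]

# After sorting by the same key, equal keys are adjacent
# and each key yields exactly one group.
sorted_groups = [k for k, _ in groupby(sorted(lst, key=key), key)]

print(unsorted_groups)
print(sorted_groups)
```

The sort makes this approach O(n log n), compared with the single pass of the dictionary-based answer.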
Method 2: Using Pandas
A one-liner that avoids all explicit loops (it essentially replicates Method 1)
import pandas as pd
outputlist = pd.DataFrame(lst).groupby(['name', 'id']).sum().reset_index().to_dict('records')
outputlist:
[{'name': 'another', 'id': '0000', 'quantity': 10},
{'name': 'first', 'id': '1234', 'quantity': 40}]
Explanation
pd.DataFrame(lst) - generate pandas DataFrame from list of dictionaries
groupby(['name', 'id']) - group rows by name & id
sum() - sum the non-grouped values in each group
reset_index() - reset index back to 0, 1, 2, ...
to_dict('records') - convert to list of dictionaries
with each row data as dictionary