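I have a list of tuples, each holding a patient and a visit; a patient can have multiple visits.
I want to get the list of patients together with each patient's list of visits.
For example: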
[(patient1, visit), (patient2, visit), (patient1, visit)]
[(patient1, [visit, visit]), (patient2, [visit])]
I tried an approach modeled on JavaScript's reduce function, but I can't really work out how to do this in Python.
The defaultdict approach is the standard one and has linear complexity. You can also just use a plain dict with dict.setdefault:
d = {}
for patient, visit in data:
    d.setdefault(patient, []).append(visit)
[*d.items()]
# [('patient1', ['visit', 'visit']), ('patient2', ['visit'])]
For a one-liner (not counting the imports), although it is only log-linear because of the sort, you can use itertools.groupby:
from itertools import groupby
from operator import itemgetter as ig
[(k, [*map(ig(1), g)]) for k, g in groupby(sorted(data), key=ig(0))]
# [('patient1', ['visit', 'visit']), ('patient2', ['visit'])]
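One thing to watch with the groupby one-liner: sorted(data) compares whole tuples, so the visits within each patient get reordered as well. Here is a minimal sketch, assuming you want each patient's visits to stay in their original input order. It sorts on the patient only and relies on Python's sort being stable. The sample data and visit labels are made up for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample data; the visit labels are invented for this sketch.
data = [("patient1", "visit_b"), ("patient2", "visit_c"), ("patient1", "visit_a")]

# Sort by patient only. sorted() is stable, so visits keep their input order.
by_patient = sorted(data, key=itemgetter(0))

# groupby only merges adjacent items, which is why the sort comes first.
grouped = [
    (patient, [visit for _, visit in group])
    for patient, group in groupby(by_patient, key=itemgetter(0))
]
print(grouped)
# [('patient1', ['visit_b', 'visit_a']), ('patient2', ['visit_c'])]
```

With plain sorted(data), patient1's visits would come out as ['visit_a', 'visit_b'] instead.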
Some useful documentation:
itertools.groupby
dict.setdefault
operator.itemgetter
map
collections.defaultdict
You can use collections.defaultdict like this:
from collections import defaultdict
d = defaultdict(list)
for patient, visit in data:
    d[patient].append(visit)
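The loop leaves you with a dictionary, not the list of tuples the question asks for. A minimal end-to-end sketch, assuming made-up sample data and Python 3.7+ (where dicts preserve insertion order), could look like this:

```python
from collections import defaultdict

# Hypothetical sample data for illustration.
data = [("patient1", "visit1"), ("patient2", "visit2"), ("patient1", "visit3")]

# Missing keys are initialized to an empty list on first access.
d = defaultdict(list)
for patient, visit in data:
    d[patient].append(visit)

# Convert the mapping into the requested list of (patient, visits) tuples.
result = list(d.items())
print(result)
# [('patient1', ['visit1', 'visit3']), ('patient2', ['visit2'])]
```

Patients appear in order of first occurrence, and each patient's visits keep their input order, with no sorting needed.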
Usage example:
from itertools import groupby
# Example data
records = [('bill', '1/1/2021'), ('mary', '1/2/2021'), ('janet', '1/3/2021'), ('bill', '3/5/2021'), ('mary', '4/25/2021')]
# Group visits by patient names
# Sort first so identical names are adjacent, then group on the first element of each tuple (the name)
g = groupby(sorted(records), lambda kv: kv[0])
# Use a list comprehension over the groups to produce the desired tuples
result = [(name, [d[1] for d in visit]) for name, visit in g]
The code above as a one-liner:
result = [(name, [d[1] for d in visit]) for name, visit in groupby(sorted(records), lambda kv: kv[0])]