Filter two lists of dictionaries by two values and group the matching items together



I have two lists of dictionaries, say:

List_D1 = [{'Symbol':'GFX','Time':'9:36am', 'Change':-0.18, 'Volume':181800},
            {'Symbol':'AIG','Time':'9:36am', 'Change':-0.15, 'Volume': 195500},
            {'Symbol':'AXP','Time':'9:36am', 'Change':-0.46, 'Volume': 935000},
            ]
List_D2 = [{'Symbol':'AA','Time':'7:36am', 'Change':-0.08, 'Volume':181800},
            {'Symbol':'AIG','Time':'9:36am', 'Change':0.99, 'Volume': 197500},
            {'Symbol':'GFX','Time':'9:36am', 'Change':-0.46, 'Volume': 935000},
            ]

I want to pick the items from the two separate lists that have the same 'Symbol' AND 'Time' values. In the example above, these should be paired:

Pair 1:

List_D1 : {'Symbol':'AIG','Time':'9:36am', 'Change':-0.15, 'Volume': 195500} 
List_D2 : {'Symbol':'AIG','Time':'9:36am', 'Change':0.99, 'Volume': 197500}

Pair 2:

List_D1 :{'Symbol':'GFX','Time':'9:36am', 'Change':-0.18, 'Volume':181800}
List_D2 :{'Symbol':'GFX','Time':'9:36am', 'Change':-0.46, 'Volume': 935000}

Right now I just loop over every entry in both lists of dictionaries. Is there a better, more efficient way to do this?

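I was thinking of using Python's itemgetter to sort(List_D1+List_D2), and then using the groupby function to group the whole sorted list by the components I want to pair on. But if I do that, I can't tell which list each item came from.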

Here is my source code:
from operator import itemgetter
from itertools import groupby
ListsBoth = List_D1 + List_D2
key1 = 'Symbol'
key2 = 'Time'
grouper = itemgetter(key1, key2)
ListsBoth.sort(key=grouper)
for key, testItem in groupby(ListsBoth, key=grouper):
    # Here I can group all items with the same 'Symbol' AND 'Time' values together, but I lose the original "List" info (which list each item in a group came from), and I need it for my application.
    ...  # handle each item in testItem

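You can convert each list of dicts into a dict keyed by a (Symbol, Time) tuple, and then do a simple lookup between the two to build the pairs you need, e.g.: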

In []:
D1 = {(d['Symbol'], d['Time']): d for d in List_D1}
D2 = {(d['Symbol'], d['Time']): d for d in List_D2}
[(D1.get(k, None), D2.get(k, None)) for k in set(D1) | set(D2)]
Out[]:
[({'Change': -0.18, 'Symbol': 'GFX', 'Time': '9:36am', 'Volume': 181800},
  {'Change': -0.46, 'Symbol': 'GFX', 'Time': '9:36am', 'Volume': 935000}),
 ({'Change': -0.15, 'Symbol': 'AIG', 'Time': '9:36am', 'Volume': 195500},
  {'Change': 0.99, 'Symbol': 'AIG', 'Time': '9:36am', 'Volume': 197500}),
 ({'Change': -0.46, 'Symbol': 'AXP', 'Time': '9:36am', 'Volume': 935000}, None),
 (None, {'Change': -0.08, 'Symbol': 'AA', 'Time': '7:36am', 'Volume': 181800})]

You can eliminate any unmatched pairs by changing it to:

[(D1[k], D2[k]) for k in D1 if k in D2]

Now you can iterate over each pair and do whatever you need to, e.g.:

In []:
results = [(D1[k], D2[k]) for k in D1 if k in D2]
for l1, l2 in results:
    print(l1, l2)
Out[]:
{'Symbol': 'GFX', 'Time': '9:36am', 'Change': -0.18, 'Volume': 181800} {'Symbol': 'GFX', 'Time': '9:36am', 'Change': -0.46, 'Volume': 935000}
{'Symbol': 'AIG', 'Time': '9:36am', 'Change': -0.15, 'Volume': 195500} {'Symbol': 'AIG', 'Time': '9:36am', 'Change': 0.99, 'Volume': 197500}
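
One thing to watch for: if the same (Symbol, Time) pair can occur more than once within a single list, the D1 and D2 dict comprehensions keep only the last entry for each key. Below is a minimal sketch that keeps every match instead. The helper names index_by and pair_rows are hypothetical and not part of the answer above.

```python
from collections import defaultdict
from itertools import product
from operator import itemgetter

List_D1 = [{'Symbol': 'GFX', 'Time': '9:36am', 'Change': -0.18, 'Volume': 181800},
           {'Symbol': 'AIG', 'Time': '9:36am', 'Change': -0.15, 'Volume': 195500},
           {'Symbol': 'AXP', 'Time': '9:36am', 'Change': -0.46, 'Volume': 935000}]
List_D2 = [{'Symbol': 'AA', 'Time': '7:36am', 'Change': -0.08, 'Volume': 181800},
           {'Symbol': 'AIG', 'Time': '9:36am', 'Change': 0.99, 'Volume': 197500},
           {'Symbol': 'GFX', 'Time': '9:36am', 'Change': -0.46, 'Volume': 935000}]


def index_by(rows, *keys):
    """Map each combination of key values to the list of rows that carry it."""
    getter = itemgetter(*keys)
    index = defaultdict(list)
    for row in rows:
        index[getter(row)].append(row)
    return index


def pair_rows(left, right, *keys):
    """Return every (left_row, right_row) combination that shares the key values."""
    left_index = index_by(left, *keys)
    right_index = index_by(right, *keys)
    pairs = []
    for key, left_rows in left_index.items():
        # Keys missing on the right side yield an empty product, so no pair.
        pairs.extend(product(left_rows, right_index.get(key, [])))
    return pairs


pairs = pair_rows(List_D1, List_D2, 'Symbol', 'Time')
for l1, l2 in pairs:
    print(l1, l2)
```

Each element of a pair keeps its position (first from List_D1, second from List_D2), so the origin of every item stays known. Building both indexes takes one pass over each list.
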
List_D1 = [{'Symbol': 'GFX', 'Time': '9:36am', 'Change': -0.18, 'Volume': 181800},
           {'Symbol': 'AIG', 'Time': '9:36am', 'Change': -0.15, 'Volume': 195500},
           {'Symbol': 'AXP', 'Time': '9:36am', 'Change': -0.46, 'Volume': 935000},
           ]
List_D2 = [{'Symbol': 'AA', 'Time': '7:36am', 'Change': -0.08, 'Volume': 181800},
           {'Symbol': 'AIG', 'Time': '9:36am', 'Change': 0.99, 'Volume': 197500},
           {'Symbol': 'GFX', 'Time': '9:36am', 'Change': -0.46, 'Volume': 935000},
           ]
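# Build a composite "Symbol_Time" key per dictionary; lists (not map objects) are needed so that .index() works in Python 3.
b = [x.get('Symbol') + '_' + x.get('Time') for x in List_D1]
c = [x.get('Symbol') + '_' + x.get('Time') for x in List_D2]
# Pair up the dictionaries whose composite key occurs in both lists.
e = [(List_D1[b.index(x)], List_D2[c.index(x)]) for x in set(b) & set(c)]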
for i in e:
    print(i)

You can also use itertools.groupby and then keep only the groups that contain more than one item:

import itertools
List_D1 = [{'Symbol':'GFX','Time':'9:36am', 'Change':-0.18, 'Volume':181800},
        {'Symbol':'AIG','Time':'9:36am', 'Change':-0.15, 'Volume': 195500},
        {'Symbol':'AXP','Time':'9:36am', 'Change':-0.46, 'Volume': 935000},
        ]
List_D2 = [{'Symbol':'AA','Time':'7:36am', 'Change':-0.08, 'Volume':181800},
        {'Symbol':'AIG','Time':'9:36am', 'Change':0.99, 'Volume': 197500},
        {'Symbol':'GFX','Time':'9:36am', 'Change':-0.46, 'Volume': 935000},
        ]
d = [(a, list(b)) for a, b in itertools.groupby(sorted(List_D1+List_D2, key=lambda x:(x['Symbol'], x['Time'])), key=lambda x:(x['Symbol'], x['Time']))]
final_data = {a:b for a, b in d if len(b) > 1}

Output:

{('AIG', '9:36am'): [{'Symbol': 'AIG', 'Time': '9:36am', 'Change': -0.15, 'Volume': 195500}, {'Symbol': 'AIG', 'Time': '9:36am', 'Change': 0.99, 'Volume': 197500}], ('GFX', '9:36am'): [{'Symbol': 'GFX', 'Time': '9:36am', 'Change': -0.18, 'Volume': 181800}, {'Symbol': 'GFX', 'Time': '9:36am', 'Change': -0.46, 'Volume': 935000}]}
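
The grouped lists above do not record which input list each dictionary came from, which is what the question asks for. One possible way to keep that information is to tag every item with a source label before sorting. The labels 'D1' and 'D2' and the variable names below are illustrative assumptions, not part of the original answer.

```python
from itertools import groupby
from operator import itemgetter

List_D1 = [{'Symbol': 'GFX', 'Time': '9:36am', 'Change': -0.18, 'Volume': 181800},
           {'Symbol': 'AIG', 'Time': '9:36am', 'Change': -0.15, 'Volume': 195500},
           {'Symbol': 'AXP', 'Time': '9:36am', 'Change': -0.46, 'Volume': 935000}]
List_D2 = [{'Symbol': 'AA', 'Time': '7:36am', 'Change': -0.08, 'Volume': 181800},
           {'Symbol': 'AIG', 'Time': '9:36am', 'Change': 0.99, 'Volume': 197500},
           {'Symbol': 'GFX', 'Time': '9:36am', 'Change': -0.46, 'Volume': 935000}]

key_of = itemgetter('Symbol', 'Time')

# Attach a source label to every dictionary before merging the two lists.
tagged = [('D1', d) for d in List_D1] + [('D2', d) for d in List_D2]
# groupby only merges adjacent items, so sort by the grouping key first.
tagged.sort(key=lambda pair: key_of(pair[1]))

matches = {}
for key, group in groupby(tagged, key=lambda pair: key_of(pair[1])):
    by_source = {}
    for source, d in group:
        by_source.setdefault(source, []).append(d)
    # Keep only keys that occur in both lists.
    if len(by_source) == 2:
        matches[key] = by_source

for key, by_source in matches.items():
    print(key, by_source['D1'], by_source['D2'])
```

Checking that both labels are present, rather than testing len(b) > 1, also avoids treating two items from the same list as a match.
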
