How to merge two list items with a matching substring and add up the other substring

I have a list that looks like this (shortened for this post, since the real one is very long):

[<Cell R105C123 1>, <Cell R27C123 8>, <Cell R139C115 1>, <Cell R139C115 1>, <Cell R19C111 2>, <Cell R19C119 2>]

Suppose two items in this list have substrings that match except for the last number, for example:

<Cell R66C127 0.5>, <Cell R66C127 1>

I need to merge the two into a single item, in this case:

<Cell R66C127 1.5>

I assume this calls for a list comprehension, but my experience is very limited. How can I do this?

One approach is to use collections.defaultdict to aggregate the values, then build the result list from it:

from collections import defaultdict

lst = ["<Cell R105C123 1>", "<Cell R27C123 8>", "<Cell R139C115 1>",
       "<Cell R139C115 1>", "<Cell R19C111 2>", "<Cell R19C119 2>",
       "<Cell R66C127 0.5>", "<Cell R66C127 1>"]

# Missing keys start at 0.0, so values can be accumulated directly
lookup = defaultdict(float)
for e in lst:
    # "<Cell R66C127 0.5>" -> ["Cell", "R66C127", "0.5"]
    _, key, value = e.strip("<>").split()
    lookup[key] += float(value)

# Rebuild the strings; dicts keep insertion order (Python 3.7+)
res = [f"<Cell {key} {value}>" for key, value in lookup.items()]
print(res)

Output

['<Cell R105C123 1.0>', '<Cell R27C123 8.0>', '<Cell R139C115 2.0>', '<Cell R19C111 2.0>', '<Cell R19C119 2.0>', '<Cell R66C127 1.5>']
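
Because the values are summed as floats, whole numbers come back as 1.0, 8.0 and so on, while the original items showed 1 and 8. If that matters, one possible tweak (a sketch, not part of the answer above) is to format the totals with the "g" presentation type. The function name merge_cells and the short sample list are made up for illustration:

```python
from collections import defaultdict

def merge_cells(items):
    """Sum the trailing number of entries that share the same cell key."""
    totals = defaultdict(float)
    for item in items:
        _, key, value = item.strip("<>").split()
        totals[key] += float(value)
    # ":g" drops a trailing ".0", so 2.0 is rendered as "2" and 1.5 stays "1.5"
    return [f"<Cell {key} {total:g}>" for key, total in totals.items()]

# Hypothetical sample, shortened from the list in the question
sample = ["<Cell R139C115 1>", "<Cell R139C115 1>",
          "<Cell R66C127 0.5>", "<Cell R66C127 1>"]
print(merge_cells(sample))
# ['<Cell R139C115 2>', '<Cell R66C127 1.5>']
```

Keep in mind that "g" rounds to six significant digits by default and switches to exponent notation for large values, so it only suits small numbers like the ones shown here.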

Note that the approach above is a general solution. If the elements to be merged are always consecutive, you can use itertools.groupby instead:

from itertools import groupby

lst = ["<Cell R105C123 1>", "<Cell R27C123 8>", "<Cell R139C115 1>",
       "<Cell R139C115 1>", "<Cell R19C111 2>", "<Cell R19C119 2>",
       "<Cell R66C127 0.5>", "<Cell R66C127 1>"]

def key(e):
    # Cell reference, e.g. "R66C127"
    _, k, _ = e.strip("<>").split()
    return k

def value(e):
    # Trailing number as a float, e.g. 0.5
    _, _, v = e.strip("<>").split()
    return float(v)

# groupby only groups adjacent items that share the same key
res = [f"<Cell {k} {sum(map(value, group))}>" for k, group in groupby(lst, key=key)]
print(res)

Output

['<Cell R105C123 1.0>', '<Cell R27C123 8.0>', '<Cell R139C115 2.0>', '<Cell R19C111 2.0>', '<Cell R19C119 2.0>', '<Cell R66C127 1.5>']
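
The "always consecutive" condition matters because groupby starts a new group every time the key changes. As a rough illustration (the helper names cell_key, cell_value and merge_consecutive, and the scattered list, are invented for this sketch), duplicates that are not adjacent stay unmerged unless the list is sorted by key first:

```python
from itertools import groupby

def cell_key(e):
    # Cell reference, e.g. "R66C127"
    return e.strip("<>").split()[1]

def cell_value(e):
    # Trailing number as a float
    return float(e.strip("<>").split()[2])

def merge_consecutive(items):
    """Merge runs of adjacent items that share the same cell key."""
    return [f"<Cell {k} {sum(map(cell_value, g))}>"
            for k, g in groupby(items, key=cell_key)]

# Hypothetical input where the two R66C127 entries are NOT adjacent
scattered = ["<Cell R66C127 0.5>", "<Cell R19C111 2>", "<Cell R66C127 1>"]

unsorted_result = merge_consecutive(scattered)
print(unsorted_result)
# ['<Cell R66C127 0.5>', '<Cell R19C111 2.0>', '<Cell R66C127 1.0>']

# Sorting by key makes equal keys adjacent, at the cost of the original order
sorted_result = merge_consecutive(sorted(scattered, key=cell_key))
print(sorted_result)
# ['<Cell R19C111 2.0>', '<Cell R66C127 1.5>']
```

Sorting changes the order of the result (and compares keys as plain strings), so when the original order matters and duplicates can appear anywhere, the defaultdict version is probably the simpler choice.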
