Efficiently sorting a list according to a sequence



Suppose I have two lists:

sequence = [25, 15, 20, 15, 25, 25]
l = [(25, 'banana'), 
     (25, 'apple'), 
     (25, 'pine'), 
     (20, 'soap'), 
     (15, 'rug'), 
     (15, 'cloud')]

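I want to sort the second list, l, according to sequence. In the example the number 25 appears several times; in that case it does not matter which of the tuples with the value 25 ends up in which position. The two lists will always have the same length.

My current approach looks like this: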

r = list(range(len(sequence)))
for i, v in enumerate(sequence):
    for e in l:
        if e[0] == v:
            r[i] = e
            l.remove(e)
            break  # stop at the first match; l was just modified, so do not keep iterating over it
print(r)

Possible output:

[(25, 'banana'), (15, 'rug'), (20, 'soap'), (15, 'cloud'), (25, 'apple'), (25, 'pine')]

Do you see a better way?

Thanks for your help!

马夫

Yes. First build a defaultdict with the numbers as keys and, for each key, the matching names as the value (as a list):

from collections import defaultdict

sequence = [25, 15, 20, 15, 25, 25]
l = [(25, 'banana'),
     (25, 'apple'),
     (25, 'pine'),
     (20, 'soap'),
     (15, 'rug'),
     (15, 'cloud')]

d = defaultdict(list)
for i, n in l:
    d[i].append(n)

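Then iterate over sequence and use list.pop to remove one item at a time from the list for the matching number. Every list must hold enough items and every key must be present, otherwise Python raises an exception: popping from an empty list raises IndexError, and because d is a defaultdict, a missing key produces an empty list (and therefore the same IndexError) rather than a KeyError: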

result = [(i,d[i].pop()) for i in sequence]
print(result)

Result:

[(25, 'pine'), (15, 'cloud'), (20, 'soap'), (15, 'rug'), (25, 'apple'), (25, 'banana')]

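The order differs from the expected output, but the numbers are matched with names, which is the point. If you want the same order, just remove the first item instead (this performs worse on a list, so if you have the choice, prefer inserting and removing items at the end of the list, which is faster):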

result = [(i,d[i].pop(0)) for i in sequence]

This gives:

[(25, 'banana'), (15, 'rug'), (20, 'soap'), (15, 'cloud'), (25, 'apple'), (25, 'pine')]
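
As a side note on the performance remark above, here is a minimal sketch that keeps the original order without paying for pop(0). It swaps the lists for collections.deque, which is my substitution rather than part of the answer; deque.popleft() removes from the front in O(1). The names number and name are just illustrative loop variables.

```python
from collections import defaultdict, deque

sequence = [25, 15, 20, 15, 25, 25]
l = [(25, 'banana'), (25, 'apple'), (25, 'pine'),
     (20, 'soap'), (15, 'rug'), (15, 'cloud')]

# Group the names by number, keeping each group in a deque
# (an assumption of this sketch; the answer above uses plain lists).
d = defaultdict(deque)
for number, name in l:
    d[number].append(name)

# popleft() takes the oldest name of each group in O(1),
# so names come out in the order they appeared in l.
result = [(number, d[number].popleft()) for number in sequence]
print(result)
```

As with the list version, popleft() raises IndexError if a number in sequence has no names left.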

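Another option is to sort with a key function that consumes the elements of sequence it has used by replacing them with None (this approach modifies sequence, so make a copy if you need sequence later):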

sequence = [25, 15, 20, 15, 25, 25]
l = [(25, 'banana'), 
     (25, 'apple'), 
     (25, 'pine'), 
     (20, 'soap'), 
     (15, 'rug'), 
     (15, 'cloud')]
def key_func(_tuple):
    idx = sequence.index(_tuple[0])
    sequence[idx] = None
    return idx
l.sort(key=key_func)

As Jared Goguen pointed out, if you need to keep sequence intact, the following wrapper helps:

def get_key_func(sequence):
    sequence_copy = sequence[:]
    def key_func(_tuple):
        idx = sequence_copy.index(_tuple[0])
        sequence_copy[idx] = None
        return idx
    return key_func
l.sort(key=get_key_func(sequence))
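
To check that the wrapper really leaves sequence alone, here is a small self-contained run. It uses sorted() instead of l.sort() so that l is not modified either; the name ordered is only illustrative. Keep in mind that list.index is a linear search, so this approach does O(n) work per element.

```python
sequence = [25, 15, 20, 15, 25, 25]
l = [(25, 'banana'), (25, 'apple'), (25, 'pine'),
     (20, 'soap'), (15, 'rug'), (15, 'cloud')]

def get_key_func(sequence):
    sequence_copy = sequence[:]            # work on a copy, not the caller's list
    def key_func(_tuple):
        idx = sequence_copy.index(_tuple[0])
        sequence_copy[idx] = None          # mark this slot as used
        return idx
    return key_func

# sorted() returns a new list, so both l and sequence stay as they were.
ordered = sorted(l, key=get_key_func(sequence))
print(ordered)
print(sequence)
```

If l contains a number that is not (or no longer) available in sequence, index() raises ValueError.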

My idea is similar to Jean's, but I use list iterators instead of the pop method (which runs in O(n) if you pop from the front but in O(1) if you pop from the end).

>>> from collections import defaultdict
>>> supply = defaultdict(list)
>>> for k, v in l:
...     supply[k].append(v)
... 
>>> supply_iter = {k:iter(v) for k,v in supply.items()}
>>> [(k, next(supply_iter[k])) for k in sequence]
[(25, 'banana'), (15, 'rug'), (20, 'soap'), (15, 'cloud'), (25, 'apple'), (25, 'pine')]

next also accepts an optional default value as its second argument (None would be a good choice here).
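
A rough sketch of what the default buys you, using deliberately mismatched input that I made up for illustration (sequence asks for a second 25 and for a 30 that l does not contain). The .get(k, empty) lookup is my own addition: supply_iter is a plain dict, so supply_iter[k] would raise KeyError for a number that never appeared in l.

```python
from collections import defaultdict

# Hypothetical, mismatched input: not the data from the question.
sequence = [25, 15, 25, 30]
l = [(25, 'banana'), (15, 'rug')]

supply = defaultdict(list)
for k, v in l:
    supply[k].append(v)

supply_iter = {k: iter(v) for k, v in supply.items()}
empty = iter(())  # stand-in iterator for numbers that never appeared in l

# next(iterator, None) returns None instead of raising StopIteration
# once an iterator has run out of names.
result = [(k, next(supply_iter.get(k, empty), None)) for k in sequence]
print(result)
```

Whether a silent None is better than an exception depends on whether a mismatch between the two lists should be treated as an error.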

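You can do this without setting up the array before the loop and without enumerate. I don't think it is any faster, but it may be easier to understand: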

r = []
for val in sequence:
    for key, elem in l:
        if key == val:
            temp = (val, elem)
            r.append(temp)
            l.remove(temp)
            break # break the loop thru element to avoid having 2 elements of the same "key"
print(r)

Another approach:

sequence = [25, 15, 20, 15, 25, 25]
list1 = [(25, 'banana'), 
     (25, 'apple'), 
     (25, 'pine'), 
     (20, 'soap'), 
     (15, 'rug'), 
     (15, 'cloud')]
     
_dict = {}
# organised duplicates into dict
for a, b in list1 :
    _dict.setdefault(a, []).append(b)
print(_dict)
index_list = []
# append based on sequence using pop to avoid duplicates 
for key in sequence:
    next_in_line = _dict[key].pop(0)
    index_list.append((key, next_in_line))
   
print(index_list)

{25: ['banana', 'apple', 'pine'], 20: ['soap'], 15: ['rug', 'cloud']}
[(25, 'banana'), (15, 'rug'), (20, 'soap'), (15, 'cloud'), (25, 'apple'), (25, 'pine')]
