Suppose I have two lists:
sequence = [25, 15, 20, 15, 25, 25]
l = [(25, 'banana'),
     (25, 'apple'),
     (25, 'pine'),
     (20, 'soap'),
     (15, 'rug'),
     (15, 'cloud')]
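I want to sort the second list l according to sequence. In the example the number 25 appears several times; in that case it does not matter which tuple ends up in which position, as long as its value is 25. The two lists will always have the same length.
My current approach looks like this: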
r = list(range(len(sequence)))
for i, v in enumerate(sequence):
    for e in l:
        if e[0] == v:
            r[i] = e
            l.remove(e)
            break  # stop after the first match; l was just modified
print(r)
Possible output:
[(25, 'banana'), (15, 'rug'), (20, 'soap'), (15, 'cloud'), (25, 'apple'), (25, 'pine')]
Do you see a better way?
Thanks for your help!
马夫
Yes. First create a defaultdict with the numbers as keys and, for each key, the names as its value (as a list):
sequence = [25, 15, 20, 15, 25, 25]
l = [(25, 'banana'),
     (25, 'apple'),
     (25, 'pine'),
     (20, 'soap'),
     (15, 'rug'),
     (15, 'cloud')]
from collections import defaultdict
d = defaultdict(list)
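for i, n in l:
    d[i].append(n)
Then iterate over sequence and use list.pop to remove items from the relevant list (the one matching the number), one at a time. Each list must hold enough items and the key must be present, otherwise you get a Python exception: IndexError from popping an empty list (with a defaultdict, a missing key also ends up as an empty list rather than a KeyError):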
result = [(i,d[i].pop()) for i in sequence]
print(result)
Result:
[(25, 'pine'), (15, 'cloud'), (20, 'soap'), (15, 'rug'), (25, 'apple'), (25, 'banana')]
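The order differs from the expected output, but each number is paired with a matching name, which is the point. If you want the same order, just pop the first item instead (that is slower on a list, so if you have the choice, prefer inserting and removing items at the end of the list, which is faster):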
result = [(i,d[i].pop(0)) for i in sequence]
which gives:
[(25, 'banana'), (15, 'rug'), (20, 'soap'), (15, 'cloud'), (25, 'apple'), (25, 'pine')]
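If you want the original order without paying for pop(0), one possible variation (not part of the answer above, just a sketch) is to store each group in a collections.deque, whose popleft() removes from the front in constant time. The variable names follow the snippets above.

```python
from collections import defaultdict, deque

sequence = [25, 15, 20, 15, 25, 25]
l = [(25, 'banana'), (25, 'apple'), (25, 'pine'),
     (20, 'soap'), (15, 'rug'), (15, 'cloud')]

# Group the names by number, keeping their original order.
d = defaultdict(deque)
for i, n in l:
    d[i].append(n)

# popleft() takes the oldest item in O(1), unlike list.pop(0).
result = [(i, d[i].popleft()) for i in sequence]
print(result)
# [(25, 'banana'), (15, 'rug'), (20, 'soap'), (15, 'cloud'), (25, 'apple'), (25, 'pine')]
```

As with the list version, popleft() raises IndexError if a group runs out of items.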
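Another option is to sort with a key function that consumes the used elements of sequence by overwriting them with None (this modifies sequence, so make a copy first if you need sequence later):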
sequence = [25, 15, 20, 15, 25, 25]
l = [(25, 'banana'),
     (25, 'apple'),
     (25, 'pine'),
     (20, 'soap'),
     (15, 'rug'),
     (15, 'cloud')]
def key_func(_tuple):
    idx = sequence.index(_tuple[0])
    sequence[idx] = None
    return idx
l.sort(key=key_func)
As Jared Goguen pointed out, if you need to preserve sequence, the following wrapper will help:
def get_key_func(sequence):
    sequence_copy = sequence[:]
    def key_func(_tuple):
        idx = sequence_copy.index(_tuple[0])
        sequence_copy[idx] = None
        return idx
    return key_func
l.sort(key=get_key_func(sequence))
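To check that the wrapper really leaves sequence alone, here is a small self-contained run. It uses sorted() instead of l.sort() so that l is left untouched as well; that swap and the name ordered are my own choices for the illustration, not something the answer requires.

```python
sequence = [25, 15, 20, 15, 25, 25]
l = [(25, 'banana'), (25, 'apple'), (25, 'pine'),
     (20, 'soap'), (15, 'rug'), (15, 'cloud')]

def get_key_func(sequence):
    # Work on a private copy so the caller's list is never modified.
    sequence_copy = sequence[:]
    def key_func(_tuple):
        # First still-unused position of this number in the copy.
        idx = sequence_copy.index(_tuple[0])
        sequence_copy[idx] = None  # mark the position as used
        return idx
    return key_func

ordered = sorted(l, key=get_key_func(sequence))
print(ordered)
# [(25, 'banana'), (15, 'rug'), (20, 'soap'), (15, 'cloud'), (25, 'apple'), (25, 'pine')]
print(sequence)
# [25, 15, 20, 15, 25, 25]
```

Keep in mind that list.index is a linear scan, so the key function does O(n) work per element.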
My idea is similar to Jean's, but I use list iterators instead of the pop method (which runs in O(n) if you pop from the front, but in O(1) if you pop from the end).
>>> from collections import defaultdict
>>> supply = defaultdict(list)
>>> for k, v in l:
...     supply[k].append(v)
...
>>> supply_iter = {k:iter(v) for k,v in supply.items()}
>>> [(k, next(supply_iter[k])) for k in sequence]
[(25, 'banana'), (15, 'rug'), (20, 'soap'), (15, 'cloud'), (25, 'apple'), (25, 'pine')]
The next function also accepts an optional default value as its second argument (None would be a good choice here).
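A minimal sketch of that default in action, assuming a sequence that asks for one more 25 than l can supply (the extra 25 is made up for the illustration):

```python
from collections import defaultdict

# Hypothetical input: sequence has four 25s, but l only has three.
sequence = [25, 15, 20, 15, 25, 25, 25]
l = [(25, 'banana'), (25, 'apple'), (25, 'pine'),
     (20, 'soap'), (15, 'rug'), (15, 'cloud')]

supply = defaultdict(list)
for k, v in l:
    supply[k].append(v)

supply_iter = {k: iter(v) for k, v in supply.items()}

# With a default, an exhausted iterator yields None instead of
# raising StopIteration.
result = [(k, next(supply_iter[k], None)) for k in sequence]
print(result[-1])
# (25, None)
```

The default only covers exhausted groups. A number that never appears in l is still a missing key in supply_iter (a plain dict) and raises KeyError.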
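You can do this without pre-filling the result list before the loop and without enumerate. I don't think it is any faster, but it may be easier to understand: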
r = []
for val in sequence:
    for key, elem in l:
        if key == val:
            temp = (val, elem)
            r.append(temp)
            l.remove(temp)
            break  # stop scanning l so only one element is taken per "key"
print(r)
Another approach:
sequence = [25, 15, 20, 15, 25, 25]
list1 = [(25, 'banana'),
         (25, 'apple'),
         (25, 'pine'),
         (20, 'soap'),
         (15, 'rug'),
         (15, 'cloud')]
_dict = {}
# organised duplicates into dict
for a, b in list1:
    _dict.setdefault(a, []).append(b)
print(_dict)
index_list = []
# append based on sequence using pop to avoid duplicates
for key in sequence:
    next_in_line = _dict[key].pop(0)
    index_list.append((key, next_in_line))
print(index_list)
which gives:
{25: ['banana', 'apple', 'pine'], 20: ['soap'], 15: ['rug', 'cloud']}
[(25, 'banana'), (15, 'rug'), (20, 'soap'), (15, 'cloud'), (25, 'apple'), (25, 'pine')]