如何根据字典中的键值创建有序的键列表?



我在Python中遇到了这个问题,我很确定它应该很容易解决。

请查找这个字典示例:

d = {
"a": "abc1",
"b": "abc1",
"c": "abc2",
"d": "abc3",
"e": "abc3",
"f": "abc3",
"g": "abc4"
}

现在我想创建一个列表,其中'a'到'g'将按顺序排列,尽可能多地混合abc值,但我需要列表中的所有键,所以:

['a','c','d','g','b','e'] 

那么'f'将被留下(因为e也有值abc3),并且可以添加到剩余列表中。

我试了如下:

s = []
for x in d: 
if len(s) < 1:
s.append(x)
if d[s[-1]] is not d[x]:
s.append(x)

但是这只会产生:

['a', 'c', 'd', 'g']

我需要回去再试一次,直到没有解决方案。

非常感谢您的时间和建议!

这是我得到工作的第一个代码。现在也去掉了f,还有别的想法吗?

d = {
"a": "abc1",
"b": "abc1",
"c": "abc2",
"d": "abc3",
"e": "abc3",
"f": "abc3",
"g": "abc4"
}
list_number = {}
key_lists = []
for key, value in d.items():
if value in list_number:
index = list_number[value] = list_number[value] + 1
else:
index = list_number[value] = 0
if index < len(key_lists):
key_lists[index].append(key)
else:
key_lists.append([key])
result = []
remaining = []
for key_list in key_lists:
if not result or len(key_list) > 1:
result.extend(key_list)
else:
remaining.extend(key_list)
print("# Result", result)
print("# Remaining:", remaining)
# Result ['a', 'c', 'd', 'g', 'b', 'e']
# Remaining: ['f']

我的解决方案是:

d = {
"a": "abc1",
"b": "abc1",
"c": "abc2",
"d": "abc3",
"e": "abc3",
"f": "abc3",
"g": "abc4"
}
counted = {}
done = False
while not done:
# Assume done
done = True
last_value = None
for key, value in d.items():
if key not in counted and value != last_value:
done = False
last_value = value
# Add the key to the result dict, the value 1 is arbitrary
counted[key] = 1
result = list(counted)
print(f"Result: {result}")

输出:

Result: ['a', 'c', 'd', 'g', 'b', 'e', 'f']

指出

  • counted字典将具有与d相同的键,但按我们想要的方式排序。这些值是任意的。我本可以使用列表,但如果d是一个大字典,则查找key not in counted将花费更长的时间。
  • while循环将继续运行,直到我们将所有键添加到counted字典中。
  • 内部for循环将遍历键/值,只添加符合if条件的键。

我在上面问的问题是把你的非正式问题变成可以很容易地在代码中实现的东西。在这种情况下,你似乎在问:

  • d中的每个key都应该出现在输出中,如果可能的话
  • 不能跟在具有相同值的键后面

要解决这个问题,它有助于找出我们可以施加哪些约束来使算法更容易。在这种情况下,困难似乎是那些共享相同值的键。这意味着我们应该首先尝试处理这些问题,因此我的问题是d a e b f c g是否是一个有效的解决方案。我看到abc3出现得最多,所以我从它开始。

在将其转换为代码时,一个关键概念是优先级队列。在标准heapq Python模块中可以实现此功能。

那么编写代码和处理边缘情况就相对简单了。我在里面放了一些注释,试图解释它在做什么:

from heapq import heapify, heappop, heappush
from collections import defaultdict
from random import shuffle, random
def fn(d: dict, *, randomize=True):
# get items from dictionary
if randomize:
# add random value to keep heap random
mk = lambda l: (-(len(l) + random()), l)
items = list(d.items())
# ensure keys under each value are randomized
shuffle(items)
else:
mk = lambda l: (-len(l), l)
items = d.items()
# collect all the keys together under their values
acc = defaultdict(list)
for k, v in items:
acc[v].append(k)
# build a priority queue, longest list of keys first
queue = [mk(ks) for _, ks in acc.items()]
heapify(queue)
# pull out distinct values
result = []
while queue:
# add an item from the longest list
_, ks = heappop(queue)
result.append(ks.pop())
# if there are other keys available
if queue:
# add an item from second longest list
_, ks2 = heappop(queue)
result.append(ks2.pop())
if ks2:
# push remaining items back onto the queue
heappush(queue, mk(ks2))
elif ks:
# no other keys available and this one has entries, so we're stuck
break
# push remaining items back onto the queue
if ks:
heappush(queue, mk(ks))
# done
return result
# test
fn({"a": 1, "b": 1, "c": 2, "d": 3, "e": 3, "f": 3, "g": 4}, randomize=False)

打印:

['f', 'b', 'e', 'a', 'c', 'd', 'g']

因为您最初标记了随机,所以当randomize=False未通过时,我使其随机化输出。这是默认情况下它所做的,例如,fn(d)可能会给出:

['d', 'a', 'f', 'b', 'c', 'e', 'g']

还有另一种方法:

from collections import defaultdict
from itertools import zip_longest, chain
d = {
"a": "abc1",
"b": "abc1",
"c": "abc2",
"d": "abc3",
"e": "abc3",
"f": "abc3",
"g": "abc4"
}
# Create and populate default dict of "value": [key1, key2, ...]
vk = defaultdict(list)
[vk[v].append(k) for k,v in d.items()]
# Zip the values together column by column and remove any columns
# with only one key because this means it's repeating.
ll = [c for c in [[b for b in a if b is not None]
for a in zip_longest(*vk.values())]
if len(c) > 1]
# Flatten the list of lists into list of keys.
keys = list(chain(*ll))
print(keys)

这回报:

['a', 'c', 'd', 'g', 'b', 'e']

工作方式是首先将d的值和键转换为vk的defaultdict:

> vk = defaultdict(list)
> [vk[v].append(k) for k,v in d.items()]
> vk
defaultdict(list,
{'abc1': ['a', 'b'],
'abc2': ['c'],
'abc3': ['d', 'e', 'f'],
'abc4': ['g']})

下一步要做几件事。zip_longest()函数返回如下:

> list(zip_longest(*dd.values()))
[('a', 'c',   'd', 'g'),
('b',  None, 'e',  None),
(None, None, 'f',  None)]

[b for b in a if b is not None]位删除了所有None项,所以现在剩下:

> [[b for b in a if b is not None] for a in zip_longest(*dd.values())]
[['a', 'c', 'd', 'g'], ['b', 'e'], ['f']]

从这里我们只保留任何比len(c) > 1的单个元素长的子列表。这是因为单个元素必须是它上面那一行的重复(参见list(zip_longest(*dd.values()))结构)。这给我们留下了列表ll的列表。

> ll
[['a', 'c', 'd', 'g'], ['b', 'e']]

我们用chain(*ll)来平化它,它返回一个迭代器,我们把它转换成一个原始键的列表,我们可以做我们想做的事情:

> keys = list(chain(*ll))
> print(keys)
['a', 'c', 'd', 'g', 'b', 'e']

相关内容

  • 没有找到相关文章

最新更新