假设我想在Python 2.7中实现一个解决方案。
我有一个字符串列表,例如a=[AA','BB','CC','DD']。
期望的输出将是a的一组不相交子集,例如a_1、a_2。。。A_N,使得
(A_1 U A_2 U ... U A_N) = A,
(A_1 ∩ A_2 ∩ ... ∩ A_N) = Ø,
同时考虑A中元素的顺序(A_1,A_2,…,A_N不能包含A中的非相邻元素)。
对于A,这些将是:
A_1,A_2。。。A_N:
- ['AA','BB','CC','DD']
- ['AA'],['b','CC','DD']
- ['AA','BB'],['CC','DD']
- ['AA','BB','CC'],['DD']
- ['AA'],['b'],[c'],[d'd']
- ['AA','BB'],[CC'],[DD']
- ['AA'],['b','CC'],[DD']
- ['AA'],['b'],'CC','DD']
(希望我没有遗漏,但我想你明白了)
重点是效率——这意味着相对快速,不会浪费太多内存。我知道,对于一个更大的列表,组合的数量可能会激增,但我的列表永远不会超过5个元素。
我在这里找到了一个类似问题的答案,唯一的区别是我想要所有的子集,而他们只需要最大长度为2的子集。
该解决方案相当于找到整数的所有可能组合,这些整数的总和为n(输入列表的长度),然后将该解决方案重新映射到单词列表以找到其子集。
他们问题的伪代码:
push an empty list onto the stack;
while (the stack is not empty) {
pop the top list off the stack;
if (the sum of its entries is n)
add it to the solution set;
else if (the sum of its entries is less than n)
add a 1 to a copy of the list and push it onto the stack;
add a 2 to a copy of the list and push it onto the stack;
}
}
此问题的伪代码(扩展):
push an empty list onto the stack;
while (the stack is not empty) {
pop the top list off the stack;
if (the sum of its entries is n)
add it to the solution set;
else if (the sum of its entries is less than n)
for j = 1:n {
add j to a copy of the list and push it onto the stack;
}
}
}
我的Python实现:
import copy
def generate_subsets(words):
# get length of word list
list_len = len(words)
# initialize stack, subset_lens list
stack = [[], ]
subset_lens = []
while stack:
current_item = stack.pop(-1)
if sum(current_item) == list_len:
subset_lens.append(current_item)
elif sum(current_item) < list_len:
for j in range(1, list_len+1):
new_item = copy.deepcopy(current_item)
new_item.append(j)
stack.append(new_item)
# remap subset lengths to actual word subsets
subsets = []
for subset_len in subset_lens:
subset = []
starting_index = 0
for index in subset_len:
subset.append('_'.join(words[starting_index:starting_index+index]))
starting_index+= index
subsets.append(subset)
return subsets
输入:
generate_subsets(['AA', 'BB', 'CC', 'DD'])
输出:
['AA_BB_CC_DD']
['AA_BB_CC', 'DD']
['AA_BB', 'CC_DD']
['AA_BB', 'CC', 'DD']
['AA', 'BB_CC_DD']
['AA', 'BB_CC', 'DD']
['AA', 'BB', 'CC_DD']
['AA', 'BB', 'CC', 'DD']
如果有人找到更有效的解决方案,我很乐意在回复/评论中看到它!