所有五个字母的单词,三个字符重复?



我创建了 5 个循环,这些循环穿过字母表中的每个字母,然后将它们全部添加到列表中。然后我拿起那个列表,对于每个字母,我检查该字母是否有 3 个字符重复,然后中断。但是,这显然效率极低,我的计算机需要很长时间才能返回结果。有没有人对我如何做到这一点并改进运行时有更好的想法? 例如:cccde,yxxxz 顺便说一句,我无法导入数学,随机,字符串以外的其他库

#generate all possible 5 letter combinations without repeats 
for one in alpha:
for two in alpha:
for three in alpha:
for four in alpha: 
for five in alpha:
key = str(one+two+three+four+five)
if (key in list1) == False:
list1.append(key)
#filter down to ones with three repeated letters:     
for word in list1:
for letter in word: 
if word.count(letter) == 3:
list2.append(word)
break```
  • 使用字母表上的itertools.product生成所有 3 个字母的字符串,重复 3。
  • 遍历这些排列;对于每个排列,每个字母三倍,给出三个不同的 5 个字母的解决方案。 例如,从abc,你将生成aaabcabbbcabccc
  • 删除重复项:带有双倍(或三倍(字母的 3 个字母单词将产生重复项。 例如,aab将以两种不同的方式生成aaaab

这是否足够提示?

所有有效字符串最多只有 3 个不同的字母。要么重复第一个,要么重复第二个,要么第三个。我们需要一些测试来防止之前生成的字符串再次添加:

alpha = list('abcdefghijklmnopqrstuvwxyz')
l = []
for one in alpha:
for two in alpha:
for three in alpha:
l.append(one+one+one+two+three)
if one != two:
l.append(one+two+two+two+three)
if two != three:
l.append(one+two+three+three+three)
print(len(l), l)

如果您将问题解释为重复的字母不需要连续,则程序可能会更改为:

alpha = list('abcdefghijklmnopqrstuvwxyz')
l = []
for one in alpha:
for two in alpha:
for three in alpha:
l.append(one+one+one+two+three)
if one != two:
l.append(one + one + two + one + three)
l.append(one + two + one + one + three)
l.append(two + one + one + one + three)
if one != three:
l.append(one + one + two + three + one)
l.append(one + two + one + three + one)
l.append(one + two + three + one + one)
l.append(two + one + one + three + one)
l.append(two + one + three + one + one)
l.append(two + three + one + one + one)
print(len(l))
l = set(l)
print(len(l))

这将打印两次165776。当然,省略重复测试并转换为集合更简单,但乐趣不那么有趣。

代码可以很容易地调整为不允许 4 或 5 次重复的字母,只需有一个周围的 if-test 来检查one != two and one != three。在这种情况下,有 162500 个解决方案。

165776 也是@AlexHall的计数,如果代码更改为允许 3 或 4 次重复。和 162500 用于正好 3 次重复的情况。

为了完整起见:至少连续三个字母的情况有 51376 个解决方案。而正好三个连续字母的情况有50700个解决方案。

当前算法的最大问题是检查if (key in list1) == False:。没有理由这样做,因为循环中永远不会有任何重复项,并且它使您的算法成为 O(n^2( 而不是 O(n(,其中 n 是列表的大小。此代码在合理的时间内运行:

import string
alpha = string.ascii_lowercase
list1 = []
for one in alpha:
for two in alpha:
for three in alpha:
for four in alpha:
for five in alpha:
word = one + two + three + four + five
for letter in word:
if word.count(letter) == 3:
list1.append(word)
break
print(len(list1))

Beloveditertoolstoo(基于Prune的方法(:

from itertools import product, chain
def grow(n):
'''takes word of three letters and grows another three words'''
#example: abc -> [aaabc, abbbc, abccc]
branch1 = n[0]*3+n[1]+n[2]
branch2 = n[0]+n[1]*3+n[2]
branch3 = n[0]+n[1]+n[2]*3
return branch1, branch2, branch3
triple_combinations_generator = product(alpha, repeat=3)
triple_combinations = map(lambda x: x[0]+x[1]+x[2], triple_combinations_generator)
grown_triple_combinations= map(lambda x: grow(x), triple_combinations)
merged_triples = chain(*grown_triple_combinations)
result = set(merged_triples)
print(result)

我有一个建议,只有一个 for 循环,这在我的 PC 上相当快。 首先,您需要 5 个集合(数学集合(的笛卡尔积来创建具有 3 个重复字符的所有 5 个字母单词。 之后,您可以生成这 5 个字母单词的所有排列以生成新单词。 要清理重复的单词,您可以将上一步的输出映射到设置(删除重复的项目(。

from itertools import product, permutations
import string
letters = list(string.ascii_lowercase)
result = []
for repeated in string.ascii_lowercase:
letters.remove(repeated)
five_letter_repeated_at_end = product(letters, letters, [repeated], [repeated], [repeated])
five_letter_words = map(lambda word: set(permutations(word)), five_letter_repeated_at_end)
current_letter_words = []
for permutes in five_letter_words:
permutes = list(map(''.join, permutes))
current_letter_words.extend(list(permutes))
result.extend(current_letter_words)
letters.append(repeated)
print(result)

您可以使用itertools.combinations从 5 个索引中挑选 2 个插入非重复字母表,并使用itertools.permutationsalpha中挑选 3 个字母进行排列,其中第一个字母表要重复 3 次,并使用itertools.product将组合和排列组合成一个序列。

请注意,由于您已将问题编辑为不允许导入itertools,因此您只需将等效代码(如文档中所示(内联复制并粘贴即可避免导入。

from itertools import combinations, permutations, product
output = []
for insert_at, (repeated, *inserted) in product(
combinations(range(5), 2), permutations(alpha, r=3)):
output.append(''.join(inserted.pop() if i in insert_at else repeated for i in range(5)))

这样给定alpha = 'abcd'output将变为:

['cbaaa',
'dbaaa',
'bcaaa',
'dcaaa',
'bdaaa',
'cdaaa',
'cabbb',
'dabbb',
'acbbb',
'dcbbb',
'adbbb',
'cdbbb',
'baccc',
'daccc',
'abccc',
'dbccc',
'adccc',
'bdccc',
'baddd',
'caddd',
'abddd',
'cbddd',
'acddd',
'bcddd',
'cabaa',
'dabaa',
'bacaa',
'dacaa',
'badaa',
'cadaa',
'cbabb',
'dbabb',
'abcbb',
'dbcbb',
'abdbb',
'cbdbb',
'bcacc',
'dcacc',
'acbcc',
'dcbcc',
'acdcc',
'bcdcc',
'bdadd',
'cdadd',
'adbdd',
'cdbdd',
'adcdd',
'bdcdd',
'caaba',
'daaba',
'baaca',
'daaca',
'baada',
'caada',
'cbbab',
'dbbab',
'abbcb',
'dbbcb',
'abbdb',
'cbbdb',
'bccac',
'dccac',
'accbc',
'dccbc',
'accdc',
'bccdc',
'bddad',
'cddad',
'addbd',
'cddbd',
'addcd',
'bddcd',
'caaab',
'daaab',
'baaac',
'daaac',
'baaad',
'caaad',
'cbbba',
'dbbba',
'abbbc',
'dbbbc',
'abbbd',
'cbbbd',
'bccca',
'dccca',
'acccb',
'dcccb',
'acccd',
'bcccd',
'bddda',
'cddda',
'adddb',
'cdddb',
'adddc',
'bdddc',
'acbaa',
'adbaa',
'abcaa',
'adcaa',
'abdaa',
'acdaa',
'bcabb',
'bdabb',
'bacbb',
'bdcbb',
'badbb',
'bcdbb',
'cbacc',
'cdacc',
'cabcc',
'cdbcc',
'cadcc',
'cbdcc',
'dbadd',
'dcadd',
'dabdd',
'dcbdd',
'dacdd',
'dbcdd',
'acaba',
'adaba',
'abaca',
'adaca',
'abada',
'acada',
'bcbab',
'bdbab',
'babcb',
'bdbcb',
'babdb',
'bcbdb',
'cbcac',
'cdcac',
'cacbc',
'cdcbc',
'cacdc',
'cbcdc',
'dbdad',
'dcdad',
'dadbd',
'dcdbd',
'dadcd',
'dbdcd',
'acaab',
'adaab',
'abaac',
'adaac',
'abaad',
'acaad',
'bcbba',
'bdbba',
'babbc',
'bdbbc',
'babbd',
'bcbbd',
'cbcca',
'cdcca',
'caccb',
'cdccb',
'caccd',
'cbccd',
'dbdda',
'dcdda',
'daddb',
'dcddb',
'daddc',
'dbddc',
'aacba',
'aadba',
'aabca',
'aadca',
'aabda',
'aacda',
'bbcab',
'bbdab',
'bbacb',
'bbdcb',
'bbadb',
'bbcdb',
'ccbac',
'ccdac',
'ccabc',
'ccdbc',
'ccadc',
'ccbdc',
'ddbad',
'ddcad',
'ddabd',
'ddcbd',
'ddacd',
'ddbcd',
'aacab',
'aadab',
'aabac',
'aadac',
'aabad',
'aacad',
'bbcba',
'bbdba',
'bbabc',
'bbdbc',
'bbabd',
'bbcbd',
'ccbca',
'ccdca',
'ccacb',
'ccdcb',
'ccacd',
'ccbcd',
'ddbda',
'ddcda',
'ddadb',
'ddcdb',
'ddadc',
'ddbdc',
'aaacb',
'aaadb',
'aaabc',
'aaadc',
'aaabd',
'aaacd',
'bbbca',
'bbbda',
'bbbac',
'bbbdc',
'bbbad',
'bbbcd',
'cccba',
'cccda',
'cccab',
'cccdb',
'cccad',
'cccbd',
'dddba',
'dddca',
'dddab',
'dddcb',
'dddac',
'dddbc']

这是只有 4 行的尝试:

# Start with set of all letters
s = {'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z'}
# Get all combinations of 5 letters
allcombinations = [(a,b,c,d,e) for a in s for b in s for c in s for d in s for e in s]
# Transform to list
biglist = list(allcombinations)
# Filter for only those that have 3 or more repeats and flatten
result = [a for x in s for a in list(filter(lambda y:y.count(x)==3, biglist))]

这是一个时序测试(为了避免永远等待,我在时序测试中将循环数减少到 3 个,将匹配数减少到 2(:

# My version
def v1():
s = {'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z'}
allcombinations = [(a,b,c) for a in s for b in s for c in s]
biglist = list(allcombinations)
result = [a for x in s for a in list(filter(lambda y:y.count(x)==2, biglist))]
return result
# OP's version
def v2():
# generate all possible 5 letter combinations without repeats
alpha = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v',
'w', 'x', 'y', 'z'}
list1 = []
for one in alpha:
for two in alpha:
for three in alpha:
key = str(one + two + three)
if (key in list1) == False:
list1.append(key)
# filter down to ones with three repeated letters:
list2 = []
for word in list1:
for letter in word:
if word.count(letter) == 2:
list2.append(word)
break
return list2
import time
start_time = time.time()
res1 = v1()
print(f'My version: {len(res1)} entries')
print(f'--- {time.time() - start_time} seconds ---')
start_time = time.time()
res2 = v2()
print(f'OP's version: {len(res2)} entries')
print(f'--- {time.time() - start_time} seconds ---')

哪些输出:

My version: 1950 entries
--- 0.06479763984680176 seconds ---
OP's version: 1950 entries
--- 1.5608229637145996 seconds ---

最新更新