How do I set a word limit for the sentences of a paragraph in Python?



I need to enforce a limit while appending to a list.

sent = 'Python is dynamically-typed and garbage-collected. It supports multiple programming paradigms, including structured (particularly procedural), object-oriented and functional programming.'

I want at most 5 words in each piece, and each piece should be appended to a list.

The output should be:

sent_list = ['Python is dynamically-typed and garbage-collected.', 'It supports multiple programming paradigms,', 'including structured (particularly procedural), object-oriented', 'and functional programming.']

Try this:

words = sent.split(' ')
sent_list = [' '.join(words[n:n+5]) for n in range(0, len(words), 5)]
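The slicing approach above can be wrapped in a small reusable helper. This is only a minimal sketch: the name chunk_words and the size parameter are hypothetical and do not come from the answer. It calls split() without an argument, so runs of whitespace do not produce empty "words".

```python
def chunk_words(text, size=5):
    """Split text into chunks of at most `size` whitespace-separated words."""
    if size < 1:
        raise ValueError("size must be at least 1")
    words = text.split()  # no argument: collapses repeated whitespace
    # Step through the word list `size` items at a time and rejoin each slice.
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


sent = ('Python is dynamically-typed and garbage-collected. It supports multiple '
        'programming paradigms, including structured (particularly procedural), '
        'object-oriented and functional programming.')
sent_list = chunk_words(sent)
print(sent_list)
```

The last chunk is simply shorter when the word count is not a multiple of size, because slicing past the end of a list is safe in Python.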

Perhaps a bit unorthodox:

import re

sent_list = [re.sub(r'\s$', '', x.group('pattern')) for x in
             re.finditer(r'(?P<pattern>([^\s]+\s){5}|.+$)', sent)]

Output:

['Python is dynamically-typed and garbage-collected.',
'It supports multiple programming paradigms,',
'including structured (particularly procedural), object-oriented',
'and functional programming.']

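Explanation of '(?P<pattern>([^\s]+\s){5}|.+$)':

  • (?P<pattern> ... ): cosmetic; it creates a named capture group.
  • ([^\s]+\s){5}: matches a run of one or more non-whitespace characters followed by a single whitespace character, repeated 5 times.
  • |.+$: once the first alternative can no longer match, take whatever remains up to the end of the string.

We use re.finditer to loop over all match objects and retrieve each match with x.group('pattern'). Every match except the last one ends with an extra space; one way to remove it is re.sub.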
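As a variation on the regex idea, the trailing-space cleanup can be avoided by matching "one word plus up to four more words" instead of "five words each followed by whitespace". This is a hedged sketch and not the answer's own pattern; the function name chunk_words_regex and the size parameter are hypothetical.

```python
import re


def chunk_words_regex(text, size=5):
    """Return chunks of at most `size` words using a single regex."""
    if size < 1:
        raise ValueError("size must be at least 1")
    # \S+            the first word of the chunk
    # (?:\s+\S+){0,n} up to n further words, each preceded by whitespace
    pattern = r"\S+(?:\s+\S+){0,%d}" % (size - 1)
    # The group is non-capturing, so findall returns the whole matches.
    return re.findall(pattern, text)


sent = ('Python is dynamically-typed and garbage-collected. It supports multiple '
        'programming paradigms, including structured (particularly procedural), '
        'object-oriented and functional programming.')
regex_list = chunk_words_regex(sent)
print(regex_list)
```

Because each match starts and ends on a non-whitespace character, no chunk carries a trailing space. Whitespace inside a chunk is kept exactly as it appears in the input rather than being normalized.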

sent = 'Python is dynamically-typed and garbage-collected. It supports multiple programming paradigms, including structured (particularly procedural), object-oriented and functional programming.'
# Expected result, used for the comparison at the end
sent_list = ['Python is dynamically-typed and garbage-collected.',
             'It supports multiple programming paradigms,',
             'including structured (particularly procedural), object-oriented',
             'and functional programming.']
new_list = []
inner_string = ""
sentence_list = sent.split(" ")
for idx, item in enumerate(sentence_list):
    if (idx+1) == 1 or (idx+1) % 5 != 0:
        if (idx+1) == len(sentence_list):
            # Last word overall: close the final, shorter chunk
            inner_string += item
            new_list.append(inner_string)
        else:
            inner_string += item + " "
    elif (idx+1) != 1 and (idx+1) % 5 == 0:
        # Every fifth word completes a chunk
        inner_string += item
        new_list.append(inner_string)
        inner_string = ""

print(new_list)
print(new_list == sent_list)

Output:

['Python is dynamically-typed and garbage-collected.', 'It supports multiple programming paradigms,', 'including structured (particularly procedural), object-oriented', 'and functional programming.']
True
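The same loop can also be written without index arithmetic by collecting words in a temporary list and flushing it whenever it reaches the limit. This is a sketch under assumptions, not the answer's code: the names chunk_words_loop, size, chunks and current are hypothetical.

```python
def chunk_words_loop(text, size=5):
    """Accumulate words and append a chunk each time `size` words are collected."""
    if size < 1:
        raise ValueError("size must be at least 1")
    chunks = []
    current = []  # words of the chunk being built
    for word in text.split():
        current.append(word)
        if len(current) == size:
            chunks.append(" ".join(current))
            current = []
    if current:
        # Leftover words form the final, shorter chunk.
        chunks.append(" ".join(current))
    return chunks


sent = ('Python is dynamically-typed and garbage-collected. It supports multiple '
        'programming paradigms, including structured (particularly procedural), '
        'object-oriented and functional programming.')
loop_list = chunk_words_loop(sent)
print(loop_list)
```

Joining the buffered words once per chunk avoids the special cases for the first and last word that the string-concatenation version has to handle.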
