How can I split this Python string?



I need to split a Python string into a list of strings, in the format shown in the examples below.

Example

Given a Python string that looks like this:

[1] Input:

'they <are,are not> sleeping'

It must become a list that looks like this:

[1] Output:

['they are sleeping', 'they are not sleeping'] 

Another example

[2] Input:

'hello <stupid,smart> <people,persons,animals>'

[2] Output:

['hello stupid people', 'hello stupid persons', 'hello stupid animals', 'hello smart people', 'hello smart persons', 'hello smart animals'] 

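The alternatives are enclosed in angle-bracket markers and separated by commas, and a new string must be generated for every possible combination of them.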

Try this:

import itertools

def all_choices(text):
    """
    >>> list(all_choices('they <are,are not> sleeping'))
    ['they are sleeping', 'they are not sleeping']
    >>> list(all_choices('hello <stupid,smart> <people,persons,animals>'))
    ['hello stupid people', 'hello stupid persons', 'hello stupid animals', 'hello smart people', 'hello smart persons', 'hello smart animals']
    """
    # For well-formed input, splitting on '<' and then on '>' yields
    # alternating literal and option tokens, starting with a literal.
    tokens = (block2 for block1 in text.split('<')
              for block2 in block1.split('>'))
    decisions = []
    literals = []
    try:
        while True:
            literal = next(tokens)
            literals.append(literal)
            options = next(tokens).split(',')
            decisions.append(options)
    except StopIteration:
        pass
    # The last literal has no option group after it, so pad with an empty choice.
    decisions.append(('',))
    for choices in itertools.product(*decisions):
        yield ''.join(x for pair in zip(literals, choices)
                      for x in pair)
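
To make the last step easier to follow, here is a minimal sketch of what `itertools.product` and `zip` do with the intermediate lists. The `literals` and `decisions` values are written out by hand for the second example string; they are what the splitting loop would be expected to produce, not output captured from the answer's code.

```python
import itertools

# Hand-written intermediate state for
# 'hello <stupid,smart> <people,persons,animals>'
literals = ['hello ', ' ', '']
decisions = [['stupid', 'smart'], ['people', 'persons', 'animals'], ('',)]

results = []
# product() picks one option from each group, in every possible combination.
for choices in itertools.product(*decisions):
    # zip() pairs each literal with the choice that follows it;
    # flattening the pairs interleaves them back into one string.
    results.append(''.join(x for pair in zip(literals, choices) for x in pair))

print(results)
```

The rightmost group varies fastest, which is why the `stupid` variants are all listed before the `smart` ones.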

I borrowed Dennis's doctests, but here is a formulation that uses re.split():

import itertools
import re

def all_choices(text):
    """
    >>> list(all_choices('they <are,are not> sleeping'))
    ['they are sleeping', 'they are not sleeping']
    >>> list(all_choices('hello <stupid,smart> <people,persons,animals>'))
    ['hello stupid people', 'hello stupid persons', 'hello stupid animals', 'hello smart people', 'hello smart persons', 'hello smart animals']
    """
    # Split the text into choice and non-choice bits.
    # Wrapping the pattern in parentheses makes re.split also return the matched
    # delimiters (the choice bits) alongside the literal parts.
    # Since there is only 1 group, the odd indices in bits will be the choice bits...
    bits = re.split(r"(<.+?>)", text)
    # ... but we can just as well peek into the bits themselves to find that out.
    # This turns each bit into an iterable: a 1-tuple for literal parts, a list for choices.
    bits = [
        bit.strip("<>").split(",")
        if bit.startswith("<") and bit.endswith(">")
        else (bit,)
        for bit in bits
    ]
    # itertools.product generates all combinations...
    for choices in itertools.product(*bits):
        # ... which we can join into a string and yield.
        yield "".join(choices)
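
The effect of the capturing group is easy to check in isolation. This small sketch runs `re.split` on the first example string with and without the parentheses; the variable names are only for illustration.

```python
import re

text = 'they <are,are not> sleeping'

# Without a capturing group, the matched markers are dropped.
without_group = re.split(r'<.+?>', text)

# With a capturing group, the matched markers are kept in the result,
# landing at the odd indices between the literal parts.
with_group = re.split(r'(<.+?>)', text)

print(without_group)  # ['they ', ' sleeping']
print(with_group)     # ['they ', '<are,are not>', ' sleeping']
```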

You can use a regex:

import re

s = 'hello <stupid,smart> <people,persons,animals>'

def all_choices(my_string):
    str_list = [my_string]
    # One pass per <...> group; each pass expands the first remaining group.
    for _ in range(len(re.findall(r'<(.*?)>', my_string))):
        tmp_list = []
        for x in str_list:
            mask = re.findall(r'<(.*?)>', x)[0]
            for option in mask.split(','):
                # Replace only the first occurrence, as literal text, so that
                # regex metacharacters in the options and repeated groups are safe.
                tmp_list.append(x.replace(f'<{mask}>', option, 1))
        str_list = tmp_list
    return str_list

print(all_choices(s))
# ['hello stupid people', 'hello stupid persons', 'hello stupid animals', 'hello smart people', 'hello smart persons', 'hello smart animals']
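
One caveat for any approach that reuses the text inside a marker as a regex pattern: if the options contain regex metacharacters, the pattern may no longer match its own literal text. The sketch below shows this with a made-up option group containing `?`; the input string is hypothetical and not taken from the question.

```python
import re

# Hypothetical input whose options contain the regex metacharacter '?'.
mask = 'why?,how?'
text = f'answer <{mask}> now'

# Used raw, 'y?' and 'w?' mean "optional y" and "optional w",
# so the pattern does not match the literal question marks.
raw_hits = re.findall(f'<({mask})>', text)

# re.escape makes every character in the mask match literally.
escaped_hits = re.findall(f'<({re.escape(mask)})>', text)

print(raw_hits)      # []
print(escaped_hits)  # ['why?,how?']
```

If the options are known to be plain words, as in the question's examples, this does not come up.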
