I have a string like this that I need to extract items from:
Tom and Jerry, Batman and Joker, Homer and Marge
The list goes on like this. Here is the final output I want (saved as CSV or in some other format):
|Tom|
|Jerry|
|Tom and Jerry|
|Batman|
|Joker|
|Batman and Joker|
|Homer|
|Marge|
|Homer and Marge|
I know I can use .split(",") to get Tom and Jerry, and then .split("and") to separate that further into Tom and Jerry.
But how can I keep all three records?
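str.split returns a list instance, and a list instance has no split method, so the two calls cannot be chained directly. Store the result of each call in its own variable and work on each element in turn.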
text = "Tom and Jerry, Batman and Joker, Homer and Marge"
result = list()
for text_and in text.split(', '):
    if ' and ' in text_and:  # Only split further when the record contains ' and '
        for text_name in text_and.split(' and '):
            print(f"|{text_name}|")
            result.append(text_name)
    # Always keep the full record as well
    print(f"|{text_and}|")
    result.append(text_and)
|Tom|
|Jerry|
|Tom and Jerry|
|Batman|
|Joker|
|Batman and Joker|
|Homer|
|Marge|
|Homer and Marge|
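Since the question mentions saving the output as CSV, here is a minimal sketch of one way that could be done with the standard csv module. The helper names expand_pairs and write_rows are hypothetical, and the in-memory buffer is only a stand-in for a real file; it assumes one value per row is the layout you want.

```python
import csv
import io

def expand_pairs(text):
    """Return each individual name followed by the full record, in input order."""
    rows = []
    for pair in text.split(', '):
        if ' and ' in pair:
            rows.extend(pair.split(' and '))
        rows.append(pair)
    return rows

def write_rows(rows, handle):
    """Write one value per CSV row to an already opened file-like object."""
    writer = csv.writer(handle)
    for row in rows:
        writer.writerow([row])

text = "Tom and Jerry, Batman and Joker, Homer and Marge"
buffer = io.StringIO()  # stand-in for open("names.csv", "w", newline="")
write_rows(expand_pairs(text), buffer)
print(buffer.getvalue())
```

To write to disk instead, pass a handle opened with newline="" so the csv module controls the line endings.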
Here is a one-liner using the itertools.chain function.
from itertools import chain
result = list(chain(*[[*text_and.split(' and '), text_and] if ' and ' in text_and else [text_and] for text_and in text.split(', ')]))
# result
['Tom', 'Jerry', 'Tom and Jerry', 'Batman', 'Joker', 'Batman and Joker', 'Homer', 'Marge', 'Homer and Marge']
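If the one-liner is hard to read, the same idea can be sketched with chain.from_iterable and a small helper. This is only an alternative arrangement of the code above, not a different algorithm; the helper name parts is hypothetical.

```python
from itertools import chain

text = "Tom and Jerry, Batman and Joker, Homer and Marge"

def parts(pair):
    """Return the individual names plus the full record, or just the record."""
    if ' and ' in pair:
        return [*pair.split(' and '), pair]
    return [pair]

# from_iterable flattens the per-record lists without unpacking them with *
result = list(chain.from_iterable(parts(pair) for pair in text.split(', ')))
print(result)
```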