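I have a list, for example ['tree', 'water', 'dog', 'soap', 'cat', 'bird', 'tree', 'dog', 'water'].
I want a function that takes any list together with a start word and an end word and gives me the sublists between them. It should search the whole list for the start and end words. So if my start word is 'tree' and my end word is 'water', it should give me two lists: ['tree', 'water'] and ['tree', 'dog', 'water']. If the list does not contain 'tree' or 'water', the function should skip it.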
I tried:

    def sublist(words, start, end):
        start_index = words.index(start)
        end_index = words.index(end)
        sublist = words[start_index:end_index+1]
        return sublist

    words = ['tree', 'water', 'dog', 'soap', 'cat', 'bird', 'tree', 'dog', 'water']
    start = 'tree'
    end = 'water'
    sublist(words, start, end)
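But this only returns a single list, ['tree', 'water']. I want both ['tree', 'water'] and ['tree', 'dog', 'water']. After finding the first segment, I don't know how to continue. It also raises an error if the start or end word is not in the list.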
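You can keep calling index until it fails:

    def sublists(words, start, end):
        end_index = -1
        try:
            while True:
                # Resume each search just past the previous match.
                start_index = words.index(start, end_index + 1)
                end_index = words.index(end, start_index + 1)
                yield words[start_index : end_index+1]
        except ValueError:
            # index() found no further start or end word, so we are done.
            pass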
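For reference, here is a small sketch of how `list.index` behaves with a start offset, which is what the loop above relies on. The variable names (`first`, `second`, `exhausted`) are only for illustration.

```python
words = ['tree', 'water', 'dog', 'soap', 'cat', 'bird', 'tree', 'dog', 'water']

# Without an offset, index() always reports the first occurrence.
first = words.index('tree')

# With an offset, the search starts at that position instead.
second = words.index('tree', 1)

# Once no occurrence is left, index() raises ValueError,
# which is the signal the generator uses to stop.
try:
    words.index('tree', second + 1)
    exhausted = False
except ValueError:
    exhausted = True
```

This is also why a missing start or end word simply produces no sublists instead of an error: the very first `index` call raises, and the `except` swallows it.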
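Or a regex way, first converting the list into a string like 'se----s-e':

    import re

    def sublists(words, start, end):
        # One character per word: 's' for start, 'e' for end, '-' otherwise.
        s = ''.join('s' if w == start else 'e' if w == end else '-' for w in words)
        for match in re.finditer('s-*e', s):
            # Character positions equal list positions, so the span slices the list.
            yield words[slice(*match.span())]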
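To make the encoding step concrete, here is a minimal sketch that only builds the helper string and collects the match spans for the list from the question. The names `encoded`, `spans` and `nested_spans` are made up for this illustration.

```python
import re

words = ['tree', 'water', 'dog', 'soap', 'cat', 'bird', 'tree', 'dog', 'water']
start, end = 'tree', 'water'

# One character per word, so string offsets line up with list indices.
encoded = ''.join('s' if w == start else 'e' if w == end else '-' for w in words)

# Each match covers one start word, any number of other words, and one end word.
spans = [m.span() for m in re.finditer('s-*e', encoded)]

# With a repeated start word ('sse'), the pattern picks the innermost segment.
nested_spans = [m.span() for m in re.finditer('s-*e', 'sse')]
```

Note that because `-*` cannot match an 's', a start word that repeats before the end word gives the shortest segment here, whereas the index-based version would begin at the first start word.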
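Or a somewhat abusive way:

    def sublists(words, start, end):
        it = iter(words)
        # `start in it` consumes the iterator up to and including the next start word.
        while start in it:
            sub = [start]
            # The generator appends every word it passes; `in` stops at the end word.
            if end in (sub.append(word) or word for word in it):
                yield sub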
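The trick relies on `in` consuming an iterator up to and including the first match. A small sketch of that behavior, using made-up sample letters:

```python
it = iter(['a', 'b', 'c', 'd'])
found = 'b' in it       # scans 'a', then 'b', and stops there
rest = list(it)         # only the items after the match remain

it2 = iter(['a', 'b'])
missing = 'z' in it2    # no match, so the whole iterator gets consumed
leftover = list(it2)    # nothing is left
```

So each `start in it` skips ahead to the next start word, and the loop ends once the iterator runs dry.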
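Not sure what to think of this version:

    from itertools import filterfalse

    def sublists(words, start, end):
        it = iter(words)
        while start in it:
            sub = [start]
            # sub.append returns None (falsy), so filterfalse passes every word
            # through after appending it to sub.
            if end in filterfalse(sub.append, it):
                yield sub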
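In case the `filterfalse` part looks like magic: `list.append` always returns `None`, which is falsy, so `filterfalse` lets every item through while the append happens as a side effect. A tiny sketch with throwaway data:

```python
from itertools import filterfalse

sub = []

# filterfalse keeps the items for which the predicate is falsy.
# sub.append(x) returns None, so nothing is ever filtered out.
passed = list(filterfalse(sub.append, [1, 2, 3]))
```

After this runs, `passed` and `sub` hold the same items, which is exactly the "append and pass along" effect the generator expression in the previous version spells out by hand.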
A non-abusive version of it:

    def sublists(words, start, end):
        it = iter(words)
        # Skip ahead to the next start word, if there is one.
        while start in it:
            sub = [start]
            for word in it:
                sub.append(word)
                if word == end:
                    yield sub
                    break
Maybe this can handle it:

    def sublist(words, start, end):
        sublists = []                         # prepare sublists for return
        sublist = []                          # initialize sublist
        for word in words:                    # for each word
            if len(sublist):                  # sublist already has word(s)
                sublist.append(word)          # add word to sublist
                if word == end:               # the list met its end
                    sublists.append(sublist)  # store it to sublists
                    sublist = []              # empty it
            elif word == start:               # met start word
                sublist.append(word)          # add 1st word to sublist
        return sublists                       # return output
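As a quick check of the cases from the question, here is the same loop restated under a hypothetical name, `collect_sublists` (to avoid the function and its local variable sharing a name), and run on the sample list plus a few edge cases.

```python
def collect_sublists(words, start, end):
    """Same logic as the function above, renamed for this illustration."""
    results = []
    current = []
    for word in words:
        if current:                  # inside a segment: keep collecting
            current.append(word)
            if word == end:          # segment complete: store and reset
                results.append(current)
                current = []
        elif word == start:          # outside a segment: wait for a start word
            current.append(word)
    return results                   # an unfinished trailing segment is dropped


words = ['tree', 'water', 'dog', 'soap', 'cat', 'bird', 'tree', 'dog', 'water']

both = collect_sublists(words, 'tree', 'water')
no_end = collect_sublists(words, 'tree', 'fish')      # end word never appears
no_start = collect_sublists(words, 'fish', 'water')   # start word never appears
unfinished = collect_sublists(['tree', 'dog'], 'tree', 'water')
```

On the sample list this gives the two sublists asked for, and a missing start or end word yields an empty list rather than an exception.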