How to repeatedly extract sublists from a list, delimited by a start keyword and an end keyword



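I have a list, for example ['tree', 'water', 'dog', 'soap', 'cat', 'bird', 'tree', 'dog', 'water']. I want a function that takes any list together with a start word and an end word, and gives me the sublists that run from the start word to the end word. It should search the whole list for the start and end words. So if my start word is 'tree' and my end word is 'water', it should give me two lists: ['tree', 'water'] and ['tree', 'dog', 'water']. If the word 'tree' or 'water' is not in the list, the function should skip it.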

I tried this:

def sublist(words, start, end):
    # list.index only finds the first occurrence of each word
    start_index = words.index(start)
    end_index = words.index(end)
    return words[start_index:end_index + 1]

words = ['tree', 'water', 'dog', 'soap', 'cat', 'bird', 'tree', 'dog', 'water']
start = 'tree'
end = 'water'
print(sublist(words, start, end))

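But this only returns one list, ['tree', 'water']. I want both ['tree', 'water'] and ['tree', 'dog', 'water'], and I don't know how to continue after finding the first segment. It also raises an error if the start or end word is not in the list.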

You can keep calling index until it fails:

def sublists(words, start, end):
    end_index = -1
    try:
        while True:
            start_index = words.index(start, end_index + 1)
            end_index = words.index(end, start_index + 1)
            yield words[start_index : end_index+1]
    except ValueError:
        pass
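
As a quick check of the index-based generator, the sketch below runs it on the list from the question and on a list that lacks the end word. The function is repeated so the snippet runs on its own; the `no_end` list is a made-up input for illustration.

```python
def sublists(words, start, end):
    end_index = -1
    try:
        while True:
            # Look for the next start word after the previous match.
            start_index = words.index(start, end_index + 1)
            # Look for the next end word after that start word.
            end_index = words.index(end, start_index + 1)
            yield words[start_index:end_index + 1]
    except ValueError:
        # list.index raises ValueError when nothing is found: stop quietly.
        pass

words = ['tree', 'water', 'dog', 'soap', 'cat', 'bird', 'tree', 'dog', 'water']
found = list(sublists(words, 'tree', 'water'))
print(found)  # [['tree', 'water'], ['tree', 'dog', 'water']]

no_end = ['tree', 'dog', 'cat']  # hypothetical input without the end word
missing = list(sublists(no_end, 'tree', 'water'))
print(missing)  # []
```

Since it is a generator, wrap the call in list() when you need all sublists at once.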

Or a regex approach, which first converts the list into a string such as 'se----s-e':

import re

def sublists(words, start, end):
    s = ''.join('s' if w == start else 'e' if w == end else '-' for w in words)
    for match in re.finditer('s-*e', s):
        yield words[slice(*match.span())]
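
To make the encoding step visible, here is a small sketch that prints the intermediate string and the match spans for the list from the question. The variable names `encoded` and `spans` are only for illustration.

```python
import re

words = ['tree', 'water', 'dog', 'soap', 'cat', 'bird', 'tree', 'dog', 'water']
start, end = 'tree', 'water'

# One character per word: 's' for start, 'e' for end, '-' for anything else.
encoded = ''.join('s' if w == start else 'e' if w == end else '-' for w in words)
print(encoded)  # se----s-e

# Each span is a (begin, stop) pair of string positions. They double as list
# positions because every word maps to exactly one character.
spans = [m.span() for m in re.finditer('s-*e', encoded)]
print(spans)  # [(0, 2), (6, 9)]
```

One difference worth knowing: '-*' cannot match another 's', so if the start word occurs twice before an end word, the regex version begins at the last start word, while the index version begins at the first one.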

Or a slightly abusive way:

def sublists(words, start, end):
    it = iter(words)
    while start in it:
        sub = [start]
        if end in (sub.append(word) or word for word in it):
            yield sub
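
This version relies on the fact that a membership test on an iterator consumes items up to and including the first match. A minimal sketch of just that behavior, using a made-up list:

```python
it = iter(['a', 'b', 'c', 'd'])

# `in` advances the iterator until it finds 'b', consuming 'a' and 'b'.
found = 'b' in it
rest = list(it)
print(found, rest)  # True ['c', 'd']

# A failed search exhausts the iterator completely.
it2 = iter(['a', 'b'])
found2 = 'z' in it2
rest2 = list(it2)
print(found2, rest2)  # False []
```

That is why `while start in it` resumes the search right after the previous sublist, and why the loop ends once the words run out.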

I'm not sure what to think of this version:

from itertools import filterfalse

def sublists(words, start, end):
    it = iter(words)
    while start in it:
        sub = [start]
        if end in filterfalse(sub.append, it):
            yield sub
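
The trick here is that list.append always returns None, which is falsy, so filterfalse lets every item through while append records it as a side effect. A short sketch with made-up data:

```python
from itertools import filterfalse

seen = []
# seen.append(item) returns None for every item, so nothing is filtered out,
# but each item is appended to `seen` as it passes through.
passed = list(filterfalse(seen.append, ['x', 'y', 'z']))

print(passed)  # ['x', 'y', 'z']
print(seen)    # ['x', 'y', 'z']
```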

A non-abusive version of it:

def sublists(words, start, end):
    it = iter(words)
    while start in it:
        sub = [start]
        for word in it:
            sub.append(word)
            if word == end:
                yield sub
                break

Maybe this can handle it:

def sublist(words, start, end):
    sublists = []                           # prepare sublists for return
    sublist = []                            # initialize sublist
    for word in words:                      # for each word
        if len(sublist):                    # sublist already has word(s)
            sublist.append(word)            # add word to sublist
            if word == end:                 # the list met its end
                sublists.append(sublist)    # store it to sublists
                sublist = []                # empty it
        elif word == start:                 # met start word
            sublist.append(word)            # add 1st word to sublist
    return sublists                         # return output
