Loop through two lists and check whether elements of the first list occur in the second



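I have two lists. One is a list of languages, the second is a list of strings. For each string in the text list, I want to check whether it contains any of the languages and append the language found to a new list; otherwise, append "english" to that new list.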

languages = ['afrikaans', 'russian', 'amharic', 'japanese', 'armenian', 'polish', ...]
texts = ['apple', 'orange in polish', 'grape in russian']

Desired output:

['english', 'polish', 'russian']

My first attempt was the following, but it returns ['polish', 'russian']:

list_of_valid_langs = []
for lang in langs:
    for text in texts:
        if lang in text:
            list_of_valid_langs.append(lang)

For my second attempt I added a second condition, but this is not what I need either:

list_of_valid_langs = []
for lang in langs:
    for text in texts:
        if lang in text:
            list_of_valid_langs.append(lang)
        elif lang not in text:
            list_of_valid_langs.append('english')

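I think your mistake is iterating over the languages first and then over the texts. Let's try flipping that:

# langs is the list of languages from the question
list_of_valid_langs = []
for text in texts:
    for lang in langs:
        if lang in text:
            list_of_valid_langs.append(lang)
            break  # lang is found, no need to keep searching
    else:  # if no lang was found, append 'english'
        list_of_valid_langs.append('english')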
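
The same "first match, otherwise a default" logic can also be written with `next()` and a generator expression. This is only a sketch of an alternative, not part of the answer above; the helper name `first_language` is hypothetical, and the language list is shortened to the six names shown in the question.

```python
def first_language(text, langs, default='english'):
    """Return the first entry of langs that occurs in text, else default."""
    # next() takes the first item the generator yields; the second
    # argument is returned when the generator yields nothing (no match).
    return next((lang for lang in langs if lang in text), default)


langs = ['afrikaans', 'russian', 'amharic', 'japanese', 'armenian', 'polish']
texts = ['apple', 'orange in polish', 'grape in russian']

list_of_valid_langs = [first_language(text, langs) for text in texts]
print(list_of_valid_langs)  # ['english', 'polish', 'russian']
```

Like the nested loop, this stops scanning the languages as soon as one matches, so each text contributes exactly one entry.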

After seeing @fsimonjetz's answer, I came up with a better solution that uses a set:

# first of all, turn langs into a set
langs = set(langs)
list_of_valid_langs = []
# iterate over the texts
for text in texts:
    # check if one of the words in the text is a language
    for word in text.split():
        if word in langs:
            # if a language is found, append it and break
            list_of_valid_langs.append(word)
            break
    else:
        # if no language is found, append 'english'
        list_of_valid_langs.append('english')

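A note on for/else: the body of the for loop runs as usual, but the code in the else block runs only when the loop finishes normally. Put another way, the else block runs only if no break statement was reached.
If you prefer, you can replace the for/else with an ordinary for loop that sets a bool variable, followed by an if block.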
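
To make the for/else behaviour concrete, here is a small standalone sketch. The function name `find_first_even` and the sample numbers are made up for illustration and have nothing to do with the language lists.

```python
def find_first_even(numbers):
    """Illustrate for/else: report whether the loop ended via break."""
    for n in numbers:
        if n % 2 == 0:
            result = f'found {n}'
            break  # leaving via break skips the else block
    else:
        # reached only when the loop ran to completion without break
        result = 'no even number'
    return result


print(find_first_even([1, 3, 4, 5]))  # found 4
print(find_first_even([1, 3, 5]))     # no even number
print(find_first_even([]))            # no even number
```

Note the last call: a loop over an empty sequence never hits break, so its else block still runs.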

This should work:

list_of_valid_langs = []
for text in texts:
    lang_found = False
    for lang in langs:
        if lang in text:
            list_of_valid_langs.append(lang)
            lang_found = True
            break  # stop at the first match so each text adds one entry

    if not lang_found:
        list_of_valid_langs.append('english')

I think Roy Cohen's answer is a perfect solution to your problem, but I'd like to suggest a more efficient alternative using set intersection:

languages = set(['afrikaans', 'russian', 'amharic', 'japanese', 'armenian', 'polish'])
texts = ['apple', 'orange in polish', 'grape in russian']
list_of_valid_langs = []
for t in texts:
    # this will return the set of the language(s) occurring in the
    # string if there are any, otherwise it returns {'english'}
    lang = set(t.split()).intersection(languages) or {'english'}
    # pop the element from the set and append to the list
    list_of_valid_langs.append(lang.pop())
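
One difference between the approaches on this page is worth keeping in mind: `lang in text` is a substring test, while `text.split()` combined with a set compares whole words. Also, `set.pop()` removes an arbitrary element, so when a text names more than one language the result is not guaranteed to be any particular one. A small sketch of both points; the example strings are made up for illustration.

```python
languages = {'afrikaans', 'russian', 'amharic', 'japanese', 'armenian', 'polish'}

text = 'nail polisher'  # made-up example text

# Substring test, as in the nested-loop answers: 'polish' is found
# inside the word 'polisher'.
substring_hits = [lang for lang in languages if lang in text]
print(substring_hits)  # ['polish']

# Whole-word test, as in the set-based answers: no word equals a language.
word_hits = set(text.split()) & languages
print(word_hits)  # set()

# With several matches, sorting before picking makes the result repeatable.
multi = set('russian and polish'.split()) & languages
print(sorted(multi)[0])  # polish
```

Which behaviour is right depends on the data: whole-word matching avoids false hits such as 'polisher', but it will miss a language name that has punctuation attached or different capitalization.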
