How can I use a lambda in place of an inner for loop?

I have a list_a and a string_tmp like this:

list_a = ['AA', 'BB', 'CC']
string_tmp = 'Hi AA How Are You'

I want to know whether any item of list_a occurs in string_tmp. If one does, type = 'L1', else type = 'L2'.

# for example
type = ''
for k in string_tmp.split():
    if k in list_a:
        type = 'L1'
if len(type) == 0:
    type = 'L2'

That is the actual problem, but in my project len(list_a) = 200,000 and len(string_tmp) = 10,000, so I need this to be very fast.

# this is the output of the example 
type = 'L1'

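Converting the reference list and the string's tokens to sets should improve performance, because set membership tests take constant time on average. Something like this: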

list_a = ['AA', 'BB', 'CC']
string_tmp = 'Hi AA How Are You'
def get_type(s, r): # s is the string, r is the reference list
    s = set(s.split())
    r = set(r)
    return 'L1' if any(map(lambda x: x in r, s)) else 'L2'
print(get_type(string_tmp, list_a))

Output:

L1
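
As a possible variation on the set approach, the membership test can also be written without a lambda by using the built-in set.isdisjoint method. This is only a sketch: the function name get_type_disjoint and the variable reference are made up for illustration, and it assumes the reference list is converted to a set once by the caller, so that the 200,000-item conversion is not repeated for every string.

```python
def get_type_disjoint(s, r):
    # r is assumed to be a set that the caller built once and reuses.
    # isdisjoint accepts any iterable and returns True when nothing is shared.
    return 'L2' if r.isdisjoint(s.split()) else 'L1'

reference = set(['AA', 'BB', 'CC'])  # hypothetical name, built once
print(get_type_disjoint('Hi AA How Are You', reference))  # L1
print(get_type_disjoint('Hi DD How Are You', reference))  # L2
```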

Using a regular expression and a list comprehension, we can try:

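import re
list_a = ['AA', 'BB', 'CC']
string_tmp = 'Hi AA How Are You'
# \b is a word boundary; re.escape guards against regex metacharacters in the items
output = ['L1' if re.search(r'\b' + re.escape(x) + r'\b', string_tmp) else 'L2' for x in list_a]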
print(output)  # ['L1', 'L2', 'L2']
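
Note that the list comprehension yields one result per item of list_a, whereas the question asks for a single type. One way to collapse it, sketched here with a generator expression inside any() (the variable name result is chosen only for illustration), might look like this:

```python
import re

list_a = ['AA', 'BB', 'CC']
string_tmp = 'Hi AA How Are You'

# any() stops at the first item that matches, so later items are not searched.
result = 'L1' if any(
    re.search(r'\b' + re.escape(x) + r'\b', string_tmp) for x in list_a
) else 'L2'
print(result)  # L1
```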

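Efficiency depends on which of the two inputs changes least. For example, if list_a stays the same but you have different strings to test, it may be worth converting the list into a regular expression once and then reusing it for the different strings.

Here is a solution in which you create an instance of a class for a given list, and then reuse that instance for different strings: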

import re
class Matcher:
    def __init__(self, lst):
        # Compile one alternation of all keys, anchored at word boundaries (\b)
        self.regex = re.compile(r"\b(" + "|".join(re.escape(key) for key in lst) + r")\b")
    def typeof(self, s):
        return "L1" if self.regex.search(s) else "L2"
# demo
list_a = ['AA', 'BB', 'CC']
matcher = Matcher(list_a)
string_tmp = 'Hi AA How Are You'
print(matcher.typeof(string_tmp))  # L1
string_tmp = 'Hi DD How Are You'
print(matcher.typeof(string_tmp))  # L2

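A side effect of this regular expression is that it also matches when a word has punctuation next to it. For example, the code above would still return "L1" when the string is 'Hi AA, How Are You' (with a comma added).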
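
To make the difference concrete, here is a small sketch comparing a word-boundary regular expression with the whitespace-split approach on the comma example. The names words, pattern, regex_hit and split_hit are illustrative only.

```python
import re

words = ['AA', 'BB', 'CC']
pattern = re.compile(r"\b(" + "|".join(re.escape(w) for w in words) + r")\b")
s = 'Hi AA, How Are You'

# The regex sees a word boundary between 'AA' and ',', so it matches.
regex_hit = bool(pattern.search(s))

# str.split() produces the token 'AA,' (comma attached), which is not in the set.
word_set = set(words)
split_hit = any(tok in word_set for tok in s.split())

print(regex_hit, split_hit)  # True False
```

Which behaviour is preferable depends on whether punctuation-adjacent words should count as matches in your data.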
