Regex: how to check whether a text contains at least all the letters in a character set

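I have a string (essentially an abbreviation, e.g. USA, all in capital letters) and a list of texts. I want to select the texts that contain all the letters in the string (case-sensitive match). For example,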

string = "USA"
texts = ["United States of America", "United States", "United States of America and Iraq"]
# Result should be:
results = ["United States of America", "United States of America and Iraq"]

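I have already tried (?=U)(?=S)(?=A) (which is what the answer to the duplicate question suggests), but that does not seem to work, apparently because the regex expects the letters to appear in exactly that order. I also do not want to match the lowercase letters and spaces that follow each capital letter, i.e. [?=U]([a-zA-Z]*[s]+)*[?=S]([a-zA-Z]*[s]+)*[?=A][a-zA-Z]*, because that is simply redundant (and does not work perfectly either).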

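What I am looking for is an expression equivalent to [USA], which performs an OR operation and selects texts containing at least one letter of the string. Is there an elegant expression that performs an "AND" operation in the same way?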

You may be looking for all() combined with in:

string = "USA"
texts = ["United States of America", "United States", "United States of America and Iraq", "Germany"]
# For each text, check that every character of string occurs in it
vector = [all(c in text for c in string) for text in texts]

This yields

[True, False, True, False]
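
The same membership test can also be sketched with sets, which may read more naturally when the abbreviation is long. This is only an alternative formulation, not something the question requires; the name `required` is introduced here for illustration.

```python
string = "USA"
texts = ["United States of America", "United States", "United States of America and Iraq", "Germany"]

# The distinct characters that every selected text must contain
required = set(string)

# set.issubset() accepts any iterable, so a text can be passed in directly
vector = [required.issubset(text) for text in texts]
print(vector)
```

Like the all() version, this is case-sensitive and ignores how often each letter occurs.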

So, combined with filter(), you do not need any regular expression at all:

new_text = list(
    filter(
        lambda text: all(c in text for c in string),
        texts
    )
)
print(new_text)

The latter yields

['United States of America', 'United States of America and Iraq']
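
If a regular expression is still preferred, one possible sketch is shown here. The lookaheads in (?=U)(?=S)(?=A) are all tested at the same position, so they require the next character to be U, S and A at once, which can never match. Placing .* inside each lookahead lets every letter appear anywhere in the text, in any order. The names `pattern` and `matches` are illustrative, and re.DOTALL is an assumption made so that texts containing newlines are handled as well.

```python
import re

string = "USA"
texts = ["United States of America", "United States", "United States of America and Iraq", "Germany"]

# Build one lookahead per letter, e.g. (?=.*U)(?=.*S)(?=.*A);
# re.escape() guards against characters with a special regex meaning
pattern = re.compile("".join(f"(?=.*{re.escape(c)})" for c in string), re.DOTALL)

# match() anchors at the start of the text, where all lookaheads are checked
matches = [text for text in texts if pattern.match(text)]
print(matches)
```

For this input the result should be the same two texts as with filter(); the all() version is arguably easier to read, so the regex mainly makes sense when the check has to be embedded in a larger pattern.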
