spaCy Matcher: on_match is called twice

My problem is that I took this code from the spaCy documentation:

import spacy
from spacy.matcher import Matcher

def on_match(matcher, doc, id, matches):
    # Called by the matcher once for every match it finds
    print("Matched!", matches)

nlp = spacy.load("en_core_web_sm")
matcher = Matcher(nlp.vocab)
patterns = [
    [{"LOWER": "hello"}, {"LOWER": "world"}],
    [{"ORTH": "Google"}, {"ORTH": "Maps"}],
]
matcher.add("TEST_PATTERNS", patterns, on_match=on_match)
doc = nlp("HELLO WORLD on Google Maps.")
matches = matcher(doc)

How can I combine these patterns so that they match only something like "Hello world ... Google Maps"? Thanks a lot.

If you add

for match_id, start, end in matches:
    print(doc[start:end].text)

the output will be

HELLO WORLD
Google Maps

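So you have two matches, and on_match is called once for each of them. They occur because:

[{"LOWER": "hello"}, {"LOWER": "world"}] is a pattern that matches two consecutive tokens whose lowercase forms are hello and world (so it finds HELLO WORLD)
[{"ORTH": "Google"}, {"ORTH": "Maps"}] is a pattern that matches two consecutive tokens whose text is exactly Google and Maps (so it finds Google Maps)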
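
To get a single match instead of two, one option is to merge both token sequences into one pattern and put a wildcard token with the "*" operator between them. The following is only a minimal sketch under a few assumptions: it uses spacy.blank("en") instead of en_core_web_sm (LOWER and ORTH only need the tokenizer), and the names HELLO_WORLD_GOOGLE_MAPS and find_combined are made up for illustration.

```python
import spacy
from spacy.matcher import Matcher

# Assumption: a blank English pipeline is enough here, because LOWER and ORTH
# depend only on the tokenizer (no trained model is required).
nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)

# One pattern: "hello world", then any number of tokens, then "Google Maps".
combined = [
    {"LOWER": "hello"},
    {"LOWER": "world"},
    {"OP": "*"},  # wildcard token, repeated zero or more times
    {"ORTH": "Google"},
    {"ORTH": "Maps"},
]
# Hypothetical rule name, chosen only for this example.
matcher.add("HELLO_WORLD_GOOGLE_MAPS", [combined])


def find_combined(text):
    """Return the text of every span matched by the combined pattern."""
    doc = nlp(text)
    return [doc[start:end].text for _, start, end in matcher(doc)]


print(find_combined("HELLO WORLD on Google Maps."))
```

With this pattern, a text that contains only one of the two phrases should produce no match at all. Note that the wildcard spans any tokens in between, so a text with several occurrences of "Google Maps" may yield several overlapping spans; passing greedy="LONGEST" to matcher.add is one way to keep only the longest one.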
