What I'm trying to do:
- tokenize sentences
- find the named entities in each sentence
Here is what I have done so far:
import spacy

nlp = spacy.load('en')
sentence = "Germany and U.S.A are popular countries. I am going to gym tonight"
sentence = nlp(sentence)
tokenized_sentences = []
for sent in sentence.sents:
    tokenized_sentences.append(sent)

for s in tokenized_sentences:
    labels = [ent.label_ for ent in s.ents]
    entities = [ent.text for ent in s.ents]
Error:
labels = [ent.label_ for ent in s.ents]
AttributeError: 'spacy.tokens.span.Span' object has no attribute 'ents'
Is there any other way to find the named entities of a sentence?
Thanks in advance
Note that you only have two entities here - U.S.A and Germany.
A simple version:
sentence = nlp("Germany and U.S.A are popular countries. I am going to gym tonight")
for ent in sentence.ents:
    print(ent.text, ent.label_)
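With a standard English model this should print something like the following (exact labels can vary by model version):

Germany GPE
U.S.A GPE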
I think this is what you are trying to do:
sentence = nlp("Germany and U.S.A are popular countries. I am going to gym tonight")
for sent in sentence.sents:
    tmp = nlp(str(sent))
    for ent in tmp.ents:
        print(ent.text, ent.label_)
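Processed sentence by sentence, the output should be roughly the same; the first sentence yields both entities and the second yields none:

Germany GPE
U.S.A GPE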
ents only works on a Doc (spacy.tokens.doc.Doc). When you call doc = nlp(text), each item in doc.sents is a spacy.tokens.span.Span, which has no ents attribute. Convert the span back to text and run it through nlp() again.
print([(ent.text, ent.label_) for ent in nlp(sent.text).ents])
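Putting it together for the original goal, entities collected per sentence, here is a minimal end-to-end sketch (assuming the same 'en' model from the question is installed):

import spacy

nlp = spacy.load('en')
doc = nlp("Germany and U.S.A are popular countries. I am going to gym tonight")

# Re-parse each sentence Span as its own Doc so .ents is available
for sent in doc.sents:
    sent_doc = nlp(sent.text)
    entities = [(ent.text, ent.label_) for ent in sent_doc.ents]
    print(sent.text, '->', entities)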