spaCy: remove only organization and person names



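I have written the function below, which removes all named entities from a text. How can I modify it so that it removes only organization and person names? I don't want the 6 stripped from $6 in the example below. Thanks.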

import spacy
sp = spacy.load('en_core_web_sm')
def NER_removal(text):
    document = sp(text)

    text_no_namedentities = []

    ents = [e.text for e in document.ents]
    for item in document:
        if item.text in ents:
            pass
        else:
            text_no_namedentities.append(item.text)
    return (" ".join(text_no_namedentities))

NER_removal("John loves to play at Sofi stadium at 6.00 PM and he earns $6")
'loves to play at stadium at 6.00 PM and he earns $'

I think item.ent_type_ would be useful here.

import spacy
sp = spacy.load('en_core_web_sm')
def NER_removal(text):
    document = sp(text)
    text_no_namedentities = []
    # define ent types not to remove
    ent_types_to_stay = ["MONEY"]
    ents = [e.text for e in document.ents]
    for item in document:
        # add condition to leave defined ent types
        if all((item.text in ents, item.ent_type_ not in ent_types_to_stay)):
            pass
        else:
            text_no_namedentities.append(item.text)
    return (" ".join(text_no_namedentities))
print(NER_removal("John loves to play at Sofi stadium at 6.00 PM and he earns $6"))
# loves to play at Sofi stadium at 6.00 PM and he earns $ 6
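
The code above keeps MONEY entities but still drops tokens of every other entity type. Since the question asks to remove only people and organizations, another option is to list the labels to drop and check each token's ent_type_ directly. The following is a minimal sketch, assuming the PERSON and ORG labels used by en_core_web_sm; the names strip_entity_tokens, ner_removal and demo_tokens are hypothetical and only for illustration. Because every token inside a multi-word entity carries that entity's label, this also handles names spanning several tokens, and joining on whitespace_ keeps "$6" from turning into "$ 6".

```python
from types import SimpleNamespace

# Labels to drop. PERSON and ORG are assumed to match the label scheme
# of en_core_web_sm; adjust them for other models.
LABELS_TO_REMOVE = frozenset({"PERSON", "ORG"})


def strip_entity_tokens(tokens, labels=LABELS_TO_REMOVE):
    """Join the tokens whose ent_type_ is not in `labels`.

    `tokens` is any iterable of objects exposing .text, .ent_type_ and
    .whitespace_, which is what iterating over a spaCy Doc yields.
    """
    kept = [t.text + t.whitespace_ for t in tokens if t.ent_type_ not in labels]
    return "".join(kept).strip()


def ner_removal(text, nlp):
    """Hypothetical wrapper: `nlp` is a loaded pipeline, e.g. spacy.load(...)."""
    return strip_entity_tokens(nlp(text))


# Hand-built stand-ins for spaCy tokens, so the filter can be checked
# without downloading a model. The labels here are set by hand, not predicted.
demo_tokens = [
    SimpleNamespace(text="John", ent_type_="PERSON", whitespace_=" "),
    SimpleNamespace(text="earns", ent_type_="", whitespace_=" "),
    SimpleNamespace(text="$", ent_type_="", whitespace_=""),
    SimpleNamespace(text="6", ent_type_="MONEY", whitespace_=" "),
    SimpleNamespace(text="at", ent_type_="", whitespace_=" "),
    SimpleNamespace(text="Acme", ent_type_="ORG", whitespace_=""),
]
print(strip_entity_tokens(demo_tokens))  # earns $6 at
```

With a real pipeline the call would look like ner_removal("John loves to play at Sofi stadium at 6.00 PM and he earns $6", sp), where sp is the model loaded earlier. What gets removed then depends on which spans the model actually tags as PERSON or ORG.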
