I am using this code:
"
c = []
for i, j in article.iterrows():
    c.append(j)
d = []
for i in c:
    e = {}
    e['Urls'] = i[0]
    a = str(i[2])
    doc = ner(a)
    for ent in doc.ents:
        e[ent.label_] = ent.text
    d.append(e)
"
My output looks like this:
[{'Urls': 'https://somewebsite.com',
'Fruit': 'Apple',
'Fruit_colour': 'Red'},
{'Urls': 'https://some_other_website.com/',
'Fruit': 'Papaya',
'Fruit_Colour': 'Yellow'}]
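I have multiple values for Fruit, and I would like the output to look like this:
{'Urls': 'https://somewebsite.com',
'Fruit': 'Apple',
'Fruit': 'Orange',
'Fruit': 'Watermelon',
'Fruit_colour': 'Red',
'Fruit_colour': 'Orange',
'Fruit_colour': 'Green'}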
{'Urls': 'https://some_other_website.com/',
'Fruit': 'Papaya',
'Fruit': 'Peach',
'Fruit': 'Mango',
'Fruit_Colour': 'Yellow',
'Fruit_Colour': 'Yellow',
'Fruit_Colour': 'Green'}
Thank you for your help and time.
It sounds like you want to store multiple values under one key. You can use a defaultdict of lists.
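from collections import defaultdict

out = defaultdict(list)  # a missing label starts out as an empty list
doc = ...  # get it from spaCy
for ent in doc.ents:
    # append instead of assigning, so repeated labels keep every value
    out[ent.label_].append(ent.text)
print(out)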
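A Python dict cannot hold the same key twice, so the closest match to the desired output is one list of values per label. Below is a minimal sketch of how that could fit into the per-row loop from the question. To keep it runnable without spaCy or pandas, `Ent` is a hypothetical stand-in exposing only `.label_` and `.text`, `rows` is made-up sample data, and `group_entities` is a helper name invented here; in the real code the entities would come from `ner(str(row[2])).ents`.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for a spaCy entity: only the two attributes used above.
Ent = namedtuple("Ent", ["label_", "text"])


def group_entities(url, ents):
    """Return one record: the URL plus a list of texts for each entity label."""
    grouped = defaultdict(list)
    for ent in ents:
        grouped[ent.label_].append(ent.text)
    # Convert to a plain dict so it prints like the output in the question.
    return {"Urls": url, **grouped}


# Made-up sample rows as (url, entities) pairs; in the question these would
# come from article.iterrows() and ner(...).ents.
rows = [
    ("https://somewebsite.com",
     [Ent("Fruit", "Apple"), Ent("Fruit", "Orange"), Ent("Fruit_colour", "Red")]),
    ("https://some_other_website.com/",
     [Ent("Fruit", "Papaya"), Ent("Fruit_Colour", "Yellow")]),
]

d = [group_entities(url, ents) for url, ents in rows]
print(d)
```

Each label then maps to a list, even when only one value was found, which keeps the records uniform for later processing.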