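I want to create a graph for a sentence by assigning words to nodes according to the tag assigned to each word. If a word is a proper noun, it goes into the subject list; if it is a noun, it goes into the object list; and if it is a verb, it goes into the verb list. I am using Python 2.7 in Jupyter Notebook.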
sentence_list=['Arun Mehta drinks milk']
tag_list={'Arun':'NP','Mehta':'NP','drinks':'VF','milk':'NN'}
tag_list_keys = tag_list.keys()
subject_list=[]
object_list=[]
verb_list=[]
def classify(item):
    if item in tag_list_keys:
        if tag_list[item] == 'NP': subject_list.append(item)
        if tag_list[item] == 'NN': object_list.append(item)
        if tag_list[item] == 'VF': verb_list.append(item)
def extract(item):
    item_split = item.split(' ')
    map(classify, item_split)
map(extract, sentence_list)
print('SUBJECT:',subject_list)
print('OBJECT',object_list)
print('VERB',verb_list)
%matplotlib notebook
import networkx as nx
import matplotlib.pyplot as plt
G = nx.Graph()
for i in range(3):
    G.add_node(object_list[i])
    G.add_node(verb_list[i])
    G.add_node(subject_list[i])
    G.add_edge(verb_list[i],object_list[i])
    G.add_edge(subject_list[i],verb_list[i])
nx.draw(G, with_labels= True)
plt.show()
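The expected output should have three nodes: "Arun Mehta" in one node, "drinks" in the second, and "milk" in the third. Can anyone suggest what needs to be done to get two or more words into a single node?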
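In your extract method, you split on every space. That is why each node in your graph contains only a single word. You may need to check whether two adjacent words are both subjects and, if so, join them back together.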
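The merging step described above can be sketched with itertools.groupby, which groups only consecutive items that share a key. This is a minimal illustration, assuming every word in the sentence appears in the tag dictionary; the helper name merge_adjacent is hypothetical and not part of the original code.

```python
from itertools import groupby

# Tag dictionary taken from the question.
tag_list = {'Arun': 'NP', 'Mehta': 'NP', 'drinks': 'VF', 'milk': 'NN'}

def merge_adjacent(sentence, tags):
    """Join consecutive words that share a tag (hypothetical helper).

    Returns a list of (tag, phrase) pairs in sentence order.
    Assumes every word is a key of `tags`.
    """
    words = sentence.split()
    merged = []
    # groupby only groups *consecutive* items with an equal key,
    # so words keep their original order inside each phrase.
    for tag, group in groupby(words, key=lambda w: tags[w]):
        merged.append((tag, ' '.join(group)))
    return merged

phrases = merge_adjacent('Arun Mehta drinks milk', tag_list)
print(phrases)
# [('NP', 'Arun Mehta'), ('VF', 'drinks'), ('NN', 'milk')]
```

Each pair can then be appended to the subject, object, or verb list according to its tag.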
To answer your basic question: networkx supports node labels that contain spaces.
import networkx as nx
G = nx.Graph()
G.add_node('Arun Mehta')
print(G.nodes)
Output: ['Arun Mehta']
Here is your code, changed to join two adjacent subjects and modified slightly to work with Python 3:
sentence_list=['Arun Mehta drinks milk']
tag_list={'Arun':'NP','Mehta':'NP','drinks':'VF','milk':'NN'}
tag_list_keys = tag_list.keys()
subject_list=[]
object_list=[]
verb_list=[]
list_by_tag = {'NP':subject_list,'NN':object_list, 'VF':verb_list}
def classify(items):
    last_tag = tag_list[items[0]]
    complete_item = items[0]
    for item in items[1:]:
        current_tag = tag_list[item]
        if current_tag == last_tag:
            # same tag as the previous word: extend the phrase in sentence order
            complete_item = complete_item + " " + item
        else:
            # append last item
            list_by_tag[last_tag].append(complete_item)
            # save current item and tag
            complete_item = item
            last_tag = current_tag
    # care about last element of the list
    list_by_tag[last_tag].append(complete_item)
def extract(item):
    item_split = item.split(' ')
    classify(item_split)
list(map(extract, sentence_list))
print('SUBJECT:',subject_list)
print('OBJECT',object_list)
print('VERB',verb_list)
%matplotlib notebook
import networkx as nx
import matplotlib.pyplot as plt
G = nx.Graph()
for i in range(1):
    G.add_node(object_list[i])
    G.add_node(verb_list[i])
    G.add_node(subject_list[i])
    G.add_edge(verb_list[i],object_list[i])
    G.add_edge(subject_list[i],verb_list[i])
nx.draw(G, with_labels= True)
plt.show()
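The hard-coded range(1) only fits a single sentence. One possible way to generalize, assuming each sentence contributes exactly one subject, one verb, and one object at the same position in the three lists, is to pair the lists with zip and build the edge list first. The function name build_edges is hypothetical and only serves this sketch.

```python
def build_edges(subjects, verbs, objects):
    """Return (subject, verb) and (verb, object) edges (hypothetical helper).

    Assumes the three lists are aligned: index i in each list
    belongs to the same sentence. zip stops at the shortest list,
    so a missing entry cannot raise an IndexError.
    """
    edges = []
    for subj, verb, obj in zip(subjects, verbs, objects):
        edges.append((subj, verb))
        edges.append((verb, obj))
    return edges

edges = build_edges(['Arun Mehta'], ['drinks'], ['milk'])
print(edges)
# [('Arun Mehta', 'drinks'), ('drinks', 'milk')]
```

With networkx, G.add_edges_from(edges) adds these edges and creates the three nodes implicitly, so the separate add_node calls are not needed.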