WordNet alternatives for finding semantic relationships between words in Python



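I have a project that needs the semantic relationship between two words. I want to get word-to-word relationships such as hypernyms, hyponyms, synonyms, holonyms, and so on. I tried WordNet through NLTK, but for most pairs it returns no relationship. Here is the sample code: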

from nltk.corpus import wordnet as wn
from wordhoard import synonyms

Word1 = 'red'
Word2 = 'color'

LSTWord1 = []
for syn in wn.synsets(Word1):
    for lemma in syn.part_meronyms():
        LSTWord1.append(lemma)

for s in LSTWord1:
    if Word2 in s.name():
        print(Word1 + ' is meronyms  of ' + Word2)
        break

LSTWord2 = []
for syn in wn.synsets(Word2):
    for lemma in syn.part_meronyms():
        LSTWord2.append(lemma)

for s in LSTWord2:
    if Word1 in s.name():
        print(Word2 + ' is meronyms  of ' + Word1)
        break

Here is a sample of the word pairs:

scheduled ,geometry
games,river
campaign,sea
adventure,place
session,road
long,town
campaign,road
session,railway
difficulty of session,place of interest
campaign,town
leader,historic place
have,town
player,town
skills,church
campaign,cultural interest
character name,town
player,monument
player,province
games,beach
expertise level,gas station
character,municipality
world,electrict line
social interaction,municipality
world,electric line
percentage,municipality
character,hospital
inhabitants,mine
active character,municipality
campaign,altitude
died,municipality
many time,mountain
adventurer,altitude
campaign,peak
gain,place of interest
new capabilities,cultural interest
player,cultural interest
achievement,national park
campaign,good
first action,railway station
player,province

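Is WordNet limited, or is there simply no relationship between these words? My question: is there an alternative to WordNet for handling semantic relationships between words, or a better way to obtain them? Thanks.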

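As I said previously, I am the author of wordhoard, the Python package you used in your question. Based on your question, I decided to add some additional modules to the package. These modules focus on:

  • Homophones
  • Hypernyms
  • Hyponyms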

I could not find an easy way to add meronyms, but I am still looking into the best way to do that.

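The homophones module queries a hand-built list of more than 60,000 of the most frequently used English words for known homophones. I plan to expand this list in the future.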

from wordhoard import Homophones
words = ['scheduled' ,'geometry', 'games', 'river', 'campaign', 'sea', 'adventure','place','session', 'road', 'long', 'town', 'campaign', 'road', 'session', 'railway']
for word in words:
    homophone = Homophones(word)
    results = homophone.find_homophones()
    print(results)
# output 
no homophones for scheduled
no homophones for geometry
no homophones for games
no homophones for river
no homophones for campaign
['sea is a homophone of see', 'sea is a homophone of cee']
no homophones for adventure
['place is a homophone of plaice']
['session is a homophone of cession']
['road is a homophone of rowed', 'road is a homophone of rode']
truncated...

The hypernyms module queries various online repositories.

from wordhoard import Hypernyms
words = ['scheduled' ,'geometry', 'games', 'river', 'campaign','sea', 'adventure',
'place','session','road', 'long','town', 'campaign','road', 'session', 'railway']
for word in words:
    hypernym = Hypernyms(word)
    results = hypernym.find_hypernyms()
    print(results)
# output 
['no hypernyms found']
['arrangement', 'branch of knowledge', 'branch of math', 'branch of mathematics', 'branch of maths', 'configuration', 'figure', 'form', 'math', 'mathematics', 'maths', 'pure mathematics', 'science', 'shape', 'study', 'study of numbers', 'study of quantities', 'study of shapes', 'system', 'type of geometry']
['lake', 'recreation']
['branch', 'dance', 'fresh water', 'geological feature', 'landform', 'natural ecosystem', 'natural environment', 'nature', 'physical feature', 'recreation', 'spring', 'stream', 'transportation', 'watercourse']
['action', 'actively seek election', 'activity', 'advertise', 'advertisement', 'battle', 'canvass', 'crusade', 'discuss', 'expedition', 'military operation', 'operation', 'political conflict', 'politics', 'promote', 'push', 'race', 'run', 'seek votes', 'wage war']
truncated...

The hyponyms module queries the repositories.

from wordhoard import Hyponyms
words = ['scheduled' ,'geometry', 'games', 'river', 'campaign','sea', 'adventure',
'place','session','road', 'long','town', 'campaign','road', 'session', 'railway']
for word in words:
    hyponym = Hyponyms(word)
    results = hyponym.find_hyponyms()
    print(results)
# output 
['no hyponyms found']
['absolute geometry', 'affine geometry', 'algebraic geometry', 'analytic geometry', 'combinatorial geometry', 'descriptive geometry', 'differential geometry', 'elliptic geometry', 'euclidean geometry', 'finite geometry', 'geometry of numbers', 'hyperbolic geometry', 'non-euclidean geometry', 'perspective', 'projective geometry', 'pythagorean theorem', 'riemannian geometry', 'spherical geometry', 'taxicab geometry', 'tropical geometry']
['jack in the box', 'postseason']
['affluent', 'arkansas river', 'arno river', 'avon', 'big sioux river', 'bighorn river', 'brazos river', 'caloosahatchee river', 'cam river', 'canadian river', 'cape fear river', 'changjiang', 'chari river', 'charles river', 'chattahoochee river', 'cimarron river', 'colorado river', 'orange', 'red', 'tunguska']
['ad blitz', 'ad campaign', 'advertising campaign', 'agitating', 'anti-war movement', 'campaigning', 'candidacy', 'candidature', 'charm campaign', 'come with', 'electioneering', 'feminism', 'feminist movement', 'fund-raising campaign', 'fund-raising drive', 'fund-raising effort', 'military campaign', 'military expedition', 'political campaign', 'senate campaign']
truncated...
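
The modules above return one list per word, while the question asks about pairs. A minimal sketch of how such lists might be turned into a pairwise check is shown here. The function `find_relation` and the dictionary `hypernyms_of` are hypothetical names, not part of wordhoard, and the dictionary holds only a few entries copied from the sample hypernym output rather than live lookup results.

```python
def find_relation(word1, word2, hypernyms_of):
    """Describe a hypernym link between two words, or return None.

    hypernyms_of maps a word to the list of hypernyms found for it
    (hypothetical structure, filled by hand for this sketch).
    """
    if word2 in hypernyms_of.get(word1, []):
        return f"{word2} is a hypernym of {word1}"
    if word1 in hypernyms_of.get(word2, []):
        return f"{word1} is a hypernym of {word2}"
    return None


# Stand-in for lookup results: a subset of the sample output shown earlier.
hypernyms_of = {
    "geometry": ["mathematics", "science", "study"],
    "river": ["landform", "stream", "watercourse"],
}

# The check is tried in both directions for every pair.
for first, second in [("river", "stream"), ("science", "geometry"), ("river", "sea")]:
    relation = find_relation(first, second, hypernyms_of)
    print(relation if relation else f"no hypernym link found for {first}, {second}")
```

The same shape would work for hyponym lists, by swapping the dictionary and the wording of the message.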

Please let me know if you run into any issues when using these new modules.

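It looks like you are after arbitrary semantic relationships between a given pair of words, across a large vocabulary. A simple cosine similarity over word embeddings might help here. You could start with GloVe.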
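
As a rough illustration of the cosine-similarity idea, here is a minimal pure-Python sketch. It assumes embeddings stored as text lines of the form `word v1 v2 ...`, which is the layout I believe the GloVe text files use. The three-dimensional vectors are made up for the example and are not real GloVe values; `parse_vector_lines` and `cosine_similarity` are hypothetical helper names.

```python
import math


def parse_vector_lines(lines):
    """Parse 'word v1 v2 ...' lines into a dict of word -> list of floats."""
    vectors = {}
    for line in lines:
        parts = line.strip().split(" ")
        if len(parts) < 2:
            continue  # skip empty or malformed lines
        vectors[parts[0]] = [float(value) for value in parts[1:]]
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # similarity is undefined for a zero vector; report 0
    return dot / (norm_u * norm_v)


# Toy vectors, invented for illustration only (not real embeddings).
toy_lines = [
    "river 0.9 0.1 0.0",
    "sea 0.8 0.2 0.1",
    "campaign 0.0 0.9 0.4",
]
vectors = parse_vector_lines(toy_lines)

for first, second in [("river", "sea"), ("campaign", "sea")]:
    score = cosine_similarity(vectors[first], vectors[second])
    print(f"{first} / {second}: {score:.3f}")
```

With real embeddings, the lines would come from a downloaded vector file instead of `toy_lines`. Note that this gives a similarity score, not a named relation such as hypernym or meronym, and multi-word entries like "place of interest" would need extra handling, for example averaging the vectors of the individual words.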
