Cosine similarity between two lists using word2vec / wordnet

Problem description

The goal is to get the average similarity between two lists of words, using either a pretrained word2vec model (GoogleNews-vectors-negative300) or wordnet. For example:

dic_email = ['email', 'e-mail', 'address']  # words related to email
dic_identity = ['name', 'first', 'last', 'identity', 'forename']  # words related to identity
lista = ['fullname', 'user']  # 'fullname' is similar to 'name'
listb = ['phone', 'number']
# compare() is the function being asked for; it is not defined here
resa = compare(lista, dic_email, dic_identity)
resb = compare(listb, dic_email, dic_identity)

print('the percent of similarity is : ' + str(resa))
print('the percent of similarity is : ' + str(resb))

The result should be:

the percent of similarity is : [('dic_identity', 0.8), ('dic_email', 0.0)]  # 0.8 is an arbitrary value, chosen to show that lista is similar to dic_identity because 'fullname' is similar to an item in dic_identity
the percent of similarity is : [('dic_identity',0.0),('dic_email',0.0)]

Tags: python, word2vec, similarity, wordnet, cosine-similarity

Solution
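
No answer was recorded for this question, so the following is only a minimal sketch of one way to approach it, not a definitive implementation. It assumes a scoring rule that the question does not specify: for each word in the input list, take the best cosine similarity to any entry of a dictionary, then average those best scores. The names compare, cosine and toy_vectors are hypothetical, and the two-dimensional vectors are made up purely so the example runs without downloading a model. The dictionaries are passed as a name-to-list mapping because a function cannot recover variable names such as dic_email from the lists themselves.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def compare(words, dictionaries, vectors):
    """Return [(dictionary_name, score)], highest score first.

    score = mean over `words` of the best cosine similarity between that word
    and any entry of the dictionary. A word with no vector scores 0.0.
    This scoring rule is an assumption, not something the question defines.
    """
    results = []
    for name, entries in dictionaries.items():
        # Keep only dictionary entries that actually have a vector.
        known_entries = [e for e in entries if e in vectors]
        best_scores = []
        for word in words:
            if word not in vectors or not known_entries:
                best_scores.append(0.0)
                continue
            best_scores.append(
                max(cosine(vectors[word], vectors[e]) for e in known_entries)
            )
        score = sum(best_scores) / len(best_scores) if best_scores else 0.0
        results.append((name, round(score, 3)))
    return sorted(results, key=lambda item: item[1], reverse=True)


# Toy 2-d vectors, invented for illustration only (not real word2vec values).
toy_vectors = {
    'name': (1.0, 0.0),
    'fullname': (0.9, 0.1),
    'user': (0.7, 0.3),
    'email': (0.0, 1.0),
    'address': (0.1, 0.9),
}

dictionaries = {
    'dic_email': ['email', 'e-mail', 'address'],
    'dic_identity': ['name', 'first', 'last', 'identity', 'forename'],
}

resa = compare(['fullname', 'user'], dictionaries, toy_vectors)
resb = compare(['phone', 'number'], dictionaries, toy_vectors)

print('the percent of similarity is : ' + str(resa))
print('the percent of similarity is : ' + str(resb))
```

To use real embeddings, the toy table could be swapped for a gensim model loaded with KeyedVectors.load_word2vec_format(path, binary=True), where path points to the GoogleNews-vectors-negative300 file. A KeyedVectors object supports both the `word in model` check and `model[word]` lookup used above, so it should be usable as the vectors argument. Words missing from the model's vocabulary would score 0.0 under this sketch.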

