如何实现Python中有限字符串的BM25F,VSM或POS标签



我有一个文本文件,其中只有35个字符串我想在文本文件中找到最相关的字符串。我如何实现BM25F,VSM或POS来找到它?p>例如

Panoramio Bahawalpur
... - Bahawalpur - Picture of Bahawalpur, Punjab Province - TripAdvisor
... Minister Syed Yousaf Raza Gillaniu00e2u20acu2122s short visit to 
Bahawalpur
Bahawalpur Station Pictures - Pakistan in Photos
Noor Mahal Station , Bahawalpur Railway Station | Noor Mahal the italian style palac ...
Bahawalpur Railway Pakistan
Nur Mehal, Bahawalpur  

给定输入是 Bahawalpur火车站

如何找到最合适/相关的字符串?

这是非常简单的任务,您可以使用

from difflib import SequenceMatcher

它将返回您的字符串与

匹配多少的百分比
def similar(a, b):
  return SequenceMatcher(None, a, b).ratio()
str = "This is hello-hi image"
print "The score of relevancy is :", similar("Hello",str) * 100 ,""

您可以根据要求更改结果。谢谢

相关内容

  • 没有找到相关文章

最新更新