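I have a text file that contains only 35 strings, and I want to find the string in it that is most relevant to a given input. How can I implement BM25F, VSM, or POS to find it?
For example: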
Panoramio Bahawalpur
... - Bahawalpur - Picture of Bahawalpur, Punjab Province - TripAdvisor
... Minister Syed Yousaf Raza Gillani's short visit to
Bahawalpur
Bahawalpur Station Pictures - Pakistan in Photos
Noor Mahal Station , Bahawalpur Railway Station | Noor Mahal the italian style palac ...
Bahawalpur Railway Pakistan
Nur Mehal, Bahawalpur
The given input is: Bahawalpur Railway Station
How do I find the most suitable/relevant string?
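This is a fairly simple task. You can use SequenceMatcher from the standard difflib module:

    from difflib import SequenceMatcher

Its ratio() method returns a similarity score between 0.0 and 1.0 for two strings, which you can multiply by 100 to get a percentage:

    def similar(a, b):
        # ratio() is 1.0 for identical strings and 0.0 when nothing matches.
        # The comparison is character-based and case-sensitive.
        return SequenceMatcher(None, a, b).ratio()

    text = "This is hello-hi image"  # renamed from str, which shadows the built-in
    print("The score of relevancy is:", similar("Hello", text) * 100)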
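To apply this to the list in the question, one option is to score every candidate against the input and keep the highest-scoring one. The following is only a sketch: `best_match` is a hypothetical helper name, the candidate list is a subset of the strings from the question, and lowercasing both sides is an added assumption so that case differences do not lower the score.

```python
from difflib import SequenceMatcher

def similar(a, b):
    # Assumption: compare case-insensitively by lowercasing both strings.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def best_match(query, candidates):
    # Hypothetical helper: return the candidate most similar to the query.
    return max(candidates, key=lambda c: similar(query, c))

candidates = [
    "Panoramio Bahawalpur",
    "Bahawalpur Station Pictures - Pakistan in Photos",
    "Bahawalpur Railway Pakistan",
    "Nur Mehal, Bahawalpur",
]
print(best_match("Bahawalpur Railway Station", candidates))
```

Because SequenceMatcher compares character sequences rather than words, long candidates with extra text tend to score lower even when they contain every query word.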
You can adapt the result to your requirements. Thanks.
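Since the question also mentions VSM, here is a rough word-level sketch of a vector space model using TF-IDF weights and cosine similarity, written with the standard library only. The names `tokenize` and `rank`, the tokenization rule, and the smoothed IDF formula are all assumptions made for illustration. This is not BM25F, which additionally needs per-field weights and length normalization.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Assumption: lowercase and treat runs of letters/digits as tokens.
    return re.findall(r"[a-z0-9]+", text.lower())

def rank(query, documents):
    # Return (score, document) pairs sorted by cosine similarity, best first.
    doc_tokens = [tokenize(d) for d in documents]
    n = len(documents)
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in doc_tokens for term in set(tokens))

    def vectorize(tokens):
        # Term count times a smoothed IDF, so every weight stays positive.
        return {t: c * (math.log((1 + n) / (1 + df[t])) + 1)
                for t, c in Counter(tokens).items()}

    def cosine(u, v):
        dot = sum(w * v.get(t, 0.0) for t, w in u.items())
        norm = (math.sqrt(sum(w * w for w in u.values()))
                * math.sqrt(sum(w * w for w in v.values())))
        return dot / norm if norm else 0.0

    q = vectorize(tokenize(query))
    scored = [(cosine(q, vectorize(tokens)), doc)
              for tokens, doc in zip(doc_tokens, documents)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

documents = [
    "Panoramio Bahawalpur",
    "Bahawalpur Station Pictures - Pakistan in Photos",
    "Bahawalpur Railway Pakistan",
    "Nur Mehal, Bahawalpur",
]
for score, doc in rank("Bahawalpur Railway Station", documents):
    print(round(score, 3), doc)
```

With only 35 short strings, the IDF statistics are thin, so it is worth checking the ranking by eye before relying on it.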