Ruby: fast, fuzzy search over an array of many hashes



I have an array of hashes like this:

@t = [{"id"=>"819827", "nm"=>"Razvilka", "countryCode"=>"RU"}, 
{"id"=>"524901", "nm"=>"Moscow", "countryCode"=>"RU"}, 
{"id"=>"1271881", "nm"=>"Firozpur Jhirka", "countryCode"=>"IN"}, 
{"id"=>"1283240", "nm"=>"Kathmandu", "countryCode"=>"NP"}] # ... + 100,000 more

I can search for an exact match on a specific hash key with:

@t.find {|x| x["nm"] == "Moscow"}

This returns the matching hash quickly.

However, it does not account for differences in case, misspellings, or approximate matches. How can I handle those?

Try the levenshtein gem: https://rubygems.org/gems/levenshtein

gem install levenshtein

Then, in your code:

require 'levenshtein'

# Levenshtein.distance(a, b) < 5 # adjust the threshold to control fuzziness
def find_levenshtein(hashes, key, str)
  # Keep every hash whose value at `key` is fewer than 5 edits away from `str`
  hashes.select do |h|
    Levenshtein.distance(h[key], str) < 5
  end
end

puts find_levenshtein(@t, 'nm', 'moscw').inspect
#=> [{"id"=>"524901", "nm"=>"Moscow", "countryCode"=>"RU"}]
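The comparison above is case-sensitive and runs the distance calculation against every record. One possible refinement, sketched here, lowercases both strings and skips any name whose length differs from the query by more than the threshold, since the edit distance can never be smaller than the difference in length. The method name `fuzzy_find`, its `max_distance:` parameter, and the idea of passing the distance function as a block are illustrative assumptions, not part of the gem.

```ruby
# Hypothetical helper: case-insensitive fuzzy search with a cheap length pre-filter.
# The caller supplies the distance function as a block, for example:
#   fuzzy_find(@t, 'nm', 'moscw') { |a, b| Levenshtein.distance(a, b) }
def fuzzy_find(records, key, query, max_distance: 2)
  needle = query.downcase
  records.select do |record|
    candidate = record[key].to_s.downcase
    # The edit distance is at least the length difference, so skip hopeless candidates.
    next false if (candidate.length - needle.length).abs > max_distance

    # Delegate the actual distance computation to the caller's block.
    yield(candidate, needle) <= max_distance
  end
end
```

With 100,000 records, the length check avoids most of the expensive distance calls when the threshold is small, at the cost of one string length comparison per record.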

For more information, see https://en.wikipedia.org/wiki/levenshtein_distance
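To make the metric concrete: the Levenshtein distance is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. The following is a minimal pure-Ruby sketch of the standard dynamic-programming approach, for illustration only. The name `edit_distance` is made up here, and a compiled implementation will generally be faster on a large data set.

```ruby
# Illustrative Levenshtein distance using two rolling rows of the DP table.
def edit_distance(a, b)
  return b.length if a.empty?
  return a.length if b.empty?

  # previous[j] holds the distance between the processed prefix of a
  # and the first j characters of b.
  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [
        previous[j] + 1,        # deletion
        current[j - 1] + 1,     # insertion
        previous[j - 1] + cost  # substitution (or match when cost is 0)
      ].min
    end
    previous = current
  end
  previous.last
end

puts edit_distance('moscw', 'Moscow')   # => 2 (case counts as a substitution)
puts edit_distance('moscw', 'moscow')   # => 1
puts edit_distance('kitten', 'sitting') # => 3
```

Note that 'moscw' is two edits from 'Moscow' but only one from 'moscow', which is why lowercasing both sides before comparing allows a tighter threshold.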
