How to suggest cities (results) as I type, using ElasticSearch



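I am new to Elasticsearch and have spent several hours trying to solve this, so thanks in advance if you can help.

A (not too) brief explanation of what I have so far and what I am trying to achieve:

I created a CouchDB database (spain_locales) containing more than 8,000 documents for Spanish cities and provinces. On the other side, I have an HTML form with jQuery autocomplete that shows results as I type. I connect to ElasticSearch from PHP (a Laravel service provider I created) and return the results to jQuery autocomplete. I suppose this could be done by connecting to ElasticSearch directly from the client, but for security reasons I prefer it this way for now.

The problem:

The results I get from ElasticSearch are not quite what I expect, and I do not know how to fix what I have or whether this is the right approach. I do not know whether a bool query is what I need or whether I should use another type of query.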

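  1. I only get results when I type the word exactly as it is stored in the database:

    If I type Álava I get results, but not if I type Alava (the accent on the Á makes the difference).

  2. I do not get results until I have typed the whole word:

    If I type Albacete I get results, but not if I type only the beginning of the word.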
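
For illustration only, both symptoms can be reproduced outside Elasticsearch with a few lines of JavaScript. This is a rough sketch, not Elasticsearch code: it assumes the fields are analyzed roughly as "split into words and lowercase" and that a match query needs a query term to equal an indexed term exactly. The names `analyze`, `fold`, `edgeNgrams` and `matches` are hypothetical; `fold` only approximates the asciifolding filter, and `edgeNgrams` only approximates what an edge n-gram filter would emit.

```javascript
// Rough stand-in for an analyzer: lowercase, split on anything that is not
// a letter, combining mark or digit, then run each term through the filters.
function analyze(text, filters = []) {
  let terms = text.toLowerCase().split(/[^\p{L}\p{M}\p{N}]+/u).filter(Boolean);
  for (const filter of filters) {
    terms = terms.flatMap((term) => filter(term));
  }
  return terms;
}

// Approximation of asciifolding: decompose accented letters (NFD) and drop
// the combining marks, so "á" becomes "a".
const fold = (term) => [term.normalize('NFD').replace(/[\u0300-\u036f]/g, '')];

// All prefixes of a term ("al", "alb", ...), similar in spirit to edge n-grams.
const edgeNgrams = (term) =>
  Array.from({ length: term.length }, (_, i) => term.slice(0, i + 1));

// A query matches when at least one query term equals an indexed term.
function matches(indexedTerms, queryTerms) {
  const index = new Set(indexedTerms);
  return queryTerms.some((term) => index.has(term));
}

// Symptom 1: the accent makes the terms differ unless both sides are folded.
console.log(matches(analyze('Álava'), analyze('Alava')));                 // false
console.log(matches(analyze('Álava', [fold]), analyze('Alava', [fold]))); // true

// Symptom 2: a partial word is a different term unless prefixes are indexed.
console.log(matches(analyze('Albacete'), analyze('Albac')));              // false
console.log(
  matches(analyze('Albacete', [fold, edgeNgrams]), analyze('Albac', [fold]))
);                                                                        // true
```

In this sketch the accent problem goes away once the same folding is applied at index time and at query time, and the partial-word problem goes away once prefixes of each term are indexed.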

I sync CouchDB with ElasticSearch using the CouchDB River Plugin for ElasticSearch (https://github.com/elasticsearch/elasticsearch-river-couchdb), which I set up from the terminal with the following command:

curl -XPUT 'localhost:9200/_river/spain_locales/_meta' -d '{
    "type" : "couchdb",
    "couchdb" : {
        "host" : "localhost",
        "port" : 5984,
        "db" : "spain_locales",
        "filter" : null
    },
    "index" : {
        "index" : "spain_locales",
        "type" : "spain_locales",
        "bulk_size" : "100",
        "bulk_timeout" : "10ms"
    }
}'

I also tried:

curl -XPUT 'localhost:9200/_river/spain_locales/_meta' -d '{
    "type" : "couchdb",
    "couchdb" : {
        "host" : "localhost",
        "port" : 5984,
        "db" : "spain_locales",
        "filter" : null
    },
    "index" : {
        "number_of_shards" : 2,
        "refresh_interval" : "1s",
        "analysis": {
          "analyzer": {
            "folding": {
              "tokenizer": "standard",
              "filter":  [ "lowercase", "asciifolding" ]
            }
          }
        },
        "index" : "spain_locales",
        "type" : "spain_locales",
        "bulk_size" : "100",
        "bulk_timeout" : "10ms"
    }
}'

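Neither of the above returned an error, and both created the _river sync successfully, but the accent and whole-word problems remain.

I also tried to apply the filters I need from the terminal with the following command: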

curl -XPUT 'localhost:9200/spain_locales/' -d '
{
  "settings": {
    "analysis": {
      "analyzer": {
        "folding": {
          "tokenizer": "standard",
          "filter":  [ "lowercase", "asciifolding" ]
        }
      }
    }
  },
  "uuid":"KwKrBc3uQoG5Ld1nOdc5rQ"
}'

But I get the following error:

{"error":"IndexAlreadyExistsException[[spain_locales] already exists]","status":400}

CouchDB document examples:

{
   "_id": "1",
   "_rev": "1-087ddbe8593f68f1d7d37a9c3f6de787",
   "Provincia": "Álava",
   "Poblacion": "Alegría-Dulantzi",
   "helper": ""
}
{
   "_id": "10",
   "_rev": "1-ce38dcdabeb3b34d34d2296c6e2fdf24",
   "Provincia": "Álava",
   "Poblacion": "Ayala/Aiara",
   "helper": ""
}
{
   "_id": "100",
   "_rev": "1-72e66601e378ee48519aa93601dc0717",
   "Provincia": "Albacete",
   "Poblacion": "Herrera (La)",
   "helper": "La Herrera"
}

PHP service provider/controller:

public function searchzones(){
    // Term sent by jQuery autocomplete; falls back to the literal string 'null' when absent
    $q = (Input::has('term')) ? Input::get('term') : 'null';
    $params['index'] = 'spain_locales';
    $params['type']  = 'spain_locales';
    // Match the term against either the city (Poblacion) or the province (Provincia)
    $params['body']['query']['bool']['should'] = array(
        array('match' => array('Poblacion' =>  $q)),
        array('match' => array('Provincia' =>  $q))
    );
    $query = $this->elasticsearch->search($params);
    // Start from an empty list so that "no hits" returns [] instead of [[]]
    $databag = array();
    if ($query['hits']['total'] >= 1){
        foreach ($query['hits']['hits'] as $zone) {
            // One autocomplete entry per hit: "City, Province"
            $databag[] = array(
                "value" => $zone['_source']['Poblacion'].', '.$zone['_source']['Provincia'],
                "state" => $zone['_source']['Provincia'],
                "city"  => $zone['_source']['Poblacion'],
            );
        }
    }
    return $databag;
} // End Search Zones

jQuery (JavaScript):

// Suggest locations as the user types in the #zones input
$(document).ready(function() {
    $('#zones').autocomplete({
            
            source : applink + 'ajax/searchzones',
            select : function(event, ui){
                console.log(ui);
            }
                
    }); // End autocomplete
}); // End Document ready

HTML form part (Twitter Bootstrap):

<div class="form-group">
<div class="input-group input-append dropdown">
<input type="text" class="form-control typeahead" placeholder="City name" name="zones" id="zones">
<div class="input-group-btn" >
<button type="button" class="btn btn-default dropdown-toggle" data-toggle="dropdown"><span class="caret"></span></button>
<ul class="dropdown-menu dropdown-menu-right" id="dropZonesAjax">                           
</ul>
</div>
</div>
<div id="zonesAjax"></div>   
</div>

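I found the following resource: http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/asciifolding-token-filter.html but I do not know how to implement it.

Thank you very much for your time and for trying to help! Sorry for my English!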

Try creating the mapping before you index anything. That way you can define the analyzer you mentioned (folding) and assign it to your fields:

{
  "settings": {
    "analysis": {
      "analyzer": {
        "folding": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        }
      }
    }
  },
  "mappings": {
    "spain_locales": {
      "properties": {
        "Provincia": {
          "type": "string",
          "analyzer": "folding"
        },
        "Poblacion": {
          "type": "string",
          "analyzer": "folding"
        },
        "helper": {
          "type": "string"
        }
      }
    }
  }
}
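
The IndexAlreadyExistsException in the question suggests the index already existed (presumably created with default settings when the river started), so the analyzer never got applied. A minimal sketch of the order of operations follows, written in JavaScript and assuming a runtime with `fetch` (for example a recent Node.js). The helper names `buildIndexBody` and `recreateIndex` are hypothetical, the mapping type is named spain_locales on the assumption that it should match the type the river writes to, and the sketch assumes the river is not running while the index is recreated.

```javascript
// Index body from the answer: the "folding" analyzer plus a mapping that uses it.
// Assumption: the mapping type should match the type the river indexes into.
function buildIndexBody(typeName = 'spain_locales') {
  const folded = { type: 'string', analyzer: 'folding' };
  return {
    settings: {
      analysis: {
        analyzer: {
          folding: { tokenizer: 'standard', filter: ['lowercase', 'asciifolding'] },
        },
      },
    },
    mappings: {
      [typeName]: {
        properties: {
          Provincia: folded,
          Poblacion: folded,
          helper: { type: 'string' },
        },
      },
    },
  };
}

// Delete the existing index, then create it again with the body above.
// `fetchImpl` is injectable so the request sequence can be checked without a server.
async function recreateIndex(baseUrl, indexName, fetchImpl = fetch) {
  const url = `${baseUrl}/${indexName}`;
  // fetch does not throw on a 404, so a missing index is not a problem here.
  await fetchImpl(url, { method: 'DELETE' });
  const response = await fetchImpl(url, {
    method: 'PUT',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildIndexBody()),
  });
  return response.status;
}
```

Usage might look like `await recreateIndex('http://localhost:9200', 'spain_locales')`, run before registering the river with the `_river/spain_locales/_meta` command from the question, so that the documents are indexed with the folding analyzer. Note that this only covers the accent problem; matching on partial words is a separate concern that this analyzer does not address.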
