Searching names (text) with spaces in Elasticsearch



Searching for names (text) that contain spaces is causing me problems. I have a mapping similar to:

{"user":{"properties":{"name":{"type":"string"}}}}

Ideally, it should return and rank the results as follows:

1) Names that exactly match the search term come first (highest score)
2) Names that start with the search term (high score)
3) Names that contain the exact search term as a substring (medium score)
4) Names that contain any of the search term's tokens (lowest score)

For example, given the following names in Elasticsearch:

Maaz Tariq
Ahmed Maaz Tariq
Maaz Sheeba
Maaz Bin Tariq
Sana Tariq
Maaz Tariq Ahmed

searching for "Maaz Tariq" should return the results in this order:

Maaz Tariq (highest score)
Maaz Tariq Ahmed (high score)
Ahmed Maaz Tariq (medium score)
Maaz Bin Tariq (lowest score)
Maaz Sheeba (lowest score)
Sana Tariq (lowest score)
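
To make the desired ordering concrete, here is a minimal plain-Python sketch of the four tiers, independent of Elasticsearch. The function name `match_tier` and the numeric tier values are illustrative assumptions, and the comparison is case-insensitive by choice; it only shows the intended ranking, not how Elasticsearch computes scores.

```python
def match_tier(name, term):
    """Return 0 (best) to 3 (weakest) for a match, or 4 for no match."""
    n, t = name.lower(), term.lower()
    if n == t:
        return 0  # exact match
    if n.startswith(t):
        return 1  # name starts with the search term
    if t in n:
        return 2  # name contains the whole search term as a substring
    if set(n.split()) & set(t.split()):
        return 3  # name shares at least one token with the search term
    return 4      # nothing in common


names = [
    "Maaz Tariq",
    "Ahmed Maaz Tariq",
    "Maaz Sheeba",
    "Maaz Bin Tariq",
    "Sana Tariq",
    "Maaz Tariq Ahmed",
]

# sorted() is stable, so names in the same tier keep their input order.
ranked = sorted(names, key=lambda name: match_tier(name, "Maaz Tariq"))
for name in ranked:
    print(match_tier(name, "Maaz Tariq"), name)
```

The three names in the lowest tier tie with each other, so their relative order is not meaningful.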

Can anyone tell me which analyzer to use, and how to use it? How can I rank the search results for names this way?

You can solve this with the multi-field type, a bool query, and custom boost factor queries.

Mapping:

{
    "mappings" : {
        "user" : {        
            "properties" : {
                "name": {
                    "type": "multi_field",
                    "fields": {
                        "name": { "type" : "string", "index": "analyzed" },
                        "exact": { "type" : "string", "index": "not_analyzed" }
                    }
                }
            }
        }
    }
}
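
Why two sub-fields? An analyzed string field is split into lowercase tokens at index time, while a not_analyzed field is indexed as one unmodified term. The sketch below imitates that difference in plain Python; it is a rough stand-in for the default analyzer (lowercase, split on non-word characters), not the real tokenizer, and both function names are made up for illustration.

```python
import re


def analyzed_terms(value):
    # Rough approximation of an analyzed field: lowercase the value,
    # then split on runs of non-word characters.
    return [tok for tok in re.split(r"\W+", value.lower()) if tok]


def not_analyzed_terms(value):
    # A not_analyzed field keeps the whole value as a single term.
    return [value]


print(analyzed_terms("Maaz Tariq"))      # ['maaz', 'tariq']
print(not_analyzed_terms("Maaz Tariq"))  # ['Maaz Tariq']
```

Because `name.exact` holds the whole value as one term, term and prefix queries against it can see the space, but they are also case-sensitive: "maaz tariq" would not match "Maaz Tariq" there.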
Query:

{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "name": "Maaz Tariq"
                    }
                }
            ],
            "should": [
                {
                    "custom_boost_factor": {
                        "query": {
                            "term": {
                                "name.exact": "Maaz Tariq"
                            }
                        },
                        "boost_factor": 15
                    }
                },
                {
                    "custom_boost_factor": {
                        "query": {
                            "prefix": {
                                "name.exact": "Maaz Tariq"
                            }
                        },
                        "boost_factor": 10
                    }
                },
                {
                    "custom_boost_factor": {
                        "query": {
                            "match_phrase": {
                                "name": {
                                    "query": "Maaz Tariq",
                                    "slop": 0
                                }
                            }
                        },
                        "boost_factor": 5
                    }
                }
            ]
        }
    }
}
Edit:

As javanna pointed out, custom_boost_factor is not needed.

Query without custom_boost_factor:

{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "name": "Maaz Tariq"
                    }
                }
            ],
            "should": [
                {
                    "term": {
                        "name.exact": {
                            "value": "Maaz Tariq",
                            "boost": 15
                        }
                    }
                },
                {
                    "prefix": {
                        "name.exact": {
                            "value": "Maaz Tariq",
                            "boost": 10
                        }
                    }
                },
                {
                    "match_phrase": {
                        "name": {
                            "query": "Maaz Tariq",
                            "slop": 0,
                            "boost": 5
                        }
                    }
                }
            ]
        }
    }
}
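
If the search term comes from user input, the same request body can be built programmatically. This is a small sketch in Python that only constructs the JSON shown above; `build_name_query` and its keyword arguments are hypothetical names, and sending the body to a cluster is left out.

```python
import json


def build_name_query(term, exact_boost=15, prefix_boost=10, phrase_boost=5):
    """Build the bool query above for an arbitrary search term."""
    return {
        "query": {
            "bool": {
                # Every hit must share at least one token with the term.
                "must": [{"match": {"name": term}}],
                # Optional clauses that push better matches up.
                "should": [
                    {"term": {"name.exact": {"value": term, "boost": exact_boost}}},
                    {"prefix": {"name.exact": {"value": term, "boost": prefix_boost}}},
                    {"match_phrase": {"name": {"query": term, "slop": 0, "boost": phrase_boost}}},
                ],
            }
        }
    }


body = build_name_query("Maaz Tariq")
print(json.dumps(body, indent=4))
```

The tiers come from how many should clauses a document satisfies: an exact match satisfies all three, a name that starts with the term satisfies the prefix and phrase clauses, a name that merely contains the phrase satisfies only match_phrase, and a token-only match satisfies none of them.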

With the Java API, use the following when querying for an exact string that contains spaces:

CLIENT.prepareSearch(index)
    .setQuery(QueryBuilders.queryStringQuery(wordString)
        .field(fieldName));

With many other query types, you get no results.

From Elasticsearch 1.0 onward, this mapping:

"title": {
    "type": "multi_field",
    "fields": {
        "title": { "type": "string" },
        "raw":   { "type": "string", "index": "not_analyzed" }
    }
}

becomes:

"title": {
    "type": "string",
    "fields": {
        "raw":   { "type": "string", "index": "not_analyzed" }
    }
}
https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-fields.html
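
If you have several old mappings to migrate, the rewrite above is mechanical: the sub-field that shares the field's own name becomes the main field, and the remaining sub-fields move under its "fields" key. A minimal sketch in Python, assuming the legacy mapping always has a sub-field named after the field itself; `convert_multi_field` is a hypothetical helper, not part of any Elasticsearch client.

```python
def convert_multi_field(field_name, legacy):
    """Rewrite one legacy multi_field mapping into the newer "fields" form."""
    if legacy.get("type") != "multi_field":
        return dict(legacy)  # nothing to convert

    sub_fields = dict(legacy.get("fields", {}))
    if field_name not in sub_fields:
        # Assumption of this sketch: a default sub-field always exists.
        raise ValueError("no default sub-field named %r" % field_name)

    # The default sub-field becomes the main field definition.
    main = dict(sub_fields.pop(field_name))
    if sub_fields:
        main["fields"] = sub_fields
    return main


legacy_title = {
    "type": "multi_field",
    "fields": {
        "title": {"type": "string"},
        "raw": {"type": "string", "index": "not_analyzed"},
    },
}

print(convert_multi_field("title", legacy_title))
```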
