ElasticSearch - Unable to search values containing an underscore with a fuzzy match query

Suppose I have three documents in my Elasticsearch index. For example:

1: {
  "name": "test_2602"
}
2: {
  "name": "test-2602"
}
3: {
  "name": "test 2602"
}

Now, when I search with a fuzzy match query like the one below:

{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "must": [
              {
                "match": {
                  "name": {
                    "query": "test-2602",
                    "fuzziness": "2",
                    "prefix_length": 0,
                    "max_expansions": 50,
                    "fuzzy_transpositions": true,
                    "lenient": false,
                    "zero_terms_query": "NONE",
                    "boost": 1
                  }
                }
              }
            ],
            "disable_coord": false,
            "adjust_pure_negative": true,
            "boost": 1
          }
        }
      ],
      "disable_coord": false,
      "adjust_pure_negative": true,
      "boost": 1
    }
  }
}

In response, I get only two documents (the result is the same whether I search for "test 2602" or "test-2602"):

{
  "name": "test-2602"
},
{
  "name": "test 2602"
}

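I do not get the document named "test_2602"; the value containing the underscore is not matched. I want the result to include this third document as well. But if I search for the name test_2602, I get only:

{
  "name": "test_2602"
}

Whether I search for "test 2602", "test-2602", or "test_2602", I need to get all three documents.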

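You get only two documents because, by default, Elasticsearch uses the standard analyzer, which tokenizes "test-2602" and "test 2602" into test and 2602. "test_2602", however, is not split: the underscore does not act as a word boundary, so the whole value is indexed as the single token test_2602. Fuzziness is applied to each term separately, and neither test nor 2602 is within two edits of test_2602, so that document never matches.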

You can use the analyze API to check the generated tokens:

GET /_analyze
{
  "analyzer": "standard",
  "text": "test_2602"
}

The generated tokens will be:

{
  "tokens": [
    {
      "token": "test_2602",
      "start_offset": 0,
      "end_offset": 9,
      "type": "<ALPHANUM>",
      "position": 0
    }
  ]
}
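
To compare the three names side by side, here is a rough Python sketch. It does not call Elasticsearch: the regular expression and the function name approx_standard_tokens are assumptions made for illustration, and they only imitate the standard analyzer for simple ASCII input such as these names.

```python
import re

# Rough stand-in for the standard tokenizer on plain ASCII input (an assumption,
# not the real implementation): letters, digits and underscores stay together,
# and any other character (space, hyphen, ...) ends the current token.
TOKEN_RE = re.compile(r"[A-Za-z0-9_]+")


def approx_standard_tokens(text):
    """Split text into word-like tokens and lowercase them."""
    return [tok.lower() for tok in TOKEN_RE.findall(text)]


for name in ("test_2602", "test-2602", "test 2602"):
    print(name, "->", approx_standard_tokens(name))
# test_2602 -> ['test_2602']
# test-2602 -> ['test', '2602']
# test 2602 -> ['test', '2602']
```

For anything beyond a quick illustration, the analyze API above remains the reliable way to see the real tokens.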

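You need to search the name.keyword sub-field instead of name (note the ".keyword" after the field name). A keyword field is not run through the standard analyzer; the whole value is indexed as a single term, so fuzziness is applied to the complete string. Try the mapping and query below.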

Index mapping:

{
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      }
    }
  }
}

Search query:

{
  "query": {
    "match": {
      "name.keyword": {
        "query": "test_2602",
        "fuzziness": 2
      }
    }
  }
}

Search result:

"hits": [
  {
    "_index": "66572330",
    "_type": "_doc",
    "_id": "1",
    "_score": 0.9808291,
    "_source": {
      "name": "test_2602"
    }
  },
  {
    "_index": "66572330",
    "_type": "_doc",
    "_id": "3",
    "_score": 0.8718481,
    "_source": {
      "name": "test 2602"
    }
  },
  {
    "_index": "66572330",
    "_type": "_doc",
    "_id": "2",
    "_score": 0.8718481,
    "_source": {
      "name": "test-2602"
    }
  }
]
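
Why all three documents come back can be checked with a small edit-distance sketch in Python. It uses plain Levenshtein distance; the levenshtein helper is written here for illustration and is not part of any Elasticsearch client. By default Elasticsearch also counts a swap of two adjacent characters as a single edit, which makes no difference for these values.

```python
def levenshtein(a, b):
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca with cb
        prev = curr
    return prev[-1]


# Whole values, as stored in the name.keyword sub-field.
print(levenshtein("test_2602", "test-2602"))  # 1
print(levenshtein("test_2602", "test 2602"))  # 1

# Single terms, as produced for the analyzed name field.
print(levenshtein("test", "test_2602"))  # 5
print(levenshtein("2602", "test_2602"))  # 5
```

Compared as whole values, the three names are one edit apart, so a fuzziness of 2 on name.keyword covers all of them. On the analyzed name field, the query terms test and 2602 are each five edits away from the indexed term test_2602, which is beyond the maximum fuzziness of 2.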
