Elasticsearch: span_near and substring matching



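I am new to Elasticsearch. I want to implement span_near-style functionality that, after exact phrase matches and exact word-sequence matches, also handles substring matches.

For example:

The documents in my index: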

  1. men's cream
  2. men's wrinkle cream
  3. men's advanced wrinkle cream
  4. women's cream
  5. women's wrinkle cream
  6. women's advanced wrinkle cream

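If I search for "men's cream", I want the results in the same order as shown above. Expected search results:

  1. men's cream --> exact phrase match
  2. men's wrinkle cream --> search term sequence with slop 1
  3. men's advanced wrinkle cream --> search term sequence with slop 2
  4. women's cream --> substring near to exact phrase match
  5. women's wrinkle cream --> substring match of the search term sequence with slop 1
  6. women's advanced wrinkle cream --> substring match of the search term sequence with slop 2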

I can get the first 3 results with a span_near query that has nested span_term clauses, slop = 2 and in_order = true.
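
For reference, a minimal sketch of the kind of span_near query described here. It assumes the field is named genre (as in the mapping shown further down) and that the indexed terms are men's and cream; both are assumptions for illustration, not the exact query that was run.

```
# Sketch only: the field name and terms are assumed
POST /bluray/movies/_search
{
  "query": {
    "span_near": {
      "clauses": [
        { "span_term": { "genre": "men's" } },
        { "span_term": { "genre": "cream" } }
      ],
      "slop": 2,
      "in_order": true
    }
  }
}
```

With in_order set to true and two clauses, slop is roughly the number of positions allowed between the two terms, which is why slop = 2 covers documents 1 to 3.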
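I cannot achieve it for the remaining results 4 to 6, because span_near with nested span_term clauses does not support wildcards, in this example "men's cream" or "men's cream". Is there any way to achieve this with Elasticsearch?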
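
One option that may be worth trying is the span_multi query, which wraps a multi-term query (wildcard, prefix, fuzzy, range or regexp) so that it can be used as a span clause. The sketch below is hedged: the pattern *men's is a hypothetical wildcard chosen to match both men's and women's, the field name genre is assumed, and leading wildcards can be slow on larger indices.

```
# Sketch only: "*men's" is a hypothetical pattern, not taken from the question
POST /bluray/movies/_search
{
  "query": {
    "span_near": {
      "clauses": [
        {
          "span_multi": {
            "match": {
              "wildcard": { "genre": { "value": "*men's" } }
            }
          }
        },
        { "span_term": { "genre": "cream" } }
      ],
      "slop": 2,
      "in_order": true
    }
  }
}
```

On its own this would not rank the men's documents ahead of the women's ones; it would probably need to be combined with the exact span_term version inside a bool query with should clauses so that exact matches score higher.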

Update
My index:

{
  "bluray": {
    "settings": {
      "index": {
        "uuid": "4jofvNfuQdqbhfaF2ibyhQ",
        "number_of_replicas": "1",
        "number_of_shards": "5",
        "version": {
          "created": "1000199"
        }
      }
    }
  }
}

Mapping:

{
  "bluray": {
    "mappings": {
      "movies": {
        "properties": {
          "genre": {
            "type": "string"
          }
        }
      }
    }
  }
}

I am running the following query:

POST /bluray/movies/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "span_near": {
            "clauses": [
              {
                "span_term": {
                  "genre": "women"
                }
              },
              {
                "span_term": {
                  "genre": "cream"
                }
              }
            ],
            "collect_payloads": false,
            "slop": 12,
            "in_order": true
          }
        },
        {
          "custom_boost_factor": {
            "query": {
              "match_phrase": {
                "genre": "women cream"
              }
            },
            "boost_factor": 4.1
          }
        },
        {
          "match": {
            "genre": {
              "query": "women cream",
              "analyzer": "standard",
              "minimum_should_match": "99%"
            }
          }
        }
      ]
    }
  }
}

It gives me the following results:

{
   "took": 3,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 6,
      "max_score": 0.011612939,
      "hits": [
         {
            "_index": "bluray",
            "_type": "movies",
            "_id": "u9aNkZAoR86uAiW9SX8szQ",
            "_score": 0.011612939,
            "_source": {
               "genre": "men's cream"
            }
         },
         {
            "_index": "bluray",
            "_type": "movies",
            "_id": "cpTyKrL6TWuJkXvliibVBQ",
            "_score": 0.009290351,
            "_source": {
               "genre": "men's wrinkle cream"
            }
         },
         {
            "_index": "bluray",
            "_type": "movies",
            "_id": "rn_SFvD4QBO6TJQJNuOh5A",
            "_score": 0.009290351,
            "_source": {
               "genre": "men's advanced wrinkle cream"
            }
         },
         {
            "_index": "bluray",
            "_type": "movies",
            "_id": "9a31_bRpR2WfWh_4fgsi_g",
            "_score": 0.004618556,
            "_source": {
               "genre": "women's cream"
            }
         },
         {
            "_index": "bluray",
            "_type": "movies",
            "_id": "q-DoBBl2RsON_qwLRSoh9Q",
            "_score": 0.0036948444,
            "_source": {
               "genre": "women's advanced wrinkle cream"
            }
         },
         {
            "_index": "bluray",
            "_type": "movies",
            "_id": "TxzCP8B_Q8epXtIcfgEw3Q",
            "_score": 0.0036948444,
            "_source": {
               "genre": "women's wrinkle cream"
            }
         }
      ]
   }
}

This is not correct at all. Why does it return the men's products first when I searched for women's?
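
One way to investigate is to check which tokens the analyzer actually produces for the indexed text, using the analyze API. The request below is a sketch that assumes the genre field uses the default standard analyzer, since the mapping does not name one.

```
# Sketch only: assumes the standard analyzer is the one applied to genre
GET /bluray/_analyze?analyzer=standard&text=women%27s%20cream
```

If the response shows women's as a single token rather than women, then the span_term and match_phrase clauses on women have nothing to match, and the ordering would come only from the match clause on cream.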

Note: searching for "men's cream" still returns better results, but they do not follow the order of the search terms.

POST /bluray/movies/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "span_near": {
            "clauses": [
              {
                "span_term": {
                  "genre": "women's"
                }
              },
              {
                "span_term": {
                  "genre": "cream"
                }
              }
            ],
            "collect_payloads": false,
            "slop": 12,
            "in_order": true
          }
        },{
          "match": {
            "genre": {
              "query": "women's cream",
              "analyzer": "standard",
              "minimum_should_match": "99%"
            }
          }
        }
      ]
    }
  }
}

It gives the following output, as you expected:

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 6,
    "max_score": 0.7841132,
    "hits": [
      {
        "_index": "bluray",
        "_type": "movies",
        "_id": "4",
        "_score": 0.7841132,
        "_source": {
          "genre": "women's cream"
        }
      },
      {
        "_index": "bluray",
        "_type": "movies",
        "_id": "5",
        "_score": 0.56961054,
        "_source": {
          "genre": "women's wrinkle cream"
        }
      },
      {
        "_index": "bluray",
        "_type": "movies",
        "_id": "6",
        "_score": 0.35892165,
        "_source": {
          "genre": "women's advanced wrinkle cream"
        }
      },
      {
        "_index": "bluray",
        "_type": "movies",
        "_id": "3",
        "_score": 0.2876821,
        "_source": {
          "genre": "men's advanced wrinkle cream"
        }
      },
      {
        "_index": "bluray",
        "_type": "movies",
        "_id": "1",
        "_score": 0.25811607,
        "_source": {
          "genre": "men's cream"
        }
      },
      {
        "_index": "bluray",
        "_type": "movies",
        "_id": "2",
        "_score": 0.11750762,
        "_source": {
          "genre": "men's wrinkle cream"
        }
      }
    ]
  }
}
