How to match exact document data in Elasticsearch using a DSL query



My tokenizer:

"tokenizer": {
  "my_tokenizer": {
    "type": "edge_ngram",
    "min_gram": 1,
    "max_gram": 10,
    "token_chars": [
      "letter",
      "digit"
    ]
  }
}

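I am trying to search for values on these fields, and I want the search to be prefix-based. For example, if I search with the token s, I should get the items that match or start with s. If I then search with sp, I want only the items that start with sp, and everything else should be discarded. That is not what I get. I cannot tell whether my query is wrong or whether I am using the wrong filter. Can anyone help me with this?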

{
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "PRODUCT",
            "fields": [
              "item",
              "data1"
            ]
          }
        },
        {
          "multi_match": {
            "query": "SUB_FAMILY",
            "fields": [
              "item",
              "data1"
            ]
          }
        },
        {
          "match": {
            "values": "SP"
          }
        }
      ]
    }
  }
}

The output of this query is:

"hits": [
  {
    "_index": "logs_datas",
    "_type": "_doc",
    "_id": "H1PfEnkBQXpKNrJSp8bV",
    "_score": 9.418445,
    "_source": {
      "message": "PRODUCT,SUB_FAMILY,SPRINHO2H",
      "path": "/home/elasticsearchDatas.csv",
      "hierarchy_name": "PRODUCT",
      "@version": "1",
      "@timestamp": "2021-04-27T10:28:37.578Z",
      "host": "ewiglp71",
      "item_pk": "SPRINHO2H",
      "attribute_name": "SUB_FAMILY"
    }
  },
  {
    "_index": "logs_datas",
    "_type": "_doc",
    "_id": "y1PfEnkBQXpKNrJSp8XQ",
    "_score": 5.3059187,
    "_source": {
      "message": "PRODUCT,SUB_FAMILY,SCMLPLWVI",
      "path": "/home/niteshb/elasticsearchDatas.csv",
      "hierarchy_name": "PRODUCT",
      "@version": "1",
      "@timestamp": "2021-04-27T10:28:37.577Z",
      "host": "ewiglp71",
      "item_pk": "SCMLPLWVI",
      "attribute_name": "SUB_FAMILY"
    }
  },
  {
    "_index": "logs_datas",
    "_type": "_doc",
    "_id": "zFPfEnkBQXpKNrJSp8XQ",
    "_score": 5.3059187,
    "_source": {
      "message": "PRODUCT,SUB_FAMILY,SSVRKEN2Z",
      "path": "/home/elasticsearchDatas.csv",
      "hierarchy_name": "PRODUCT",
      "@version": "1",
      "@timestamp": "2021-04-27T10:28:37.579Z",
      "host": "ewiglp71",
      "item_pk": "SSVRKEN2Z",
      "attribute_name": "SUB_FAMILY"
    }
  }
]

Because min_gram is 1, the tokens generated for SCMLPLWVI will be:

{
  "tokens": [
    {
      "token": "S",
      "start_offset": 0,
      "end_offset": 1,
      "type": "word",
      "position": 0
    },
    {
      "token": "SC",
      "start_offset": 0,
      "end_offset": 2,
      "type": "word",
      "position": 1
    },
    {
      "token": "SCM",
      "start_offset": 0,
      "end_offset": 3,
      "type": "word",
      "position": 2
    },
    {
      "token": "SCML",
      "start_offset": 0,
      "end_offset": 4,
      "type": "word",
      "position": 3
    },
    {
      "token": "SCMLP",
      "start_offset": 0,
      "end_offset": 5,
      "type": "word",
      "position": 4
    },
    {
      "token": "SCMLPL",
      "start_offset": 0,
      "end_offset": 6,
      "type": "word",
      "position": 5
    },
    {
      "token": "SCMLPLW",
      "start_offset": 0,
      "end_offset": 7,
      "type": "word",
      "position": 6
    },
    {
      "token": "SCMLPLWV",
      "start_offset": 0,
      "end_offset": 8,
      "type": "word",
      "position": 7
    },
    {
      "token": "SCMLPLWVI",
      "start_offset": 0,
      "end_offset": 9,
      "type": "word",
      "position": 8
    }
  ]
}
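
To check this against your own index, the _analyze API returns the tokens that an analyzer produces for a given text. A minimal sketch, assuming the custom analyzer built on my_tokenizer is called my_analyzer (the question does not show its name) and that it is defined on the logs_datas index seen in the hits above:

```
# Assumed analyzer name: my_analyzer (not shown in the question)
GET /logs_datas/_analyze
{
  "analyzer": "my_analyzer",
  "text": "SCMLPLWVI"
}
```

Running the same request with "text": "SP" shows which tokens the search text is broken into when this analyzer is also applied at search time.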

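The one-character token S is why this document matches: if the same analyzer also runs at search time (the default when no search_analyzer is configured), the query SP is split into S and SP, and S matches every value that starts with S. If you want to get only the values that start with sp, you need to change your tokenizer to:

"tokenizer": {
  "my_tokenizer": {
    "type": "edge_ngram",
    "min_gram": 2,          // note this
    "max_gram": 10,
    "token_chars": [
      "letter",
      "digit"
    ]
  }
}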
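
Note that raising min_gram only removes the one-character token. If the same analyzer is used at search time, a longer query such as spr would still be split into sp and spr, and sp alone would match every value that starts with sp. A commonly used alternative is to keep edge_ngram for indexing and set a separate search_analyzer, so that the query text is not split into n-grams. The following is only a sketch under assumptions: the index name my_index, the analyzer name autocomplete, the lowercase filter and the choice of the standard analyzer at search time are illustrative and do not come from the original post, and the typeless mapping assumes Elasticsearch 7 or later.

```
# Hypothetical index name (my_index) and analyzer name (autocomplete)
PUT /my_index
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "my_tokenizer": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 10,
          "token_chars": ["letter", "digit"]
        }
      },
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "my_tokenizer",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "item_pk": {
        "type": "text",
        "analyzer": "autocomplete",
        "search_analyzer": "standard"
      }
    }
  }
}
```

With a mapping like this, a match query for sp or spr on item_pk is analyzed into a single token and compared against the stored edge n-grams, so min_gram can stay at 1. Because this creates a new index, the data would have to be indexed into it again, and search text longer than max_gram (10 characters here) would find no matching token.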

Update 1:

You can use match_bool_prefix to search for words that start with s or sp.

Adding a working example.

Index mapping:

{
  "mappings": {
    "properties": {
      "item_pk": {
        "type": "text"
      }
    }
  }
}

Search query 1:

{
  "query": {
    "match_bool_prefix": {
      "item_pk": "s"
    }
  }
}

The search result will be:

"hits": [
  {
    "_index": "67281810",
    "_type": "_doc",
    "_id": "1",
    "_score": 1.0,
    "_source": {
      "message": "PRODUCT,SUB_FAMILY,SPRINHO2H",
      "path": "/home/niteshb/elasticsearchDatas.csv",
      "hierarchy_name": "PRODUCT",
      "@version": "1",
      "@timestamp": "2021-04-27T10:28:37.578Z",
      "host": "ewiglp71",
      "item_pk": "SPRINHO2H",
      "attribute_name": "SUB_FAMILY"
    }
  },
  {
    "_index": "67281810",
    "_type": "_doc",
    "_id": "i7quE3kB6jKCA-nFYii6",
    "_score": 1.0,
    "_source": {
      "message": "PRODUCT,SUB_FAMILY,SCMLPLWVI",
      "path": "/home/niteshb/elasticsearchDatas.csv",
      "hierarchy_name": "PRODUCT",
      "@version": "1",
      "@timestamp": "2021-04-27T10:28:37.577Z",
      "host": "ewiglp71",
      "item_pk": "SCMLPLWVI",
      "attribute_name": "SUB_FAMILY"
    }
  },
  {
    "_index": "67281810",
    "_type": "_doc",
    "_id": "jLquE3kB6jKCA-nFgiju",
    "_score": 1.0,
    "_source": {
      "message": "PRODUCT,SUB_FAMILY,SSVRKEN2Z",
      "path": "/home/niteshb/elasticsearchDatas.csv",
      "hierarchy_name": "PRODUCT",
      "@version": "1",
      "@timestamp": "2021-04-27T10:28:37.579Z",
      "host": "ewiglp71",
      "item_pk": "SSVRKEN2Z",
      "attribute_name": "SUB_FAMILY"
    }
  }
]

Search query 2:

{
  "query": {
    "match_bool_prefix": {
      "item_pk": "sp"
    }
  }
}

Search result:

"hits": [
  {
    "_index": "67281810",
    "_type": "_doc",
    "_id": "1",
    "_score": 1.0,
    "_source": {
      "message": "PRODUCT,SUB_FAMILY,SPRINHO2H",
      "path": "/home/niteshb/elasticsearchDatas.csv",
      "hierarchy_name": "PRODUCT",
      "@version": "1",
      "@timestamp": "2021-04-27T10:28:37.578Z",
      "host": "ewiglp71",
      "item_pk": "SPRINHO2H",
      "attribute_name": "SUB_FAMILY"
    }
  }
]
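
As documented, match_bool_prefix analyzes its input and turns the last term into a prefix query. With a one-word input such as sp, the request should therefore behave much like the explicit prefix query below. This is a sketch for comparison only and is not part of the original answer. The value is written in lower case because a prefix query does not analyze its input, while the standard analyzer lowercases item_pk at index time.

```
# Explicit prefix form, for comparison; index name 67281810 is taken from the hits above
GET /67281810/_search
{
  "query": {
    "prefix": {
      "item_pk": {
        "value": "sp"
      }
    }
  }
}
```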

Update 2:

Try this query:

{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "hierarchy_name": "PRODUCT"
          }
        },
        {
          "match": {
            "attribute_name": "SUB_FAMILY"
          }
        },
        {
          "match_bool_prefix": {
            "item_pk": "sp"
          }
        }
      ]
    }
  }
}
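
If hierarchy_name and attribute_name must match exactly rather than through full-text analysis, one possible variant moves them into filter context as term queries. This is a sketch under an assumption: it relies on .keyword sub-fields existing for both fields, which the default dynamic mapping creates for string values but which the post does not confirm. The index name logs_datas is taken from the hits in the question.

```
# Assumes hierarchy_name.keyword and attribute_name.keyword exist (default dynamic mapping)
GET /logs_datas/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "hierarchy_name.keyword": "PRODUCT" } },
        { "term": { "attribute_name.keyword": "SUB_FAMILY" } }
      ],
      "must": [
        { "match_bool_prefix": { "item_pk": "sp" } }
      ]
    }
  }
}
```

Clauses in filter do not contribute to the score, so only the prefix match on item_pk influences ranking.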
