Filtering Elasticsearch data when a field contains ~

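I have a bunch of documents like the one below. I want to filter the data where projectKey starts with ~. I did read some articles saying that ~ is an operator in Elasticsearch queries, so it cannot be used for filtering. Can someone help me form the search query for the /branch/_search API?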

{
"_index": "branch",
"_type": "_doc",
"_id": "GAz-inQBJWWbwa_v-l9e",
"_version": 1,
"_score": null,
"_source": {
"branchID": "refs/heads/feature/12345",
"displayID": "feature/12345",
"date": "2020-09-14T05:03:20.137Z",
"projectKey": "~user",
"repoKey": "deploy",
"isDefaultBranch": false,
"eventStatus": "CREATED",
"user": "user"
},
"fields": {
"date": [
"2020-09-14T05:03:20.137Z"
]
},
"highlight": {
"projectKey": [
"~@kibana-highlighted-field@user@/kibana-highlighted-field@"
],
"projectKey.keyword": [
"@kibana-highlighted-field@~user@/kibana-highlighted-field@"
],
"user": [
"@kibana-highlighted-field@user@/kibana-highlighted-field@"
]
},
"sort": [
1600059800137
]
}

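***UPDATE***

I used the prefix query from prerana's answer below in my query.

I still have a problem when I use prefix together with range. I get the error below. What am I missing?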

GET /branch/_search
{
  "query": {
    "prefix": {
      "projectKey": "~"
    },
    "range": {
      "date": {
        "gte": "2020-09-14",
        "lte": "2020-09-14"
      }
    }
  }
}

{
"error": {
"root_cause": [
{
"type": "parsing_exception",
"reason": "[prefix] malformed query, expected [END_OBJECT] but found [FIELD_NAME]",
"line": 6,
"col": 5
}
],
"type": "parsing_exception",
"reason": "[prefix] malformed query, expected [END_OBJECT] but found [FIELD_NAME]",
"line": 6,
"col": 5
},
"status": 400
}

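If I understand your question correctly, I suggest creating a custom analyzer to search for the special character ~.

I tested this locally as follows, replacing ~ with __SPECIAL__:

I created an index with a custom char_filter and added a sub-field to the projectKey field. The new multi-field is named special_characters.

Here is the mapping: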

PUT wildcard-index
{
  "settings": {
    "analysis": {
      "char_filter": {
        "special-characters-replacement": {
          "type": "mapping",
          "mappings": [
            "~ => __SPECIAL__"
          ]
        }
      },
      "analyzer": {
        "special-characters-analyzer": {
          "tokenizer": "standard",
          "char_filter": [
            "special-characters-replacement"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "projectKey": {
        "type": "text",
        "fields": {
          "special_characters": {
            "type": "text",
            "analyzer": "special-characters-analyzer"
          }
        }
      }
    }
  }
}

Then I ingested the following content into the index:

"projectKey": "content1 ~"

"projectKey": "This ~ is a content"

"projectKey": "~ cars on the road"

"projectKey": "o~ngram"

Then, the query is:

GET wildcard-index/_search
{
  "query": {
    "match": {
      "projectKey.special_characters": "~"
    }
  }
}

The response is:

"hits" : [
{
"_index" : "wildcard-index",
"_type" : "_doc",
"_id" : "h1hKmHQBowpsxTkFD9IR",
"_score" : 0.43250346,
"_source" : {
"projectKey" : "content1 ~"
}
},
{
"_index" : "wildcard-index",
"_type" : "_doc",
"_id" : "iFhKmHQBowpsxTkFFNL5",
"_score" : 0.3034693,
"_source" : {
"projectKey" : "This ~ is a content"
}
},
{
"_index" : "wildcard-index",
"_type" : "_doc",
"_id" : "-lhKmHQBowpsxTkFG9Kg",
"_score" : 0.3034693,
"_source" : {
"projectKey" : "~ cars on the road"
}
}
]

Please let me know if you have any questions; I'll be glad to help.

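Note: this approach only works when the ~ is separated from the neighbouring text by a space. As you can see from the response, the fourth document ("o~ngram") was not returned.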
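To see why the fourth document is missed, the analysis chain can be imitated in a few lines of Python. This is only a rough sketch: `char_filter`, `analyze` and `matches` are hypothetical helpers, and the regular expression merely approximates the standard tokenizer (which follows Unicode word-break rules) by treating runs of letters, digits and underscores as one token.

```python
import re

def char_filter(text):
    # Stand-in for the "special-characters-replacement" mapping char_filter.
    return text.replace("~", "__SPECIAL__")

def analyze(text):
    # Approximation of the standard tokenizer: letters, digits and
    # underscores stick together, everything else separates tokens.
    return re.findall(r"\w+", char_filter(text))

def matches(doc, query):
    # A match query analyzes the query text as well, then looks for
    # at least one token shared with the document.
    doc_tokens = set(analyze(doc))
    return any(token in doc_tokens for token in analyze(query))

docs = ["content1 ~", "This ~ is a content", "~ cars on the road", "o~ngram"]
for doc in docs:
    print(repr(doc), analyze(doc), matches(doc, "~"))
```

Under these assumptions "o~ngram" becomes the single token `o__SPECIAL__ngram`, which is not equal to the query token `__SPECIAL__`, so the document is not matched.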

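While @hansley's answer works, it requires you to create a custom analyzer. Also, as you mentioned, you only want the documents that start with ~, but his results include every document that contains ~. So I am providing my answer, which needs very little configuration and works as required.

The index mapping is left at the default: just index the documents below and ES will create a default mapping with a .keyword sub-field for every text field.

Index sample documents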

{
"title" : "content1 ~"
}
{
"title" : "~ staring with"
}
{
"title" : "in between ~ with"
}

The search query below should, as expected, return only the second of the sample documents

{
  "query": {
    "prefix": { "title.keyword": "~" }
  }
}

And the search result

"hits": [
{
"_index": "pre",
"_type": "_doc",
"_id": "2",
"_score": 1.0,
"_source": {
"title": "~ staring with"
}
}
]

For more information, see the prefix query documentation.
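The reason the query targets `title.keyword` rather than `title` can be illustrated with a small Python sketch. It assumes the default standard analyzer, approximated here by a regular expression that keeps lowercase word tokens and drops punctuation such as ~; `text_terms`, `prefix_on_text` and `prefix_on_keyword` are hypothetical helpers, not Elasticsearch APIs.

```python
import re

titles = ["content1 ~", "~ staring with", "in between ~ with"]

def text_terms(value):
    # Approximation of the default standard analyzer: lowercase word
    # tokens only, so "~" never ends up inside an indexed term.
    return [token.lower() for token in re.findall(r"\w+", value)]

def prefix_on_text(value, prefix):
    # A prefix query is compared against the individual indexed terms.
    return any(term.startswith(prefix) for term in text_terms(value))

def prefix_on_keyword(value, prefix):
    # The keyword sub-field stores the whole value as a single term.
    return value.startswith(prefix)

keyword_hits = [t for t in titles if prefix_on_keyword(t, "~")]
text_hits = [t for t in titles if prefix_on_text(t, "~")]
print(keyword_hits)
print(text_hits)
```

In this sketch only the keyword comparison finds "~ staring with"; the analyzed text field has no term beginning with ~, which is consistent with the highlight in the question where the `projectKey` term is just `user`.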

Update 1:

Index mapping:

{
  "mappings": {
    "properties": {
      "date": {
        "type": "date"
      }
    }
  }
}

Index data:

{
"date": "2015-02-01",
"title" : "in between ~ with"
}
{
"date": "2015-01-01",
"title": "content1 ~"
}
{
"date": "2015-02-01",
"title" : "~ staring with"
}
{
"date": "2015-02-01",
"title" : "~ in between with"
}

Search query:

{
  "query": {
    "bool": {
      "must": [
        {
          "prefix": {
            "title.keyword": "~"
          }
        },
        {
          "range": {
            "date": {
              "lte": "2015-02-05",
              "gte": "2015-01-11"
            }
          }
        }
      ]
    }
  }
}

Search result:

"hits": [
{
"_index": "stof_63924930",
"_type": "_doc",
"_id": "2",
"_score": 2.0,
"_source": {
"date": "2015-02-01",
"title": "~ staring with"
}
},
{
"_index": "stof_63924930",
"_type": "_doc",
"_id": "4",
"_score": 2.0,
"_source": {
"date": "2015-02-01",
"title": "~ in between with"
}
}
]
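The parsing exception in the question appears to come from placing `prefix` and `range` side by side directly under `query`, which accepts only one top-level clause; wrapping both in a `bool` query, as in this update, avoids that. The sketch below builds the same shape of request body for the fields from the question. `prefix_and_range` is a hypothetical helper, and using `projectKey.keyword` assumes the default keyword sub-field exists, as the highlight section of the question suggests.

```python
import json

def prefix_and_range(prefix_field, prefix, date_field, gte, lte):
    """Build a search body that requires both a prefix match and a date range."""
    return {
        "query": {
            # Exactly one top-level clause: the bool query wraps the others.
            "bool": {
                "must": [
                    {"prefix": {prefix_field: prefix}},
                    {"range": {date_field: {"gte": gte, "lte": lte}}},
                ]
            }
        }
    }

body = prefix_and_range("projectKey.keyword", "~", "date", "2020-09-14", "2020-09-14")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of GET /branch/_search.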
