Elasticsearch: find all documents in which every key of a flattened field has the same value

Suppose I have two kinds of documents in Elasticsearch, where "map" is of type flattened:

doc1: {
  "name": "foo1",
  "map": {
    "key1": 100,
    "key2": 100
  }
}

doc2: {
  "name": "foo2",
  "map": {
    "key1": 100,
    "key2": 90
  }
}

Can I search Elasticsearch for all documents whose "map" field has the same value (e.g. 100) for all of its keys (key1=100, key2=100), so that it returns doc1, without knowing in advance which keys exist under "map"?

Thanks!

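Yes. There are actually two ways to achieve this:

1. Add a flag field to the documents via an ingest pipeline, then run a regular filter on that new field (recommended)
2. Generate the flag field on the fly via a runtime field

Option 1 is the recommended approach, because iterating over every document on every query does not scale well. Computing the flag field once, at index time, is far more efficient. Given your two documents: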

POST test_script/_doc
{
  "name": "foo1",
  "map": {
    "key1": 100,
    "key2": 100
  }
}
POST test_script/_doc
{
  "name": "foo2",
  "map": {
    "key1": 100,
    "key2": 90
  }
}

1. Add a flag field to the documents via an ingest pipeline (recommended)

Create the ingest pipeline:

PUT _ingest/pipeline/is_100_field
{
  "processors": [
    {
      "script": {
        "source": "def keys_100 = 0;\ndef keys = ctx['map'].keySet();\n\nfor (key in keys) {\n    if (ctx['map'][key] == 100) {\n        keys_100 = keys_100 + 1;\n    }\n}\n\nctx.is_100 = keys.size() == keys_100;",
        "ignore_failure": true
      }
    }
  ]
}
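To sanity-check what the pipeline script computes, the same logic can be mirrored outside Elasticsearch. The following is a minimal Python sketch, not part of the original answer; the names `all_values_equal_target` and `apply_flag` are hypothetical. It also surfaces two edge cases: an empty "map" yields true (zero keys, zero matches), and a document without "map" makes the script fail, which, with `ignore_failure` set to true, is assumed here to leave the document indexed without an `is_100` field.

```python
from typing import Any, Dict

TARGET = 100  # the same constant the Painless script compares against


def all_values_equal_target(mapping: Dict[str, Any], target: int = TARGET) -> bool:
    """Mirror of the script: count keys whose value equals target, compare to key count."""
    matching = sum(1 for value in mapping.values() if value == target)
    return matching == len(mapping)


def apply_flag(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of doc with is_100 set, imitating the ingest processor."""
    result = dict(doc)
    try:
        result["is_100"] = all_values_equal_target(doc["map"])
    except (KeyError, AttributeError):
        # Assumption: imitates ignore_failure=true, so the document passes through unflagged.
        pass
    return result


doc1 = {"name": "foo1", "map": {"key1": 100, "key2": 100}}
doc2 = {"name": "foo2", "map": {"key1": 100, "key2": 90}}

flagged = [apply_flag(doc1), apply_flag(doc2)]
print([d["is_100"] for d in flagged])  # prints [True, False]
```

If an empty "map" should not count as a match, the script would need an extra check that the key set is non-empty.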

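You can now either reindex your existing data through this ingest pipeline, or apply it to each document as it is indexed:

Reindex:

POST your_index/_update_by_query?pipeline=is_100_field

Ingest:

POST your_index/_doc?pipeline=is_100_field

This produces the following documents: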

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "test_script",
        "_id": "78_AvoQB5Gw0WET88nZE",
        "_score": 1,
        "_source": {
          "name": "foo1",
          "map": {
            "key1": 100,
            "key2": 100
          },
          "is_100": true
        }
      },
      {
        "_index": "test_script",
        "_id": "8s_AvoQB5Gw0WET8-HYO",
        "_score": 1,
        "_source": {
          "name": "foo2",
          "map": {
            "key1": 100,
            "key2": 90
          },
          "is_100": false
        }
      }
    ]
  }
}

Now you can run a regular filter, which is the most efficient way to query:

GET test_script/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "is_100": true
          }
        }
      ]
    }
  }
}

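2. Generate the flag field on the fly via a runtime field

The script is the same, but now the field is computed at query time instead of being stored at ingest time. We can define this field either in the index mapping or in the query itself:

Mapping: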

PUT test_script_runtime/
{
  "mappings": {
    "runtime": {
      "is_100": {
        "type": "boolean",
        "script": {
          "source": """
            def keys_100 = 0;
            def keys = params._source['map'].keySet();

            for (key in keys) {
              if (params._source['map'][key] == 100) {
                keys_100 = keys_100 + 1;
              }
            }

            emit(keys.size() == keys_100);
          """
        }
      }
    },
    "properties": {
      "map": {"type": "object"},
      "name": {"type": "text"}
    }
  }
}

Query:

GET test_script/_search
{
  "runtime_mappings": {
    "is_100": {
      "type": "boolean",
      "script": {
        "source": """
          def keys_100 = 0;
          def keys = params._source['map'].keySet();

          for (key in keys) {
            if (params._source['map'][key] == 100) {
              keys_100 = keys_100 + 1;
            }
          }

          emit(keys.size() == keys_100);
        """
      }
    }
  },
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "is_100": true
          }
        }
      ]
    }
  }
}
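Because the mapping request and the query request embed the same script, it can help to keep the script in one place when the requests are built from application code. The following is a minimal Python sketch under that assumption: the bodies are assembled as plain dictionaries and would be sent with whatever HTTP client is in use. The helper names (`runtime_field`, `mapping_body`, `search_body`) are hypothetical, and no Elasticsearch client library is involved.

```python
import json
from typing import Any, Dict

# Painless source shared by both request bodies (copied from the answer above).
IS_100_SCRIPT = """
def keys_100 = 0;
def keys = params._source['map'].keySet();

for (key in keys) {
  if (params._source['map'][key] == 100) {
    keys_100 = keys_100 + 1;
  }
}

emit(keys.size() == keys_100);
"""


def runtime_field() -> Dict[str, Any]:
    """Definition of the is_100 runtime field, used by both bodies."""
    return {"is_100": {"type": "boolean", "script": {"source": IS_100_SCRIPT}}}


def mapping_body() -> Dict[str, Any]:
    """Body for creating an index whose mapping declares the runtime field."""
    return {"mappings": {"runtime": runtime_field()}}


def search_body() -> Dict[str, Any]:
    """Body for a search that declares the runtime field inline and filters on it."""
    return {
        "runtime_mappings": runtime_field(),
        "query": {"bool": {"filter": [{"term": {"is_100": True}}]}},
    }


# Serialize to standard JSON; the script becomes an ordinary escaped string.
print(json.dumps(search_body(), indent=2))
```

Serializing with `json.dumps` produces an ordinary JSON string for the script, so the body does not rely on the triple-quote syntax that the console accepts.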

If you later decide to index the runtime field, that is easy to do: https://www.elastic.co/guide/en/elasticsearch/reference/current/runtime-indexed.html
