ElasticSearch: Is it possible to query using a regex on field names?



I have indexed data into ElasticSearch using the following index settings:

KNN_INDEX = {
    "settings": {
        "index.knn": True,
        "index.knn.space_type": "cosinesimil",
        "index.mapping.total_fields.limit": 10000,
        "analysis": {
            "analyzer": {
                "default": {
                    "type": "standard",
                    "stopwords": "_english_"
                }
            }
        }
    },
    "mappings": {
        "dynamic_templates": [
            {
                "sentence_vector_template": {
                    "match": "sent_vec*",
                    "mapping": {
                        "type": "knn_vector",
                        "dimension": 384,
                        "store": True
                    }
                }
            },
            {
                "sentence_template": {
                    "match": "sentence*",
                    "mapping": {
                        "type": "text",
                        "store": True
                    }
                }
            }
        ],
        "properties": {
            "metadata": {
                "type": "object"
            }
        }
    }
}
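
As a rough local illustration of how the two dynamic templates above would split field names, the sketch below approximates their wildcard matching with Python's fnmatch. This is only a stand-in for what the server does, not its actual implementation. The function name template_for and the field name sent_vec_0 are hypothetical, and the sketch assumes the usual rule that dynamic templates are tried in order and the first match wins.

```python
from fnmatch import fnmatchcase

# (template name, match pattern, mapped type), in the same order as in KNN_INDEX
TEMPLATES = [
    ("sentence_vector_template", "sent_vec*", "knn_vector"),
    ("sentence_template", "sentence*", "text"),
]


def template_for(field_name):
    """Return (template name, mapped type) for the first matching pattern, else None."""
    for name, pattern, mapped_type in TEMPLATES:
        if fnmatchcase(field_name, pattern):
            return name, mapped_type
    return None


if __name__ == "__main__":
    # sent_vec_0 is a hypothetical vector field name used only for illustration
    for field in ["sentence_0", "sentence_4", "sent_vec_0", "metadata"]:
        print(field, "->", template_for(field))
```

Under these assumptions, a pattern such as sentence* picks up only the text fields and leaves the vector fields alone.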

Here are a few sample documents that I am indexing into ElasticSearch:

{
    # DOC 1
    "sentence_0": "Machine learning for aquatic plastic litter detection, classification and quantification (APLASTIC-Q)Large quantities of mismanaged plastic waste are polluting and threatening the health of the blue planet.",
    "sentence_1": "As such, vast amounts of this plastic waste found in the oceans originates from land.",
    "sentence_2": "It finds its way to the open ocean through rivers, waterways and estuarine systems."
},
{
    # DOC 2
    "sentence_0": "What predicts persistent early conduct problems?",
    "sentence_1": "Evidence from the Growing Up in Scotland cohortBackground There is a strong case for early identification of factors predicting life-course-persistent conduct disorder.",
    "sentence_2": "The authors aimed to identify factors associated with repeated parental reports of preschool conduct problems.",
    "sentence_3": "Method Nested caseecontrol study of Scottish children who had behavioural data reported by parents at 3, 4 and 5 years.",
    "sentence_4": "Results 79 children had abnormal conduct scores at all three time points ('persistent conduct problems') and 434 at one or two points ('inconsistent conduct problems')."
}

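Each indexed document can have a different number of sentences. When querying, I want to search across all sentences in all documents. I am able to search a specific "sentence number" across all documents using the following query: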

query_body = {
    "query": {
        "match": {
            "sentence_0": "persistent"
        }
    }
}
result = client.search(index=INDEX_NAME, body=query_body)
print(result)

But what I am looking for is something like the following:

query_body = {
    "query": {
        "match": {
            "sentence_*": "persistent"
        }
    }
}
result = client.search(index=INDEX_NAME, body=query_body)
print(result)

However, the above query does not work. Is it possible to perform a search like this? Thanks.

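Use query_string. Its fields parameter accepts wildcard patterns in field names (simple * wildcards rather than full regular expressions), which is enough here: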

{
    "query": {
        "query_string": {
            "fields": ["sentence*"],
            "query": "persistent"
        }
    }
}
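
For completeness, here is a minimal sketch of how this could be wired into the Python code from the question. The helper names build_query_string_body and build_multi_match_body are hypothetical, and client and INDEX_NAME are assumed to be the objects already used in the question. The second helper uses multi_match, which also accepts wildcards in its fields list. It may be the safer choice when the search text comes from users, because query_string parses its input as query syntax, so characters such as ":" or "/" can change the meaning of the query or cause a parse error.

```python
import json


def build_query_string_body(text, field_pattern="sentence*"):
    """Build a query_string body that targets every field matching field_pattern."""
    return {
        "query": {
            "query_string": {
                "fields": [field_pattern],
                "query": text,
            }
        }
    }


def build_multi_match_body(text, field_pattern="sentence*"):
    """Same idea with multi_match, which does not parse text as query syntax."""
    return {
        "query": {
            "multi_match": {
                "fields": [field_pattern],
                "query": text,
            }
        }
    }


if __name__ == "__main__":
    body = build_query_string_body("persistent")
    print(json.dumps(body, indent=2))
    # With the client from the question (assumed to be connected already):
    # result = client.search(index=INDEX_NAME, body=body)
    # print(result)
```

Either body can be passed to client.search in the same way as query_body in the question.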
