I have a JSON structure like the following:
{"DocumentName":"es","DocumentId":"2","Content": [{"PageNo":1,"Text": "The full text queries enable you to search analyzed text fields such as the body of an email. The query string is processed using the same analyzer that was applied to the field during indexing."},{"PageNo":2,"Text": "The query string is processed using the same analyzer that was applied to the field during indexing."}]}
I need stemmed analysis of the Content.Text field. To get that, I defined a mapping when creating the index, as shown below:
curl -X PUT "localhost:9200/myindex?pretty" -H "Content-Type: application/json" -d '
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "my_stemmer"]
        }
      },
      "filter": {
        "my_stemmer": {
          "type": "stemmer",
          "name": "english"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "DocumentName": {
        "type": "text"
      },
      "DocumentId": {
        "type": "keyword"
      },
      "Content": {
        "properties": {
          "PageNo": {
            "type": "integer"
          },
          "Text": {
            "type": "text",
            "analyzer": "my_analyzer",
            "search_analyzer": "my_analyzer"
          }
        }
      }
    }
  }
}'
I checked the analyzer that was created:
curl -X GET "localhost:9200/myindex/_analyze?pretty" -H "Content-Type: application/json" -d '{"analyzer":"my_analyzer","text":"indexing"}'
It returns:
{
"tokens" : [
{
"token" : "index",
"start_offset" : 0,
"end_offset" : 8,
"type" : "<ALPHANUM>",
"position" : 0
}
]
}
But after uploading the JSON to the index, when I search for "index", it returns 0 results.
res = requests.get('http://localhost:9200')
es = Elasticsearch([{'host': 'localhost', 'port': '9200'}])
res= es.search(index='myindex', body={"query": {"match": {"Content.Text": "index"}}})
Any help would be greatly appreciated. Thanks in advance.
Ignore my comment. The stemmer is working. Try the following:
Mapping:
curl -X DELETE "localhost:9200/myindex"
curl -X PUT "localhost:9200/myindex?pretty" -H "Content-Type: application/json" -d'
{
"settings":{
"analysis":{
"analyzer":{
"english_exact":{
"tokenizer":"standard",
"filter":[
"lowercase"
]
}
}
}
},
"mappings":{
"properties":{
"DocumentName":{
"type":"text"
},
"DocumentId":{
"type":"keyword"
},
"Content":{
"properties":{
"PageNo":{
"type":"integer"
},
"Text":{
"type":"text",
"analyzer":"english",
"fields":{
"exact":{
"type":"text",
"analyzer":"english_exact"
}
}
}
}
}
}
}
}'
Data:
curl -XPOST "localhost:9200/myindex/_doc/1" -H "Content-Type: application/json" -d'
{
"DocumentName":"es",
"DocumentId":"2",
"Content":[
{
"PageNo":1,
"Text":"The full text queries enable you to search analyzed text fields such as the body of an email. The query string is processed using the same analyzer that was applied to the field during indexing."
},
{
"PageNo":2,
"Text":"The query string is processed using the same analyzer that was applied to the field during indexing."
}
]
}'
Query:
curl -XGET 'localhost:9200/myindex/_search?pretty' -H "Content-Type: application/json" -d '
{
"query":{
"simple_query_string":{
"fields":[
"Content.Text"
],
"query":"index"
}
}
}'
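As expected, exactly one document is returned. I also tested the following terms, and they all stem correctly with the proposed mapping: apply (applied), texts.
Python example: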
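The mapping above also defines an unstemmed subfield, Content.Text.exact, which the query does not use. As a rough sketch of how the two fields could be targeted from Python, the helper below builds a match query body for either one. The name build_match_query is hypothetical and not part of any library; it only constructs a dictionary and does not talk to Elasticsearch.

```python
def build_match_query(term, exact=False):
    """Return a search body matching `term` against the stemmed or exact field.

    build_match_query is a hypothetical helper, not part of any library.
    """
    # Content.Text uses the built-in "english" analyzer (stemmed);
    # Content.Text.exact uses "english_exact" (standard tokenizer + lowercase only).
    field = "Content.Text.exact" if exact else "Content.Text"
    return {"query": {"match": {field: term}}}


stemmed_body = build_match_query("index")
exact_body = build_match_query("indexing", exact=True)
print(stemmed_body)
print(exact_body)
```

Either body can be passed as body= to es.search. With the sample document, "index" should match on Content.Text because both sides are stemmed, while the exact subfield should only match the unstemmed token "indexing".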
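import requests
from elasticsearch import Elasticsearch

# Quick connectivity check against the local node.
res = requests.get('http://localhost:9200')
es = Elasticsearch([{'host': 'localhost', 'port': 9200}])
# "index" is stemmed by the english analyzer, so it matches "indexing" in Content.Text.
res = es.search(index='myindex', body={"query": {"match": {"Content.Text": "index"}}})
print(res)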
Tested on Elasticsearch 7.4.
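To check the result in code rather than by reading the printed dictionary, something like the following could be used. It is a minimal sketch that assumes the 7.x response shape, where hits.total is an object with a "value" key. summarize_hits is a hypothetical helper, and sample_response is a hand-written stand-in trimmed to the keys the helper reads, not real output from the cluster.

```python
def summarize_hits(response):
    """Return (total, ids) from an Elasticsearch 7.x search response dict.

    summarize_hits is a hypothetical helper; it assumes the 7.x response
    shape, where hits.total is an object with a "value" key.
    """
    hits = response["hits"]
    total = hits["total"]["value"]
    ids = [hit["_id"] for hit in hits["hits"]]
    return total, ids


# Hand-written stand-in for a real response, trimmed to the keys used above.
sample_response = {
    "hits": {
        "total": {"value": 1, "relation": "eq"},
        "hits": [{"_index": "myindex", "_id": "1", "_source": {"DocumentId": "2"}}],
    }
}
total, ids = summarize_hits(sample_response)
print(total, ids)
```

In practice the res returned by es.search in the Python example would be passed in place of sample_response.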