Analyzer in Elasticsearch that ignores accents and plural/singular forms



I am trying to make my search queries ignore accents and plural/singular forms. I copied the Spanish analyzer from here and kept only the stemmer: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-lang-analyzer.html

Here is my code in Python (I bulk-load the data from a CSV afterwards):

settings = {
    "settings": {
        "analysis": {
            "filter": {
                "spanish_stemmer": {
                    "type": "stemmer",
                    "language": "light_spanish"
                }
            },
            "analyzer": {
                "rebuilt_spanish": {
                    "tokenizer": "standard",
                    "filter": [
                        "lowercase",
                        "spanish_stemmer"
                    ]
                }
            }
        }
    }
}

es.indices.create(index="activities", body=settings)

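However, when I run a GET query from Insomnia (with terms such as geometrico, geométrico, geométricos, or geometricos), I get 0 results, even though there is a document titled "Cuerpos geométricos". It should match, because I don't want accents or plural/singular forms to make a difference. Any ideas?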

The GET query I am running:

{
  "query": {
    "function_score": {
      "query": {
        "multi_match": {
          "query": "geométricos",
          "fields": [
            "Descripcion",
            "Nombre",
            "Tags"
          ],
          "analyzer": "rebuilt_spanish"
        }
      }
    }
  }
}

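You need to add the ASCII folding token filter to your analyzer's token filters; see the official documentation for that filter. Your analyzer should then look like this: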

Analyzer:

"analysis": {
  "filter": {
    "spanish_stemmer": {
      "type": "stemmer",
      "language": "light_spanish"
    }
  },
  "analyzer": {
    "rebuilt_spanish": {
      "tokenizer": "standard",
      "filter": [
        "asciifolding", // ASCII folding token filter
        "lowercase",
        "spanish_stemmer"
      ]
    }
  }
}
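
For reference, here is a minimal sketch of how the index body could be built in Python, in the same style as the question's code. An analyzer named in a multi_match query is only applied to the query text; what gets indexed is analyzed according to the field mapping. The sketch therefore also maps the three queried fields to rebuilt_spanish. That mapping is an assumption on my part (the question does not show one), as are the helper name build_index_body and the typeless "mappings" layout, which assumes Elasticsearch 7 or later.

```python
import json


def build_index_body(text_fields):
    """Build an index-creation body (hypothetical helper, not part of any library).

    The analyzer folds accents, lowercases, and applies the light Spanish
    stemmer; every field in `text_fields` is mapped to use it at index time.
    """
    analysis = {
        "filter": {
            "spanish_stemmer": {
                "type": "stemmer",
                "language": "light_spanish",
            }
        },
        "analyzer": {
            "rebuilt_spanish": {
                "tokenizer": "standard",
                "filter": ["asciifolding", "lowercase", "spanish_stemmer"],
            }
        },
    }
    # Assumption: these are plain text fields and should share one analyzer.
    properties = {
        name: {"type": "text", "analyzer": "rebuilt_spanish"}
        for name in text_fields
    }
    return {
        "settings": {"analysis": analysis},
        "mappings": {"properties": properties},
    }


body = build_index_body(["Descripcion", "Nombre", "Tags"])
print(json.dumps(body, indent=2))

# With the `es` client from the question and a reachable cluster, the index
# could then be created with:
# es.indices.create(index="activities", body=body)
```

Analysis settings and field analyzers cannot simply be swapped on an index that already holds data, so you would most likely need to delete and recreate the index and load the CSV again. To see which tokens the analyzer produces for a given term, the _analyze API (es.indices.analyze in the Python client) can be run against the new index.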
