Synonym rules fail when using two files



I have two synonym files with a few thousand lines each. Here is a minimal example that reproduces the problem:

en_synonyms file:

cereal, semolina, wheat

fr_synonyms file:

ble, cereale, wheat

This is the error I get:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "failed to build synonyms"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "failed to build synonyms",
    "caused_by": {
      "type": "parse_exception",
      "reason": "Invalid synonym rule at line 1",
      "caused_by": {
        "type": "illegal_argument_exception",
        "reason": "term: wheat analyzed to a token (cereal) with position increment != 1 (got: 0)"
      }
    }
  },
  "status": 400
}

The mapping I use:

PUT wheat_syn
{
  "mappings": {
    "wheat": {
      "properties": {
        "description": {
          "type": "text",
          "fields": {
            "synonyms": {
              "type": "text",
              "analyzer": "syn_text"
            },
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
      }
    }
  },
  "settings": {
    "number_of_shards": 1,
    "analysis": {
      "filter": {
        "en_synonyms": {
          "type": "synonym",
          "tokenizer": "keyword",
          "synonyms_path": "analysis/en_synonyms.txt"
        },
        "fr_synonyms": {
          "type": "synonym",
          "tokenizer": "keyword",
          "synonyms_path": "analysis/fr_synonyms.txt"
        }
      },
      "analyzer": {
        "syn_text": {
          "tokenizer": "standard",
          "filter": ["lowercase", "en_synonyms", "fr_synonyms"]
        }
      }
    }
  }
}

Both files contain the term wheat. When I remove it from one of the files, the index is created successfully.

I thought about merging the two files, so that the result would be:

cereal, semolina, wheat, ble, cereale

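But in my case I can't do that by hand because it would take far too long (I will look for a way to do it programmatically, depending on the answer to this question).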
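A programmatic merge along these lines might look like the following Python sketch. It is only an illustration, not something from the original setup: the function name merge_overlapping_groups is made up, and it assumes every rule is a plain comma-separated group (no "=>" mappings) with "#" marking comment lines.

```python
def merge_overlapping_groups(lines):
    """Merge comma-separated synonym groups that share at least one term.

    Assumes plain "a, b, c" rules only; explicit "=>" mappings are not handled.
    """
    groups = []  # pairwise-disjoint groups, each a list in first-seen order
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        terms = [t.strip() for t in line.split(",") if t.strip()]
        new = set(terms)
        # Existing groups that share a term with this rule get folded into it.
        hits = [g for g in groups if not new.isdisjoint(g)]
        groups = [g for g in groups if new.isdisjoint(g)]
        # Keep earlier terms first and drop repeats while preserving order.
        combined = list(dict.fromkeys([t for g in hits for t in g] + terms))
        groups.append(combined)
    return [", ".join(g) for g in groups]
```

Fed the two example rules above, this returns the single merged rule cereal, semolina, wheat, ble, cereale. Each new rule is compared against every existing group, so the cost grows quadratically with the number of rules, which should still be acceptable for a few thousand lines.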

I found a simple solution:

Instead of using two files, I concatenated the contents of en_synonyms and fr_synonyms into a single file, all_synonyms:

cereal, semolina, wheat
ble, cereale, wheat

Then I used that file in the mapping.
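For files with thousands of lines, the concatenation can be scripted. The Python sketch below is one possible way to do it; the helper names concat_rules and concat_files are hypothetical, and dropping blank lines is an assumption on my part rather than a requirement.

```python
from pathlib import Path


def concat_rules(texts):
    """Join the rule lines of several synonym files, dropping blank lines."""
    rules = []
    for text in texts:
        rules.extend(line.strip() for line in text.splitlines() if line.strip())
    return "\n".join(rules) + "\n"


def concat_files(sources, target):
    """Read each path in `sources` and write the combined rules to `target`."""
    texts = [Path(source).read_text(encoding="utf-8") for source in sources]
    Path(target).write_text(concat_rules(texts), encoding="utf-8")
```

Calling concat_files(["analysis/en_synonyms.txt", "analysis/fr_synonyms.txt"], "analysis/all_synonyms.txt") from the Elasticsearch config directory would produce the combined file. The all_synonyms.txt file name is assumed here to match the naming of the other two files.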
