ElasticSearch: "Failed to build [request] after last required field arrived"



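I am using ElasticSearch to run a ranking evaluation on the LinkSO dataset. I have created the files needed to set up the index through Kibana, and I have written my ranking function. However, I get this error: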

BadRequestError: BadRequestError(400, 'x_content_parse_exception', 'Failed to build [request] after last required field arrived')

This is my code for getting the NDCG scores:

import json

# `es` is assumed to be an already-connected Elasticsearch client.
def get_ndcg(dataframe, input_index="java_lm"):
    ndcg_list = []
    # step through the rows 30 at a time, taking one qid1 per step
    for i in range(0, len(dataframe["qid1"]), 30):
        qid1 = dataframe["qid1"][i]
        qid1_title = es.get(index=input_index, id=qid1)['_source']['title']

        # load ratings from the json file (closed automatically by `with`)
        with open("qids/" + input_index + "/" + str(qid1) + ".json") as f:
            data = json.load(f)
        _search = ranking(qid1, qid1_title, ratings=data)

        result = es.rank_eval(index=input_index, body=_search)

        ndcg = result['metric_score']
        ndcg_list.append(ndcg)

    return ndcg_list

The error is raised by the es.rank_eval() call.

This is my ranking function:

def ranking(qid1, qid1_title, ratings):
    _search = {
        "requests": [
            {
                "id": str(qid1),
                "request": {
                    "query": {
                        "bool": {
                            "must_not": {
                                "match": {
                                    "_id": qid1
                                }
                            },
                            "should": [
                                {
                                    "match": {
                                        "title": {
                                            "query": qid1_title,
                                            "boost": 3.0,
                                            "analyzer": "my_analyzer"
                                        }
                                    }
                                },
                                {
                                    "match": {
                                        "body": {
                                            "query": qid1_title,
                                            "boost": 0.5,
                                            "analyzer": "my_analyzer"
                                        }
                                    }
                                },
                                {
                                    "match": {
                                        "answer": {
                                            "query": qid1_title,
                                            "boost": 0.5,
                                            "analyzer": "my_analyzer"
                                        }
                                    }
                                }
                            ]
                        }
                    }
                },
                "ratings": ratings
            }
        ],
        "metric": {
            "dcg": {
                "k": 10,
                "normalize": True
            }
        }
    }
    return _search

The ratings passed into _search come from a JSON file in this format:

[
    {"_index": "java_lm", "_id": "15194804", "rating": 0},
    {"_index": "java_lm", "_id": "18264178", "rating": 0},
    {"_index": "java_lm", "_id": "16225177", "rating": 1},
    {"_index": "java_lm", "_id": "16445238", "rating": 0},
    {"_index": "java_lm", "_id": "17233226", "rating": 0}
]

The PUT command I ran in Kibana to create the index is:

PUT /java_lm
{
    "settings": {
        "similarity": {
            "LM": {
                "type": "LMDirichlet",
                "mu": 2000
            }
        },
        "analysis": {
            "analyzer": {
                "my_analyzer": {
                    "tokenizer": "whitespace",
                    "filter": [
                        "lowercase",
                        "porter_stem"
                    ]
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "title": {
                "type": "text",
                "analyzer": "my_analyzer",
                "similarity": "LM"
            },
            "body": {
                "type": "text",
                "analyzer": "my_analyzer",
                "similarity": "LM"
            },
            "answer": {
                "type": "text",
                "analyzer": "my_analyzer",
                "similarity": "LM"
            }
        }
    }
}

I can't seem to find what is going wrong. Can anyone comment on how to fix it?

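The error occurs when the JSON file holding the document ratings is created incorrectly. Check each ratings file and make sure every entry has exactly the properties that are required.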
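As a rough way to do that check, the sketch below validates a ratings list before it is sent to es.rank_eval(). The helper name check_ratings is hypothetical, not part of any library. It assumes that each entry should carry exactly `_index`, `_id`, and an integer `rating`, as in the format shown in the question, and that extra keys are rejected. It also flags a document rated twice in the same file, which is one likely way a file can look fine per entry and still be rejected; treat that as an assumption to verify against your own data.

```python
import json

# Keys each rated document is assumed to need, based on the question's format.
REQUIRED_KEYS = {"_index", "_id", "rating"}


def check_ratings(ratings):
    """Return a list of human-readable problems found in a ratings list.

    Hypothetical helper: an empty list means no problem was detected.
    """
    if not isinstance(ratings, list):
        return ["ratings must be a JSON array of objects"]

    problems = []
    seen = set()
    for pos, doc in enumerate(ratings):
        if not isinstance(doc, dict):
            problems.append(f"entry {pos}: not a JSON object")
            continue

        missing = REQUIRED_KEYS - doc.keys()
        extra = doc.keys() - REQUIRED_KEYS
        if missing:
            problems.append(f"entry {pos}: missing {sorted(missing)}")
        if extra:
            problems.append(f"entry {pos}: unexpected {sorted(extra)}")

        if "rating" in doc:
            rating = doc["rating"]
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(rating, bool) or not isinstance(rating, int):
                problems.append(f"entry {pos}: rating is not an integer")

        # Only compare keys of entries that actually have both identifiers.
        if "_index" in doc and "_id" in doc:
            key = (doc["_index"], str(doc["_id"]))
            if key in seen:
                problems.append(f"entry {pos}: duplicate rating for {key}")
            seen.add(key)
    return problems


if __name__ == "__main__":
    sample = json.loads(
        '[{"_index": "java_lm", "_id": "15194804", "rating": 0},'
        ' {"_index": "java_lm", "_id": "15194804", "rating": 1}]'
    )
    for problem in check_ratings(sample):
        print(problem)
```

Inside get_ndcg, this could be called on `data` right after json.load(f), printing the qid1 and the reported problems and skipping the rank_eval call for any file that fails the check.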
