Validation Failed: 1: no requests added in bulk indexing



I have a JSON file that I need to index on an Elasticsearch server.

The JSON file looks like this:

{
    "sku": "1",
    "vbid": "1",
    "created": "Sun, 05 Oct 2014 03:35:58 +0000",
    "updated": "Sun, 06 Mar 2016 12:44:48 +0000",
    "type": "Single",
    "downloadable-duration": "perpetual",
    "online-duration": "365 days",
    "book-format": "ePub",
    "build-status": "In Inventory",
    "description": "On 7 August 1914, a week before the Battle of Tannenburg and two weeks before the Battle of the Marne, the French army attacked the Germans at Mulhouse in Alsace. Their objective was to recapture territory which had been lost after the Franco-Prussian War of 1870-71, which made it a matter of pride for the French. However, after initial success in capturing Mulhouse, the Germans were able to reinforce more quickly, and drove them back within three days. After forty-three years of peace, this was the first test of strength between France and Germany. In 1929 Karl Deuringer wrote the official history of the battle for the Bavarian Army, an immensely detailed work of 890 pages; First World War expert and former army officer Terence Zuber has translated this study and edited it down to more accessible length, to produce the first account in English of the first major battle of the First World War.",
    "publication-date": "07/2014",
    "author": "Deuringer, Karl",
    "title": "The First Battle of the First World War: Alsace-Lorraine",
    "sort-title": "First Battle of the First World War: Alsace-Lorraine",
    "edition": "0",
    "sampleable": "false",
    "page-count": "0",
    "print-drm-text": "This title will only allow printing of 2 consecutive pages at a time.",
    "copy-drm-text": "This title will only allow copying of 2 consecutive pages at a time.",
    "kind": "book",
    "fro": "false",
    "distributable": "true",
    "subjects": {
      "subject": [
        {
          "-schema": "bisac",
          "-code": "HIS027090",
          "#text": "World War I"
        },
        {
          "-schema": "coursesmart",
          "-code": "cs.soc_sci.hist.milit_hist",
          "#text": "Social Sciences -> History -> Military History"
        }
      ]
    },   
   "pricelist": {
      "publisher-list-price": "0.0",
      "digital-list-price": "7.28"
    },
    "publisher": {
      "publisher-name": "The History Press",
      "imprint-name": "The History Press Ireland"
    },
    "aliases": {
      "eisbn-canonical": "1",
      "isbn-canonical": "1",
      "print-isbn-canonical": "9780752460864",
      "isbn13": "1",
      "isbn10": "0750951796",
      "additional-isbns": {
        "isbn": [
          {
            "-type": "print-isbn-10",
            "#text": "0752460862"
          },
          {
            "-type": "print-isbn-13",
            "#text": "97807524608"
          }
        ]
      }
    },
    "owner": {
      "company": {
        "id": "1893",
        "name": "The History Press"
      }
    },
    "distributor": {
      "company": {
        "id": "3658",
        "name": "asc"
      }
    }
  }

But when I try to index this JSON file with the command

curl -XPOST 'http://localhost:9200/_bulk' -d @1.json

I get this error:

{"error":{"root_cause":[{"type":"action_request_validation_exception","reason":"Validation Failed: 1: no requests added;"}],"type":"action_request_validation_exception","reason":"Validation Failed: 1: no requests added;"},"status":400}

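I don't know where I am making a mistake.

Elasticsearch's bulk API uses a special syntax: the request body is made up of JSON documents, each written on a single line. Take a look at the documentation.

The syntax is pretty simple. For index, create and update you need two single-line JSON documents: the first line names the action, and the second supplies the document to index, create or update. To delete a document, only the action line is needed. For example (from the documentation):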

{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }
{ "field1" : "value1" }
{ "create" : { "_index" : "test", "_type" : "type1", "_id" : "3" } }
{ "field1" : "value3" }
{ "update" : {"_id" : "1", "_type" : "type1", "_index" : "index1"} }   
{ "doc" : {"field2" : "value2"} }
{ "delete" : { "_index" : "test", "_type" : "type1", "_id" : "2" } }

Don't forget to end the file with a newline. Then, to call the bulk API, use this command:

curl -s -XPOST localhost:9200/_bulk --data-binary "@requests"

From the documentation:

If you're providing text file input to curl, you must use the --data-binary flag instead of plain -d
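
To make the two-line format concrete, here is a minimal Python sketch that turns ordinary documents (such as the pretty-printed one in the question) into a bulk body. The function name `to_bulk_body`, the index name `books`, and the choice of `sku` as the document ID are assumptions made for illustration; `_type` is left out because mapping types are deprecated in newer Elasticsearch versions, so add it back if your cluster still needs it.

```python
import json


def to_bulk_body(docs, index, id_field=None):
    """Serialize documents into a bulk request body (one JSON document per line)."""
    lines = []
    for doc in docs:
        # Action line: tells Elasticsearch what to do with the next line.
        action = {"index": {"_index": index}}
        if id_field is not None:
            action["index"]["_id"] = doc[id_field]
        # json.dumps without indent emits a single line, which the bulk API requires.
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The body must end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical, shortened version of the document from the question.
book = {"sku": "1", "title": "The First Battle of the First World War: Alsace-Lorraine"}
body = to_bulk_body([book], index="books", id_field="sku")
print(body, end="")
```

The resulting string can be written to a file and posted with `--data-binary` as shown above.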

Adding a new line (Enter in Postman, or "\n" if you are building the JSON body in a client API) did the job for me.

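I had a similar issue: I wanted to delete specific documents of a specific type, and thanks to the answer above I finally got my simple bash script working!

I have a file with one document_id per line (document_id.txt), and with the bash script below I can delete the documents of a given type that have those document_ids.

The file looks like this: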

c476ce18803d7ed3708f6340fdfa34525b20ee90
5131a30a6316f221fe420d2d3c0017a76643bccd
08ebca52025ad1c81581a018febbe57b1e3ca3cd
496ff829c736aa311e2e749cec0df49b5a37f796
87c4101cb10d3404028f83af1ce470a58744b75c
37f0daf7be27cf081e491dd445558719e4dedba1

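The bash script looks like this:

#!/bin/bash
es_cluster="http://localhost:9200"
index="some-index"
doc_type="some-document-type"
# One document ID per line in document_id.txt
for doc_id in $(cat document_id.txt)
do
    # Inner quotes are escaped so the action line stays valid JSON
    request_string="{\"delete\" : { \"_type\" : \"${doc_type}\", \"_id\" : \"${doc_id}\" } }"
    # echo -e expands \r\n so the body ends with a newline; @- makes curl read stdin
    echo -e "${request_string}\r\n\r\n" | curl -s -XPOST "${es_cluster}/${index}/${doc_type}/_bulk" --data-binary @-
    echo
done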

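After a lot of frustration, the trick was to use echo with the -e option and append newline escape sequences to its output before piping it into curl.

Then, in curl, I set the --data-binary option to stop it from stripping the newlines that the _bulk endpoint needs, followed by the @- option to make it read from stdin!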
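
The same idea can be sketched in Python, building one bulk body for all IDs instead of sending one request per ID. This is an illustration under assumptions, not the script author's method: `build_delete_body` and `send_bulk` are hypothetical helper names, `_type` is omitted (mapping types are deprecated in newer Elasticsearch versions), and the request is sent with the `application/x-ndjson` content type using only the standard library.

```python
import json
import urllib.request


def build_delete_body(doc_ids, index):
    """Build a bulk body with one delete action line per document ID."""
    lines = [json.dumps({"delete": {"_index": index, "_id": doc_id}}) for doc_id in doc_ids]
    # A real newline after the last action line is required.
    return "\n".join(lines) + "\n"


def send_bulk(es_cluster, body):
    """POST the body to the _bulk endpoint and return the parsed JSON response."""
    request = urllib.request.Request(
        f"{es_cluster}/_bulk",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


ids = [
    "c476ce18803d7ed3708f6340fdfa34525b20ee90",
    "5131a30a6316f221fe420d2d3c0017a76643bccd",
]
body = build_delete_body(ids, index="some-index")
print(body, end="")
# With a cluster running, the body could be sent like this:
# send_bulk("http://localhost:9200", body)
```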

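In my case it was a weird mistake. I was creating the bulk request object and clearing it before inserting into Elasticsearch.

This is the line that caused the problem: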

bulkRequest.requests().clear();

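My problem was also a missing \n. If you print the string, the \n is interpreted as a line break, so in the output it looks as if the \n is missing. In case this helps anyone.

Pseudocode:

document = '{"index": {"_index": "users", "_id": "1"}} \n {"first_name": "Bob"}'
print(document)

This will print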

{"index": {"_index": "users", "_id": "1"}}
{"first_name": "Bob"}

But that is fine: as long as the string contains the \n delimiter, it should work.
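
The causes mentioned in this thread (pretty-printed JSON, a missing trailing newline, an escaped `\\n` instead of a real newline) can be caught before sending the request. The following is a rough sketch of such a check; `check_bulk_body` is a hypothetical helper, not part of any Elasticsearch client, and it only approximates what the server validates.

```python
import json

BULK_ACTIONS = {"index", "create", "update", "delete"}


def check_bulk_body(body):
    """Return a list of likely problems in a bulk request body (empty if none found)."""
    problems = []
    if not body.endswith("\n"):
        problems.append("body does not end with a newline")
    lines = [line for line in body.split("\n") if line.strip()]
    if not lines:
        problems.append("body is empty")
        return problems
    parsed = []
    for number, line in enumerate(lines, start=1):
        try:
            parsed.append(json.loads(line))
        except json.JSONDecodeError:
            # Typical for pretty-printed JSON or a literal backslash-n in the string.
            parsed.append(None)
            problems.append(f"line {number} is not a complete JSON document")
    first = parsed[0]
    if not (isinstance(first, dict) and len(first) == 1 and next(iter(first)) in BULK_ACTIONS):
        problems.append("first line is not an action line")
    return problems


good = '{"index": {"_index": "users", "_id": "1"}}\n{"first_name": "Bob"}\n'
escaped = '{"index": {"_index": "users", "_id": "1"}}\\n{"first_name": "Bob"}'
pretty = '{\n  "sku": "1"\n}\n'

for name, candidate in [("good", good), ("escaped", escaped), ("pretty", pretty)]:
    print(name, check_bulk_body(candidate))
```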
