How to import data into Elasticsearch by sending a file in a POST request from Java



I currently import data into Elasticsearch with the following command:

curl -XPOST 'localhost:9200/index/_bulk?pretty' --data-binary @required.json

ElasticSearch Bulk API

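Now I am creating a Java console application, and I need to send this file in a POST request. I am able to do this with SoapUI by attaching the file and setting the header

Content-Type: text/javascript

The request succeeds when run from SoapUI, and the data is loaded.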


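I have looked into how to make a POST request and have the following code. So my questions are:

  1. How do we attach a file to the request?
  2. If attaching a file is not possible, what other solutions are there?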

    URL url = new URL("http://localhost:9200/index/_bulk/");
    HttpURLConnection httpCon = (HttpURLConnection) url.openConnection();
    httpCon.setDoOutput(true);
    httpCon.setRequestMethod("POST");
    httpCon.setRequestProperty("Content-Type", "text/javascript"); // (or text/plain)
    httpCon.setRequestProperty("Accept", "application/json");
    OutputStreamWriter out = new OutputStreamWriter(httpCon.getOutputStream());
    out.write(...filedata...); // <------ How to Put file data in output stream ?
    out.flush();
    out.close();
    

How to write the file contents to the output stream

// Replaces: out.write(...filedata...); // <------ How to Put file data in output stream ?
// Needs imports: java.io.BufferedReader, java.nio.charset.Charset,
//                java.nio.file.Files, java.nio.file.Path, java.nio.file.Paths
// Create a path to your file
Path path = Paths.get("D:/TEMP/bulk.json"); // <------ your file name
// Open a BufferedReader to your file - change the Charset if necessary
BufferedReader fReader = Files.newBufferedReader(path, Charset.defaultCharset());
String data = null;
// Read each line, append "\n" to it as required by the Bulk API, and write it to the server
while ((data = fReader.readLine()) != null) {
    out.write(data + "\n"); // <------ Put each line in the output stream
    out.flush();            // <------ You may do this once, outside the loop
}
fReader.close();
out.close();

Note: There are several ways to write file contents to an output stream. I have shown just one; you can explore the others.
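As one such alternative, here is a minimal sketch that streams the file's raw bytes with `Files.copy(Path, OutputStream)` instead of reading it line by line, which is closer to what `curl --data-binary` does. The class name `BulkFileUploader` and its methods are hypothetical names of mine, not part of any library. It assumes the file is already in Bulk API format; it only appends a final newline if the file lacks one. As far as I know, newer Elasticsearch versions expect `application/x-ndjson` as the content type for bulk requests, so the content type is a parameter here.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, shown only as an illustration.
final class BulkFileUploader {

    // Returns true if the last byte of the file is '\n' (false for an empty file).
    static boolean endsWithNewline(Path file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long length = raf.length();
            if (length == 0) {
                return false;
            }
            raf.seek(length - 1);
            return raf.read() == '\n';
        }
    }

    // Copies the file's bytes unchanged, then adds the trailing newline
    // the Bulk API requires if the file does not already end with one.
    static void writeBulkBody(Path file, OutputStream out) throws IOException {
        Files.copy(file, out);
        if (!endsWithNewline(file)) {
            out.write('\n');
        }
    }

    // Posts the file to the given endpoint and returns the HTTP status code.
    static int upload(String endpoint, Path file, String contentType) throws IOException {
        HttpURLConnection con = (HttpURLConnection) new URL(endpoint).openConnection();
        con.setDoOutput(true);
        con.setRequestMethod("POST");
        con.setRequestProperty("Content-Type", contentType);
        con.setRequestProperty("Accept", "application/json");
        try (OutputStream out = con.getOutputStream()) {
            writeBulkBody(file, out);
        }
        // Asking for the response code waits for the server's reply.
        return con.getResponseCode();
    }

    public static void main(String[] args) throws IOException {
        // Assumed usage: java BulkFileUploader <bulk endpoint URL> <path to bulk file>
        int status = upload(args[0], Paths.get(args[1]), "application/x-ndjson");
        System.out.println("HTTP status: " + status);
    }
}
```

Because the bytes are copied as-is, no charset conversion takes place, so the file should already be encoded the way the server expects (normally UTF-8).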

bulk.json

{ "index" : { "_index" : "test", "_type" : "string", "_id" : "10" } }
{ "field1" : "value10" }
{ "index" : { "_index" : "test", "_type" : "string", "_id" : "20" } }
{ "field1" : "value20" }
{ "index" : { "_index" : "test", "_type" : "string", "_id" : "30" } }
{ "field1" : "value30" }
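If attaching a file is not an option (your second question), the same body can be built in memory and written to the output stream as a string. The file above alternates an action line with a source line, each terminated by a newline. The sketch below reproduces that layout; `BulkBodyBuilder` and `build` are hypothetical names, and the code assumes the ids and values contain no characters that need JSON escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, shown only as an illustration.
final class BulkBodyBuilder {

    // Emits one "index" action line and one source line per entry,
    // each ending in "\n" as the Bulk API requires.
    // No JSON escaping is performed on the inputs.
    static String build(String index, String type, Map<String, String> field1ById) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> entry : field1ById.entrySet()) {
            body.append("{ \"index\" : { \"_index\" : \"").append(index)
                .append("\", \"_type\" : \"").append(type)
                .append("\", \"_id\" : \"").append(entry.getKey())
                .append("\" } }\n");
            body.append("{ \"field1\" : \"").append(entry.getValue())
                .append("\" }\n");
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the documents in insertion order.
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("10", "value10");
        docs.put("20", "value20");
        docs.put("30", "value30");
        System.out.print(build("test", "string", docs));
    }
}
```

The resulting string can be passed to `out.write(...)` in place of the file data. For real documents, a JSON library would be a safer way to produce each line.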

Test

http://localhost:9200/test/string/10
{"_index":"test","_type":"string","_id":"10","_version":3,"found":true,"_source":{ "field1" : "value10" }}

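Also, since your client is written in Java, have you considered using the ElasticSearch Java Client API? I don't know it well, but it should support bulk requests. Take a look at the client.bulk(BulkRequest) method.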
