Elasticsearch returns an error when inserting a null value into the ingest attachment field



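I have installed the ingest attachment processor. I am using Java code to read file paths from one index, documents, and to index the file contents in another index, documents_attachment.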

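During this process, if the file is available, it is encoded to Base64, the encoded content is added to the JSON field fileContent, and the resulting fields are indexed in the other index, documents_attachment.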
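The encoding step described above can be sketched as follows. This is only a minimal illustration, not the code from the question: the class name FileEncoder and the method encodeIfPresent are hypothetical, and it assumes the whole file fits in memory.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;
import java.util.Optional;

// Hypothetical helper: Base64-encodes a file if it exists, otherwise returns an empty Optional.
final class FileEncoder {

    static Optional<String> encodeIfPresent(Path path) throws IOException {
        // Directories and missing paths are treated as "no content".
        if (!Files.isRegularFile(path)) {
            return Optional.empty();
        }
        // readAllBytes reads the complete file, unlike a single InputStream.read call.
        byte[] bytes = Files.readAllBytes(path);
        return Optional.of(Base64.getEncoder().encodeToString(bytes));
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("attachment", ".txt");
        Files.write(tmp, "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(encodeIfPresent(tmp).orElse("<missing>"));
        Files.delete(tmp);
        System.out.println(encodeIfPresent(tmp).orElse("<missing>"));
    }
}
```

Returning an Optional makes the "file not available" case explicit, so the caller has to decide what to put into fileContent instead of passing a null along silently.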

If the file is not available, I put null as the value of the JSON field fileContent and try to index the fields anyway. When I try to insert null into fileContent, I get the error shown below.

Here is the error.

ElasticsearchStatusException[Elasticsearch exception [type=exception, reason=java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: field [fileContent] is null, cannot parse.]]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=java.lang.IllegalArgumentException: field [fileContent] is null, cannot parse.]]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=field [fileContent] is null, cannot parse.]];
at org.elasticsearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:177)
at org.elasticsearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:573)
at org.elasticsearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:549)
at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:456)
at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:429)
at org.elasticsearch.client.RestHighLevelClient.index(RestHighLevelClient.java:312)
at com.es.utility.DocumentIndex.main(DocumentIndex.java:193)
Suppressed: org.elasticsearch.client.ResponseException: method [PUT], host [http://localhost:9200], URI [/document_attachment_dev/doc/129439?pipeline=document_attachment_dev&timeout=1m], status line [HTTP/1.1 500 Internal Server Error]
{"error":{"root_cause":[{"type":"exception","reason":"java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: field [fileContent] is null, cannot parse.","header":{"processor_type":"attachment"}}],"type":"exception","reason":"java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: field [fileContent] is null, cannot parse.","caused_by":{"type":"illegal_argument_exception","reason":"java.lang.IllegalArgumentException: field [fileContent] is null, cannot parse.","caused_by":{"type":"illegal_argument_exception","reason":"field [fileContent] is null, cannot parse."}},"header":{"processor_type":"attachment"}},"status":500}

Here is my Java code.

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintStream;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;
import java.util.logging.Logger; // assumed: the listing does not show which Logger class is used

import org.apache.http.HttpHost;
import org.elasticsearch.ElasticsearchException;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public class DocumentIndex {
    private static final String INDEX = "documents_local";
    private static final String ATTACHMENT = "document_attachment";
    private static final String TYPE = "doc";
    private static final Logger logger = Logger.getLogger(DocumentIndex.class.getName());

    public static void main(String[] args) throws IOException {
        Document doc = new Document(); // Document is my own POJO (not shown)
        logger.info("Started Indexing the Document.....");
        RestHighLevelClient restHighLevelClient = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http"),
                        new HttpHost("localhost", 9201, "http")));

        // Fetch id, file path and file name from the documents index.
        SearchRequest searchRequest = new SearchRequest(INDEX);
        searchRequest.types(TYPE);
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        QueryBuilder qb = QueryBuilders.matchAllQuery();
        searchSourceBuilder.query(qb);
        searchSourceBuilder.size(3000);
        searchRequest.source(searchSourceBuilder);
        // An IOException propagates from here; continuing with a null response would only cause a NullPointerException.
        SearchResponse searchResponse = restHighLevelClient.search(searchRequest);
        SearchHit[] searchHits = searchResponse.getHits().getHits();
        long totalHits = searchResponse.getHits().totalHits;
        logger.info("Total Hits --->" + totalHits);
        int line = 1;
        // Open the exception log once instead of truncating it on every iteration.
        try (PrintStream printStream = new PrintStream(new File("d:\\exception.txt"))) {
            for (SearchHit hit : searchHits) {
                String encodedfile = null;
                Map<String, Object> sourceAsMap = hit.getSourceAsMap();
                doc.setId((int) sourceAsMap.get("id"));
                doc.setApp_language(sourceAsMap.get("app_language").toString());
                // Note: this listing never shows path and filename being set on doc before they are read.
                String filepath = doc.getPath().concat(doc.getFilename());
                String entry = "Line Number--> " + line + " ID---> " + doc.getId() + " File Path --->" + filepath;
                logger.info(entry);
                appendLine("d:\\AllFilePath.txt", entry);
                File file = new File(filepath);
                if (file.exists() && !file.isDirectory()) {
                    appendLine("d:\\AvailableFile.txt", entry);
                    // readAllBytes reads the whole file and closes it afterwards.
                    encodedfile = Base64.getEncoder().encodeToString(Files.readAllBytes(file.toPath()));
                }
                Map<String, Object> jsonMap = new HashMap<>();
                jsonMap.put("id", doc.getId());
                jsonMap.put("app_language", doc.getApp_language());
                // null is inserted here when the file is not available and could not be encoded.
                jsonMap.put("fileContent", encodedfile);
                String id = Long.toString(doc.getId());
                IndexRequest request = new IndexRequest(ATTACHMENT, TYPE, id)
                        .source(jsonMap)
                        .setPipeline(ATTACHMENT);
                try {
                    restHighLevelClient.index(request);
                } catch (ElasticsearchException e) {
                    e.printStackTrace(printStream);
                }
                line++; // incremented once per hit
            }
        }
        logger.info("Indexing done.....");
    }

    // Appends a single line to the given log file.
    private static void appendLine(String path, String text) throws IOException {
        try (PrintWriter out = new PrintWriter(new FileOutputStream(path, true))) {
            out.println(text);
        }
    }
}

Here are my pipeline and mapping details.

PUT _ingest/pipeline/document_attachment
{
  "description" : "Extract attachment information",
  "processors" : [
    {
      "attachment" : {
        "field" : "fileContent"
      }
    }
  ]
}

PUT document_attachment
{
  "settings": {
    "analysis": {
      "analyzer": {
        "custom_analyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "char_filter": [
            "html_strip"
          ],
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        },
        "product_catalog_keywords_analyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "char_filter": [
            "html_strip"
          ],
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        }
      }
    }
  },
  "mappings" : {
    "doc" : {
      "properties" : {
        "attachment" : {
          "properties" : {
            "content" : {
              "type" : "text",
              "analyzer": "custom_analyzer"
            },
            "content_length" : {
              "type" : "long"
            },
            "content_type" : {
              "type" : "text"
            },
            "language" : {
              "type" : "text"
            }
          }
        },
        "fileContent" : {
          "type" : "text"
        },
        "id": {
          "type": "long"
        },
        "app_language" : {
          "type" : "text"
        }
      }
    }
  }
}

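I am now using the following pipeline configuration for the ingest attachment processor, and it works fine when the file content is not available (null). The on_failure handler writes the failure message into an error field instead of rejecting the document.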

PUT _ingest/pipeline/document_attachment
{
  "description" : "my first pipeline with handled exceptions",
  "processors" : [
    {
      "attachment" : {
        "field" : "fileContent",
        "on_failure" : [
          {
            "set" : {
              "field" : "error",
              "value" : "{{ _ingest.on_failure_message }}"
            }
          }
        ]
      }
    }
  ]
}
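
A possible client-side alternative is to leave fileContent out of the document when there is nothing to encode, and to apply the attachment pipeline only when the field is present. The sketch below is an assumption about how that could look, not part of the original code: the class AttachmentSourceBuilder and its methods are hypothetical names, and the Elasticsearch client calls are deliberately left out so the example runs on its own.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: builds the document source and reports whether the pipeline is needed.
final class AttachmentSourceBuilder {
    static final String CONTENT_FIELD = "fileContent";

    // Builds the source map; the content field is added only when content exists.
    static Map<String, Object> buildSource(long id, String appLanguage, String encodedContent) {
        Map<String, Object> source = new HashMap<>();
        source.put("id", id);
        source.put("app_language", appLanguage);
        if (encodedContent != null) {
            source.put(CONTENT_FIELD, encodedContent);
        }
        return source;
    }

    // True only when there is Base64 content for the attachment processor to parse.
    static boolean needsAttachmentPipeline(Map<String, Object> source) {
        return source.get(CONTENT_FIELD) != null;
    }

    public static void main(String[] args) {
        Map<String, Object> withFile = buildSource(1L, "en", "aGVsbG8=");
        Map<String, Object> withoutFile = buildSource(2L, "en", null);
        System.out.println(needsAttachmentPipeline(withFile));
        System.out.println(needsAttachmentPipeline(withoutFile));
    }
}
```

In the indexing loop, setPipeline(ATTACHMENT) would then be called only when needsAttachmentPipeline returns true, so documents without a file are indexed without ever reaching the attachment processor. Compared with the on_failure handler, no error field is written for those documents.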
