java - ElasticSearch returns an error when trying to insert null into the ingest attachment field
Problem description
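I have installed the ingest attachment processor. Using Java code, I read file paths from one index, documents, and index the file contents in another index, documents_attachment.
During this process, if the file is available, it is encoded to Base64 and the encoded content is added to the JSON field fileContent. These fields are then indexed in the other index, documents_attachment.
If the file is not available, I set the JSON field fileContent to null and still try to index the fields. When I try to insert null into the fileContent field, I get the error shown below.
Please find the error below.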
ElasticsearchStatusException[Elasticsearch exception [type=exception, reason=java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: field [fileContent] is null, cannot parse.]]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=java.lang.IllegalArgumentException: field [fileContent] is null, cannot parse.]]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=field [fileContent] is null, cannot parse.]];
at org.elasticsearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:177)
at org.elasticsearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:573)
at org.elasticsearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:549)
at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:456)
at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:429)
at org.elasticsearch.client.RestHighLevelClient.index(RestHighLevelClient.java:312)
at com.es.utility.DocumentIndex.main(DocumentIndex.java:193)
Suppressed: org.elasticsearch.client.ResponseException: method [PUT], host [http://localhost:9200], URI [/document_attachment_dev/doc/129439?pipeline=document_attachment_dev&timeout=1m], status line [HTTP/1.1 500 Internal Server Error]
{"error":{"root_cause":[{"type":"exception","reason":"java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: field [fileContent] is null, cannot parse.","header":{"processor_type":"attachment"}}],"type":"exception","reason":"java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: field [fileContent] is null, cannot parse.","caused_by":{"type":"illegal_argument_exception","reason":"java.lang.IllegalArgumentException: field [fileContent] is null, cannot parse.","caused_by":{"type":"illegal_argument_exception","reason":"field [fileContent] is null, cannot parse."}},"header":{"processor_type":"attachment"}},"status":500}
Please find my Java code below.
public class DocumentIndex {
    private final static String INDEX = "documents_local";
    private final static String ATTACHMENT = "document_attachment";
    private final static String TYPE = "doc";
    private static final Logger logger = Logger.getLogger(Thread.currentThread().getStackTrace()[0].getClassName());

    public static void main(String args[]) throws IOException {
        RestHighLevelClient restHighLevelClient = null;
        Document doc=new Document();
        logger.info("Started Indexing the Document.....");
        try {
            restHighLevelClient = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http"),
                    new HttpHost("localhost", 9201, "http")));
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }
        //Fetching Id, FilePath & FileName from Document Index.
        SearchRequest searchRequest = new SearchRequest(INDEX);
        searchRequest.types(TYPE);
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        QueryBuilder qb = QueryBuilders.matchAllQuery();
        searchSourceBuilder.query(qb);
        searchSourceBuilder.size(3000);
        searchRequest.source(searchSourceBuilder);
        SearchResponse searchResponse = null;
        try {
            searchResponse = restHighLevelClient.search(searchRequest);
        } catch (IOException e) {
            e.getLocalizedMessage();
        }
        SearchHit[] searchHits = searchResponse.getHits().getHits();
        long totalHits=searchResponse.getHits().totalHits;
        logger.info("Total Hits --->"+totalHits);
        int line=1;
        Map<String, Object> jsonMap ;
        for (SearchHit hit : searchHits) {
            String encodedfile = null;
            File file=null;
            Map<String, Object> sourceAsMap = hit.getSourceAsMap();
            doc.setId((int) sourceAsMap.get("id"));
            doc.setApp_language(sourceAsMap.get("app_language").toString());
            String filepath=doc.getPath().concat(doc.getFilename());
            logger.info("Line Number--> "+line+++"ID---> "+doc.getId()+"File Path --->"+filepath);
            try(PrintWriter out = new PrintWriter(new FileOutputStream(new File("d:\\AllFilePath.txt"), true)) ){
                out.println("Line Number--> "+line+"ID---> "+doc.getId()+"File Path --->"+filepath);
            }
            file = new File(filepath);
            if(file.exists() && !file.isDirectory()) {
                try {
                    try(PrintWriter out = new PrintWriter(new FileOutputStream(new File("d:\\AvailableFile.txt"), true)) ){
                        out.println("Line Number--> "+line+++"ID---> "+doc.getId()+"File Path --->"+filepath);
                    }
                    FileInputStream fileInputStreamReader = new FileInputStream(file);
                    byte[] bytes = new byte[(int) file.length()];
                    fileInputStreamReader.read(bytes);
                    encodedfile = new String(Base64.getEncoder().encodeToString(bytes));
                } catch (FileNotFoundException e) {
                    e.printStackTrace();
                }
            }
            jsonMap = new HashMap<>();
            jsonMap.put("id", doc.getId());
            jsonMap.put("app_language", doc.getApp_language());
            jsonMap.put("fileContent", encodedfile); // null is inserted here when the file is not available and could not be encoded.
            String id=Long.toString(doc.getId());
            IndexRequest request = new IndexRequest(ATTACHMENT, "doc", id )
                    .source(jsonMap)
                    .setPipeline(ATTACHMENT);
            PrintStream printStream = new PrintStream(new File("d:\\exception.txt"));
            try {
                IndexResponse response = restHighLevelClient.index(request);
            } catch(ElasticsearchException e) {
                if (e.status() == RestStatus.CONFLICT) {
                }
                e.printStackTrace(printStream);
            }
            line++;
        }
        logger.info("Indexing done.....");
    }
}
Please find my pipeline and mapping details below.
PUT _ingest/pipeline/document_attachment
{
  "description" : "Extract attachment information",
  "processors" : [
    {
      "attachment" : {
        "field" : "fileContent"
      }
    }
  ]
}
PUT document_attachment
{
  "settings": {
    "analysis": {
      "analyzer": {
        "custom_analyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "char_filter": [
            "html_strip"
          ],
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        },
        "product_catalog_keywords_analyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "char_filter": [
            "html_strip"
          ],
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        }
      }
    }
  },
  "mappings" : {
    "doc" : {
      "properties" : {
        "attachment" : {
          "properties" : {
            "content" : {
              "type" : "text",
              "analyzer": "custom_analyzer"
            },
            "content_length" : {
              "type" : "long"
            },
            "content_type" : {
              "type" : "text"
            },
            "language" : {
              "type" : "text"
            }
          }
        },
        "fileContent" : {
          "type" : "text"
        },
        "id": {
          "type": "long"
        },
        "app_language" : {
          "type" : "text"
        }
      }
    }
  }
}
Solution
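I am using the following pipeline configuration for the ingest attachment processor, and it works fine when the file content is not available (null). The on_failure handler on the attachment processor catches the parse failure and stores its message in an error field, so the document is still indexed instead of the request failing.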
PUT _ingest/pipeline/document_attachment
{
  "description" : "my first pipeline with handled exceptions",
  "processors" : [
    {
      "attachment" : {
        "field" : "fileContent",
        "on_failure" : [
          {
            "set" : {
              "field" : "error",
              "value" : "{{ _ingest.on_failure_message }}"
            }
          }
        ]
      }
    }
  ]
}
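The same problem can also be approached from the client side, by never sending a null fileContent in the first place. The following is only a minimal sketch under stated assumptions: AttachmentSource, encodeFile, buildSource and needsPipeline are hypothetical names that do not appear in the original DocumentIndex code, and the sketch uses only the JDK so it can be tried without an Elasticsearch cluster. It also reads the file with Files.readAllBytes, which avoids the partially filled buffer and the unclosed stream that a single FileInputStream.read call can leave behind.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of the original DocumentIndex class.
final class AttachmentSource {

    private AttachmentSource() {
    }

    // Returns the Base64-encoded content of the file, or an empty Optional
    // when the path is null or does not point to a regular file.
    static Optional<String> encodeFile(Path path) throws IOException {
        if (path == null || !Files.isRegularFile(path)) {
            return Optional.empty();
        }
        byte[] bytes = Files.readAllBytes(path); // reads the whole file and closes it
        return Optional.of(Base64.getEncoder().encodeToString(bytes));
    }

    // Builds the document source. "fileContent" is added only when content
    // exists, so a null value is never handed to the attachment processor.
    static Map<String, Object> buildSource(long id, String appLanguage, Optional<String> content) {
        Map<String, Object> source = new HashMap<>();
        source.put("id", id);
        source.put("app_language", appLanguage);
        content.ifPresent(c -> source.put("fileContent", c));
        return source;
    }

    // True when the source carries something for the attachment pipeline to parse.
    static boolean needsPipeline(Map<String, Object> source) {
        return source.get("fileContent") != null;
    }
}
```

In the indexing loop, the caller could then build the IndexRequest from buildSource(...) and call setPipeline(ATTACHMENT) only when needsPipeline(...) returns true. Note that leaving the field out while still running the original pipeline would most likely fail as well, because the processor would then find no fileContent field at all, so this guard is meant to go together with skipping the pipeline, or with the on_failure handler shown above.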