Flat file item reader with a custom record separator

Problem description

This is the flat file I need to parse:

column1|column2|column3$#
data1|data2|data3$#

where

| - field delimiter (pipe)
$# - record delimiter

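FlatFileItemReader: I tried a custom record separator policy by overriding isEndOfRecord, and I also tried SuffixRecordSeparatorPolicy with setSuffix(), without luck. Neither recognizes $# as the record separator, and I get a FlatFileParseException.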

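I also have a parser, the univocity CsvParser, that supports a custom record separator. However, I am not sure how to plug the CsvParser settings into my flat file reader method.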

private CsvParser csvParserSetting(BeanListProcessor<Employee> rowProcessor) {
    CsvParserSettings settings = new CsvParserSettings();
    settings.getFormat().setLineSeparator("$#");//$#
    settings.getFormat().setDelimiter("|");//|
    settings.setIgnoreLeadingWhitespaces(true);
    settings.setNumberOfRowsToSkip(1);
    settings.setProcessor(rowProcessor);
    CsvParser parser = new CsvParser(settings);
    return parser;
}

@Bean
@StepScope
public FlatFileItemReader<Employee> myReader() throws FileNotFoundException {
    BeanListProcessor<Employee> rowProcessor = new BeanListProcessor<Employee>(Employee.class);
    CsvParser parser = csvParserSetting(rowProcessor);
    Request request = MappingUtil.requestMap.get("myRequest");
    InputStream inputStream = awsClient.getInputStreamObject(request.getFileKeyPath());
    CustomRecordSeparatorPolicy customRecordSeparatorPolicy = new CustomRecordSeparatorPolicy();
    // customRecordSeparatorPolicy.isEndOfRecord(record)
    FlatFileItemReader<Employee> reader = new FlatFileItemReader<>();
    reader.setResource(new InputStreamResource(inputStream));

    reader.setName("filreader");
    reader.setLinesToSkip(1);
    // customRecordSeparatorPolicy.setSuffix("$#");
    // reader.setRecordSeparatorPolicy(customRecordSeparatorPolicy);

    // reader.setRecordSeparatorPolicy(recordSeparatorPolicy);
    reader.setLineMapper(new DefaultLineMapper<Employee>() {{
        setLineTokenizer(new DelimitedLineTokenizer() {{
            setNames(MyConstats.FIELDS);
            setDelimiter("|");
        }});
        setFieldSetMapper(new BeanWrapperFieldSetMapper<Employee>() {{
            setTargetType(Employee.class);
        }});
    }});
    return reader;
}

import org.springframework.batch.item.file.separator.RecordSeparatorPolicy;

public class CustomSuffixRecordSeparatorPolicy implements RecordSeparatorPolicy {

public static final String DEFAULT_SUFFIX = "$#";
private String suffix = DEFAULT_SUFFIX;
private boolean ignoreWhitespace = false;

 public void setSuffix(String suffix) {
    this.suffix = suffix;
}
public void setIgnoreWhitespace(boolean ignoreWhitespace) {
    this.ignoreWhitespace = ignoreWhitespace;
}
/*@Override
public boolean isEndOfRecord(String record) {
    int fieldCount = record.split("|").length;
   // String recordvalue[] =record.split("\\|");
    if(fieldCount == 126) {
        return true;
    } else {
        return false;
    }
}*/
public boolean isEndOfRecord(String line) {
    if (line == null) {
        return true;
    }
    String trimmed = ignoreWhitespace ? line.trim() : line;
    return trimmed.endsWith(suffix);
}

public String postProcess(String record) {
    if (record==null) {
        return null;
    }
    return record.substring(0, record.lastIndexOf(suffix));
}
@Override
public String preProcess(String record) {
    return record;
}

}
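
A side note on the commented-out isEndOfRecord variant in the class above: String.split takes a regular expression, and a bare "|" is an empty alternation that matches between every character, so the resulting length is not the field count. The sketch below contrasts it with a quoted delimiter; the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

/** Hypothetical helper contrasting a regex split with a literal split on the pipe character. */
class PipeSplitCheck {

    /** Splits on the regex "|", which matches the empty string at every position. */
    static int naiveFieldCount(String record) {
        return record.split("|").length;
    }

    /** Splits on the literal pipe character by quoting it first. */
    static int literalFieldCount(String record) {
        return record.split(Pattern.quote("|")).length;
    }
}
```

For a record such as data1|data2|data3, only literalFieldCount reports three fields. Note that split drops trailing empty strings by default, so a record that ends in empty fields would need split(regex, -1) to be counted in full.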

header1|header2|header3$#
value1|value2|value3$#value11|value22|value33

header1|header2|header3$#value1|value2|value3$#value11|value22|value33

header1|header2|header3$#
value1|value2|value3$#
value11|value22|value33$#

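On iteration 1 it parses the header line correctly. On the second iteration it tries to read the line value1|value2|value3$#value11|value22|value33, and that line is not split into individual records.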

In the end it fails in this method:

private String applyRecordSeparatorPolicy(String line) throws IOException {
    String record = line;
    while (line != null && !recordSeparatorPolicy.isEndOfRecord(record)) {
        line = this.reader.readLine();
        if (line == null) {
            if (StringUtils.hasText(record)) {
                // A record was partially complete since it hasn't ended but
                // the line is null
                throw new FlatFileParseException("Unexpected end of file before record complete", record, lineCount);
            }
            else {
                // Record has no text but it might still be post processed
                // to something (skipping preProcess since that was already
                // done)
                break;
            }
        }
        else {
            lineCount++;
        }
        record = recordSeparatorPolicy.preProcess(record) + line;
    }
    return recordSeparatorPolicy.postProcess(record);

}
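
The loop above only ever appends further physical lines to a record until isEndOfRecord returns true; it never splits one physical line into several records. That is why two records sharing a line cannot be separated by a RecordSeparatorPolicy alone. One possible workaround, sketched here using only the JDK, is to change what counts as a line: a BufferedReader subclass whose readLine() returns the text up to the next $#. The class name SeparatorAwareReader and the helper readAll are hypothetical, and the sketch assumes that line breaks next to a separator are not part of the data.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical reader whose readLine() ends a record at a custom separator instead of at a newline. */
class SeparatorAwareReader extends BufferedReader {

    private final String separator;

    SeparatorAwareReader(Reader in, String separator) {
        super(in);
        if (separator == null || separator.isEmpty()) {
            throw new IllegalArgumentException("separator must not be empty");
        }
        this.separator = separator;
    }

    /** Returns the next record without its separator, or null once the input is exhausted. */
    @Override
    public String readLine() throws IOException {
        StringBuilder record = new StringBuilder();
        int c;
        while ((c = read()) != -1) {
            record.append((char) c);
            if (endsWithSeparator(record)) {
                record.setLength(record.length() - separator.length());
                return stripLineBreaks(record);
            }
        }
        // End of input: return a trailing record that had no separator, if any.
        String tail = stripLineBreaks(record);
        return tail.isEmpty() ? null : tail;
    }

    private boolean endsWithSeparator(StringBuilder sb) {
        int offset = sb.length() - separator.length();
        if (offset < 0) {
            return false;
        }
        for (int i = 0; i < separator.length(); i++) {
            if (sb.charAt(offset + i) != separator.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    /** Assumption: CR/LF characters at either end of a record are layout, not data. */
    private static String stripLineBreaks(CharSequence s) {
        int start = 0;
        int end = s.length();
        while (start < end && isLineBreak(s.charAt(start))) {
            start++;
        }
        while (end > start && isLineBreak(s.charAt(end - 1))) {
            end--;
        }
        return s.subSequence(start, end).toString();
    }

    private static boolean isLineBreak(char ch) {
        return ch == '\n' || ch == '\r';
    }

    /** Convenience for trying the reader on an in-memory string. */
    static List<String> readAll(String text, String separator) {
        List<String> records = new ArrayList<>();
        try (SeparatorAwareReader reader = new SeparatorAwareReader(new StringReader(text), separator)) {
            String record;
            while ((record = reader.readLine()) != null) {
                records.add(record);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }
}
```

Spring Batch exposes a BufferedReaderFactory hook on FlatFileItemReader (setBufferedReaderFactory), so a reader like this could presumably be supplied there, leaving the default record separator policy and setLinesToSkip(1) in place. That wiring is not shown or tested here.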

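This is the isEndOfRecord method I am trying now. It fails when the header and the values sit on the same line, for example header1|headers...|$#values||. In my case there are 126 headers, followed by $#values-126$#values-126$#etc.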

private int getPipeCount(String s){ 
        String tmp = s;
        int index = -1;
        int count = 0;
        while ((index=tmp.indexOf("|"))!=-1) {
        tmp = tmp.substring(index+1);
        count++;
        }
        return count;
    }

     public boolean isEndOfRecord(String line) {
        return getPipeCount(line)==126;
    }
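
To illustrate why a pipe count cannot rescue this case, here is a scaled-down sketch with three-field records (two pipes each) instead of 126. The names PipeCountCheck, countPipes and expectedPipes are made up for the example. When two records share one physical line, the count for that line already exceeds the expected value the first time isEndOfRecord sees it, so the check never returns true for it.

```java
/** Hypothetical scaled-down version of the pipe-count check from the question. */
class PipeCountCheck {

    /** Counts the pipe characters in a line. */
    static int countPipes(String line) {
        int count = 0;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == '|') {
                count++;
            }
        }
        return count;
    }

    /** Mirrors the isEndOfRecord idea: a record is complete at exactly expectedPipes pipes. */
    static boolean isEndOfRecord(String line, int expectedPipes) {
        return countPipes(line) == expectedPipes;
    }
}
```

With expectedPipes set to 2, the line value1|value2|value3$# passes, while value1|value2|value3$#value11|value22|value33 contains four pipes and fails. The reader then keeps appending lines, which only increases the count.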

Tags: spring-batch, univocity

Solution

