Using Spring Batch to convert rows of an input CSV file into an output CSV file with a one-to-many relationship

Question

I have already posted this question: How to use Spring Batch read CSV,process it and write it as a CSV with one row can generate more than one row?

I have also looked at this related answer: Spring Batch - Using an ItemWriter with List of Lists

But I still cannot figure out how to use Spring Batch to:

  1. Read one row from the input CSV file.
  2. Process it and generate one or more output rows.
  3. Write the output rows to the output file.

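I understand that the solution should be to implement a writer that accepts a list of items and somehow uses a "delegate" to handle the items one by one.

I would appreciate it if anyone could shed some light on this.

My code: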

public class CsvRowsProcessor implements ItemProcessor<RowInput, List<RowOutput>>{

    @Override
    public List<RowOutput> process(final RowInput rowInput)  {

        final String id = rowInput.getId();
        final String title = rowInput.getTitle();
        final String description = rowInput.getDescription();
        final RowOutput transformedRowInput = new RowOutput(id, title, description);

        List<RowOutput> rows=new LinkedList<>();
        rows.add(transformedRowInput);
        return rows;
    }

}

@Bean
ItemWriter<RowOutput> csvRowsWriter() {
    FlatFileItemWriter<RowOutput> csvFileWriter = new FlatFileItemWriter<>();
    csvFileWriter.setResource(new FileSystemResource("C:\\Users\\orenl\\IdeaProjects\\Spring-Batch-CSV-Example\\src\\main\\resources\\outputFile.csv"));
    LineAggregator<RowOutput> lineAggregator = createLineAggregator();
    csvFileWriter.setLineAggregator(lineAggregator);
    csvFileWriter.setHeaderCallback(new FlatFileHeaderCallback() {

        public void writeHeader(Writer writer) throws IOException {
            writer.write("Id,Title,Description");
        }
    });
    return csvFileWriter;
}



private LineAggregator<RowOutput> createLineAggregator() {
    DelimitedLineAggregator<RowOutput> lineAggregator = new DelimitedLineAggregator<>();
    lineAggregator.setDelimiter(",");

    FieldExtractor<RowOutput> fieldExtractor = createFieldExtractor();
    lineAggregator.setFieldExtractor(fieldExtractor);

    return lineAggregator;
}

private FieldExtractor<RowOutput> createFieldExtractor() {
    BeanWrapperFieldExtractor<RowOutput> extractor = new BeanWrapperFieldExtractor<>();
    extractor.setNames(new String[] { "Id", "Title", "Description" });
    return extractor;
}

@Bean
public Step csvFileToFileStep() {
    return stepBuilderFactory.get("csvFileToFileStep")
            .<RowInput ,RowOutput>chunk(1)
            .reader(csvRowsReader())
            .processor(csvRowsProcessor())
            .writer(csvRowsWriter())
            .build();
}

@Bean
Job csvFileToCsvJob(JobCompletionNotificationListener listener) {
    return jobBuilderFactory.get("csvFileToCsvJob")
            .incrementer(new RunIdIncrementer())
            .listener(listener)
            .flow(csvFileToFileStep())
            .end()
            .build();
}

Tags: java, spring-batch

Answer


Here is an example:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.support.ListItemReader;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.ApplicationContext;
import org.springframework.context.annotation.AnnotationConfigApplicationContext;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableBatchProcessing
public class MyJob {

    @Autowired
    private JobBuilderFactory jobs;

    @Autowired
    private StepBuilderFactory steps;

    @Bean
    public ItemReader<Integer> itemReader() {
        return new ListItemReader<>(Arrays.asList(1, 3, 5, 7, 9));
    }

    @Bean
    public ItemProcessor<Integer, List<Integer>> itemProcessor() {
        return item -> {
            List<Integer> result = new ArrayList<>();
            result.add(item);
            result.add(item + 1);
            return result;
        };
    }

    @Bean
    public ItemWriter<List<Integer>> itemWriter() {
        return items -> {
            for (List<Integer> item : items) {
                for (Integer integer : item) {
                    System.out.println("integer = " + integer);
                }
            }
        };
    }

    @Bean
    public Step step() {
        return steps.get("step")
                .<Integer, List<Integer>>chunk(2)
                .reader(itemReader())
                .processor(itemProcessor())
                .writer(itemWriter())
                .build();
    }

    @Bean
    public Job job() {
        return jobs.get("job")
                .start(step())
                .build();
    }

    public static void main(String[] args) throws Exception {
        ApplicationContext context = new AnnotationConfigApplicationContext(MyJob.class);
        JobLauncher jobLauncher = context.getBean(JobLauncher.class);
        Job job = context.getBean(Job.class);
        jobLauncher.run(job, new JobParameters());
    }

}

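This example reads some numbers, returns each number together with its successor, and prints the resulting numbers to standard output. It shows how processing a single item can return multiple items.

It prints: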

integer = 1
integer = 2
integer = 3
integer = 4
integer = 5
integer = 6
integer = 7
integer = 8
integer = 9
integer = 10

You can adapt the example to read from and write to files.

Hope this helps.
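
For the file case, the "delegate" idea from the question could look roughly like the sketch below. It is plain Java with no Spring types, so it only illustrates the unpacking step: the class name FlatteningWriterSketch and the Consumer used as the delegate are hypothetical stand-ins, not Spring Batch API. In Spring Batch 4 the same loop would presumably live inside the write(List<? extends List<RowOutput>> items) method of an ItemWriter<List<RowOutput>>, with the FlatFileItemWriter<RowOutput> from the question as the delegate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Plain-Java sketch of the "unpack and delegate" idea; no Spring types are used.
final class FlatteningWriterSketch<T> {

    // Hypothetical stand-in for the delegate writer's write(List<? extends T>) call.
    private final Consumer<List<T>> delegate;

    FlatteningWriterSketch(Consumer<List<T>> delegate) {
        this.delegate = delegate;
    }

    // Receives one chunk: each element is the list that the processor
    // produced for a single input row.
    void write(List<? extends List<? extends T>> chunk) {
        List<T> flat = new ArrayList<>();
        for (List<? extends T> rowsForOneInput : chunk) {
            flat.addAll(rowsForOneInput);
        }
        // One delegate call per chunk, with the original order preserved.
        delegate.accept(flat);
    }

    public static void main(String[] args) {
        List<String> written = new ArrayList<>();
        FlatteningWriterSketch<String> writer =
                new FlatteningWriterSketch<>(written::addAll);
        writer.write(Arrays.asList(
                Arrays.asList("1-a", "1-b"),
                Arrays.asList("2-a")));
        System.out.println(written); // prints [1-a, 1-b, 2-a]
    }
}
```

If this is ported to a real ItemWriter, the step would need to be typed as <RowInput, List<RowOutput>>chunk(...) so that it matches the processor. Because FlatFileItemWriter is also an ItemStream, the wrapper would additionally have to forward open, update and close to it, or the delegate would have to be registered on the step builder with stream(...). Otherwise the output file is never opened.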

