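spring - Spring Batch file reader for records that use different delimiters within a record
Problem description
I have the following sample input file, in which the element after the third "|" delimiter can be of any size. A person can have any number of addresses, separated by ",", and the elements of each address are separated by ":". Could you tell me whether there is a file reader that can handle this kind of record? Thanks.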
id1|name1|male|1:new york:NY:10019, 2:philadelphia:PA:19382, 3:columbus:OH:23415|USA
id2|name2|female|1:new york:NY:10019, 2:philadelphia:PA:19382, 3:columbus:OH:23415, 4:west chester:PA:19341|USA
id3|name3|male|1:new york:NY:10019|USA
id4|name4|female|1:new york:NY:10019, 2:philadelphia:PA:19382|USA
Solution
This is a custom requirement, and Spring Batch has no built-in way to handle it. However, you can use a FlatFileItemReader with a custom FieldSetMapper. Here is a simple example:
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;
import org.springframework.batch.item.file.transform.FieldSet;
import org.springframework.context.ApplicationContext;
import org.springframework.context.annotation.AnnotationConfigApplicationContext;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.FileSystemResource;
import org.springframework.validation.BindException;
@Configuration
@EnableBatchProcessing
public class MyJob {

    // Reads each line, splits it on "|", and hands the fields to PersonMapper.
    @Bean
    public FlatFileItemReader<Person> itemReader() {
        DefaultLineMapper<Person> lineMapper = new DefaultLineMapper<>();
        lineMapper.setLineTokenizer(new DelimitedLineTokenizer("|"));
        lineMapper.setFieldSetMapper(new PersonMapper());
        return new FlatFileItemReaderBuilder<Person>()
                .name("personItemReader")
                .resource(new FileSystemResource("persons.csv"))
                .lineMapper(lineMapper)
                .build();
    }

    // Single-step job that reads persons in chunks of 2 and prints them.
    @Bean
    public Job job(JobBuilderFactory jobs, StepBuilderFactory steps) {
        return jobs.get("job")
                .start(steps.get("step")
                        .<Person, Person>chunk(2)
                        .reader(itemReader())
                        .writer(items -> items.forEach(System.out::println))
                        .build())
                .build();
    }

    public static void main(String[] args) throws Exception {
        ApplicationContext context = new AnnotationConfigApplicationContext(MyJob.class);
        JobLauncher jobLauncher = context.getBean(JobLauncher.class);
        Job job = context.getBean(Job.class);
        jobLauncher.run(job, new JobParameters());
    }

    static class Person {
        String id, name, gender, country;
        List<String> addresses = new ArrayList<>(); // TODO create and use Address class instead of String

        @Override
        public String toString() {
            return "Person{" +
                    "id='" + id + '\'' +
                    ", name='" + name + '\'' +
                    ", gender='" + gender + '\'' +
                    ", country='" + country + '\'' +
                    ", addresses=" + addresses +
                    '}';
        }
    }

    static class PersonMapper implements FieldSetMapper<Person> {
        @Override
        public Person mapFieldSet(FieldSet fieldSet) throws BindException {
            Person p = new Person();
            p.id = fieldSet.readString(0);
            p.name = fieldSet.readString(1);
            p.gender = fieldSet.readString(2);
            // Split on "," and drop the surrounding whitespace so entries carry no leading space.
            p.addresses.addAll(Arrays.asList(fieldSet.readString(3).split("\\s*,\\s*"))); // TODO split address as needed
            p.country = fieldSet.readString(4);
            return p;
        }
    }
}
With your file as input, this prints:
Person{id='id1', name='name1', gender='male', country='USA', addresses=[1:new york:NY:10019, 2:philadelphia:PA:19382, 3:columbus:OH:23415]}
Person{id='id2', name='name2', gender='female', country='USA', addresses=[1:new york:NY:10019, 2:philadelphia:PA:19382, 3:columbus:OH:23415, 4:west chester:PA:19341]}
Person{id='id3', name='name3', gender='male', country='USA', addresses=[1:new york:NY:10019]}
Person{id='id4', name='name4', gender='female', country='USA', addresses=[1:new york:NY:10019, 2:philadelphia:PA:19382]}
As noted in the code comments, you can now create a domain class for addresses and parse the fourth field as needed.
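A minimal sketch of that last step, in plain Java with no Spring dependency. The names `AddressParser`, `Address`, `sequence`, `city`, `state`, and `zip` are hypothetical and not part of the original answer; the meaning of each ":"-separated element is only inferred from the sample data.

```java
import java.util.ArrayList;
import java.util.List;

final class AddressParser {

    /** Hypothetical domain class; field meanings are guessed from the sample file. */
    static final class Address {
        final int sequence;
        final String city;
        final String state;
        final String zip; // kept as a String so leading zeros are not lost

        Address(int sequence, String city, String state, String zip) {
            this.sequence = sequence;
            this.city = city;
            this.state = state;
            this.zip = zip;
        }

        @Override
        public String toString() {
            return "Address{sequence=" + sequence + ", city='" + city
                    + "', state='" + state + "', zip='" + zip + "'}";
        }
    }

    /** Parses one address token such as "2:philadelphia:PA:19382". */
    static Address parseAddress(String token) {
        String[] parts = token.trim().split(":");
        if (parts.length != 4) {
            throw new IllegalArgumentException(
                    "Expected 4 ':'-separated elements but got " + parts.length + " in: " + token);
        }
        return new Address(Integer.parseInt(parts[0].trim()),
                parts[1].trim(), parts[2].trim(), parts[3].trim());
    }

    /** Parses the whole fourth field: addresses separated by ",". */
    static List<Address> parseAddresses(String field) {
        List<Address> result = new ArrayList<>();
        if (field == null || field.trim().isEmpty()) {
            return result; // a person without addresses yields an empty list
        }
        for (String token : field.split(",")) {
            result.add(parseAddress(token));
        }
        return result;
    }

    public static void main(String[] args) {
        parseAddresses("1:new york:NY:10019, 2:philadelphia:PA:19382")
                .forEach(System.out::println);
    }
}
```

Under these assumptions, `Person.addresses` could become a `List<Address>` and the mapper could call `AddressParser.parseAddresses(fieldSet.readString(3))` in place of the plain `split`. Throwing on a malformed token is one possible choice; whether to skip or fail such records depends on the job's skip policy.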