Spring Batch - Reading a file from AWS S3 for processing

Problem description

I am writing a Spring Batch application that needs to read a file from an AWS S3 bucket.

This is my AWS config Java class:

@Configuration
public class AWSConfig{

    @Value("${cloud.aws.credentials.accessKey}")
    private String accessKey;

    @Value("${cloud.aws.credentials.secretKey}")
    private String secretKey;

    @Value("${cloud.aws.region}")
    private String region;

    @Bean
    public BasicAWSCredentials basicAWSCredentials() {
        return new BasicAWSCredentials(accessKey, secretKey);
    }

    @Bean
    public AmazonS3Client amazonS3Client(AWSCredentials awsCredentials) {
        AmazonS3Client amazonS3Client = (AmazonS3Client) AmazonS3ClientBuilder.standard()
                .withCredentials(new AWSStaticCredentialsProvider(awsCredentials))
                .withRegion(region)
                .build();
        return amazonS3Client;
    }

}

This is my aws-context.xml file (located in resources/), which replaces the default ResourceLoader:

<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:aws-context="http://www.springframework.org/schema/cloud/aws/context"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
        http://www.springframework.org/schema/beans/spring-beans.xsd
        http://www.springframework.org/schema/cloud/aws/context
        http://www.springframework.org/schema/cloud/aws/context/spring-cloud-aws-context.xsd">

    <aws-context:context-resource-loader amazon-s3="amazonS3Client" />

</beans>

This is my SpringBatchConfig.java class:

@Configuration
@EnableBatchProcessing
public class SpringBatchConfig {

    @Autowired
    private ResourceLoader resourceLoader;

    @Bean
    public Job job(JobBuilderFactory jobBuilderFactory,
                   StepBuilderFactory stepBuilderFactory,
                   ItemReader<User> itemReader,
                   ItemProcessor<User, User> itemProcessor,
                   ItemWriter<User> itemWriter
    ) {

        Step step = stepBuilderFactory.get("ETL-file-load")
                .<User, User>chunk(100)
                .reader(itemReader)
                .processor(itemProcessor)
                .writer(itemWriter)
                .build();


        return jobBuilderFactory.get("ETL-Load")
                .incrementer(new RunIdIncrementer())
                .start(step)
                .build();
    }

    @Bean
    public FlatFileItemReader<User> itemReader() throws IOException {

        FlatFileItemReader<User> flatFileItemReader = new FlatFileItemReader<>();
        flatFileItemReader.setResource(resourceLoader.getResource("s3://" + "<bucket-name>" + "/" + "<key>"));
        flatFileItemReader.setName("CSV-Reader");
        flatFileItemReader.setLinesToSkip(1);
        flatFileItemReader.setLineMapper(lineMapper());
        return flatFileItemReader;
    }

    @Bean
    public LineMapper<User> lineMapper() {

        DefaultLineMapper<User> defaultLineMapper = new DefaultLineMapper<>();
        DelimitedLineTokenizer lineTokenizer = new DelimitedLineTokenizer();

        lineTokenizer.setDelimiter(",");
        lineTokenizer.setStrict(false);
        lineTokenizer.setNames(new String[]{"id", "name", "dept", "salary"});

        BeanWrapperFieldSetMapper<User> fieldSetMapper = new BeanWrapperFieldSetMapper<>();
        fieldSetMapper.setTargetType(User.class);

        defaultLineMapper.setLineTokenizer(lineTokenizer);
        defaultLineMapper.setFieldSetMapper(fieldSetMapper);

        return defaultLineMapper;
    }

}

I set this up by following the answer given by @mtoutcalt in this StackOverflow thread: Spring Batch - Read files from Aws S3

I also followed this documentation: https://cloud.spring.io/spring-cloud-aws/spring-cloud-aws.html#_resource_handling

The problems I am facing:

1) In SpringBatchConfig.java, where the ResourceLoader is autowired, the IDE (I am using IntelliJ IDEA) reports:

Could not autowire. There is more than one bean of 'ResourceLoader' type.
Beans:
    (aws-config.xml) webApplicationContext   (Spring Web) 

2) When I run the batch application, it fails with:

Caused by: java.lang.IllegalStateException: Input resource must exist (reader is in 'strict' mode): ServletContext resource [/s3://<bucket-name>/<key>]

Can someone help me resolve this?

Regards

Tags: java, spring-boot, rest, amazon-s3, spring-cloud

Solution
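
Not a confirmed fix, but the second error message points at a likely cause. The text `ServletContext resource [/s3://<bucket-name>/<key>]` suggests that the injected ResourceLoader is the ordinary web application context rather than the S3-aware loader from spring-cloud-aws. As far as I can tell, Spring's default loader tries to parse a location as a `java.net.URL`; the JVM ships no handler for the `s3` scheme, so parsing fails and the loader falls back to a context-relative path, which a web context reports as a ServletContext resource with a leading slash. The sketch below reproduces only that fallback using the standard library. `S3SchemeCheck`, `jvmUnderstands`, `describe`, and the `my-bucket/users.csv` location are hypothetical names for illustration, not Spring or AWS APIs.

```java
import java.net.MalformedURLException;
import java.net.URI;

// Hypothetical stand-in for the loader's fallback; not Spring code.
class S3SchemeCheck {

    /** Returns true if the JVM can turn the location into a java.net.URL. */
    static boolean jvmUnderstands(String location) {
        try {
            URI.create(location).toURL();
            return true;
        } catch (MalformedURLException | IllegalArgumentException e) {
            // Unknown scheme (such as "s3") or a location that is not an absolute URI.
            return false;
        }
    }

    /** Mimics how a web context would describe the resource it resolved. */
    static String describe(String location) {
        if (jvmUnderstands(location)) {
            return "URL [" + location + "]";
        }
        // Fallback: treat the whole string as a context-relative path.
        String path = location.startsWith("/") ? location : "/" + location;
        return "ServletContext resource [" + path + "]";
    }

    public static void main(String[] args) {
        // prints: ServletContext resource [/s3://my-bucket/users.csv]
        System.out.println(describe("s3://my-bucket/users.csv"));
        // prints: URL [file:///tmp/users.csv]
        System.out.println(describe("file:///tmp/users.csv"));
    }
}
```

If that is what is happening, two things may be worth checking, both assumptions on my part. First, confirm that the XML file is actually imported into the application context, for example with `@ImportResource` on a configuration class; note that the question calls the file aws-context.xml while the IDE message mentions aws-config.xml. Second, the reader could bypass the ResourceLoader entirely by wrapping the stream from `amazonS3Client.getObject(bucket, key).getObjectContent()` in an `InputStreamResource`.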

