Spring Batch PatternMatchingCompositeLineTokenizer when there are multiple patterns

Problem description

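I have a file to read that looks like the one below. A record is split across multiple lines, and each record can have any number of lines. The only way to identify a new record is a line that starts with "ABC"; another line within the record carries the identifier ABC_SUB. Each line of the record needs its own mapper, identified by the pattern the line starts with (for example: ABC, line2, line3, line4, ABC_SUB, line3, line4). The same patterns line3 and line4 can occur more than once, but they need different mappers depending on the preceding line-type identifier. In this example: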

  1. The line3 pattern (with mapper1) occurs after the line that starts with ABC.
  2. The line3 pattern (with mapper2) occurs after the line that starts with ABC_SUB.

How can I distinguish the line3 pattern under ABC from the line3 pattern under ABC_SUB?

I tried PatternMatchingCompositeLineTokenizer, but it returns the first matching mapper.

Is there a way to inspect a few lines to identify the subtype (such as ABC or ABC_SUB) before choosing the corresponding mapper?

HDR
ABCline1goesonforrecord1   //record starts
line2goesonForRecord1      
line3goesonForRecord1       //this requires ABC_line3_mapper    
line4goesonForRecord1
ABC_SUBline1goesonforrecord1  //sub-record where it can have same pattern in below lines
line3goesonForRecord1       //this requires ABC_SUB_line3_mapper  
line4goesonForRecord1
ABCline2goesOnForRecord2  //record 2 begins
line2goesonForRecord2
line3goesonForRecord2
line4goesonForRecord2
line5goesonForRecord2
ABCline2goesOnForRecord3
line2goesonForRecord3
line3goesonForRecord3
line4goesonForRecord3
TRL

Below is the XML configuration:

<batch:job id="importFileData">
        <batch:step id="parseAndLoadData">
            <batch:tasklet>
                <batch:chunk reader="multiLineReader" writer="writer"
                    commit-interval="5" skip-limit="100">
                    <batch:streams>
                        <batch:stream ref="fileItemReader" />
                    </batch:streams>
                </batch:chunk>

            </batch:tasklet>
        </batch:step>

    </batch:job>


    <bean id="fileItemReader"
        class="org.springframework.batch.item.file.FlatFileItemReader"
        scope="step">
        <property name="resource" value="classpath:input/input.txt"></property>
        <property name="linesToSkip" value="2" />
        <property name="lineMapper">
            <bean
                class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
                <property name="lineTokenizer" ref="orderFileTokenizer" />
                <property name="fieldSetMapper">
                    <bean
                        class="org.springframework.batch.item.file.mapping.PassThroughFieldSetMapper">


                    </bean>
                </property>

            </bean>
        </property>

    </bean>


    <bean id="multiLineReader" class="sample.MultiLineReader">
        <property name="fieldSetReader" ref="fileItemReader" />
        <property name="abcMapper" ref="abcMapper" />
        <property name="123Mapper" ref="sample123Mapper" />
    </bean>

    <bean id="orderFileTokenizer"
        class="org.springframework.batch.item.file.transform.PatternMatchingCompositeLineTokenizer">
        <property name="tokenizers">
            <map>
                <entry key="ABC*" value-ref="abcLineTokenizer" />
                <entry key="123*" value-ref="123LineTokenizer" />
            </map>
        </property>
    </bean>

    <bean id="abcLineTokenizer"
        class="org.springframework.batch.item.file.transform.FixedLengthTokenizer">
        <property name="names" value="NAME, AGE, GENDER" />
        <property name="columns" value="1-10,11-15,16-20" />
        <property name="strict" value="false" />
    </bean>

    <bean id="123LineTokenizer"
        class="org.springframework.batch.item.file.transform.FixedLengthTokenizer">
        <property name="names" value="CONTACT, ALT_CONTACT" />
        <property name="columns" value="1-15,16-30" />
        <property name="strict" value="false" />
    </bean>
    
    <bean id="abcMapper" class="sample.ABCMapper" />
    <bean id="sample123Mapper" class="sample.123Mapper" />

</beans>

Tags: spring-batch, spring-batch-stream

Solution
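
One possible approach, offered as a sketch rather than a confirmed answer: PatternMatchingCompositeLineTokenizer chooses a tokenizer from the current line alone, so it cannot tell the two line3 cases apart. A small stateful classifier that remembers the most recent ABC or ABC_SUB line can. The plain-Java example below uses the sample data from the question. The class name ContextAwareLineClassifier, the key format (ABC_line3, ABC_SUB_line3), and the assumption that a detail line starts with a 5-character type tag are all hypothetical. In Spring Batch, the same logic could sit inside a custom LineTokenizer or LineMapper that delegates to a tokenizer or mapper looked up by this key.

```java
/**
 * Hypothetical helper: picks a mapper key for each line based on the most
 * recent ABC / ABC_SUB marker line. Not part of Spring Batch.
 */
class ContextAwareLineClassifier {

    // Current parent context ("ABC" or "ABC_SUB"); null until a record starts.
    private String context;

    /**
     * Returns "ABC" or "ABC_SUB" for marker lines, a key such as "ABC_line3"
     * or "ABC_SUB_line3" for detail lines, and null for header/trailer lines.
     */
    public String classify(String line) {
        if (line.startsWith("HDR") || line.startsWith("TRL")) {
            context = null;
            return null;
        }
        // Test the longer prefix first, because "ABC_SUB..." also starts with "ABC".
        if (line.startsWith("ABC_SUB")) {
            context = "ABC_SUB";
            return context;
        }
        if (line.startsWith("ABC")) {
            context = "ABC";
            return context;
        }
        if (context == null) {
            throw new IllegalStateException("Detail line before any ABC line: " + line);
        }
        // Assumption: a detail line starts with a 5-character type tag such as "line3".
        String lineType = line.substring(0, Math.min(5, line.length()));
        return context + "_" + lineType;
    }

    public static void main(String[] args) {
        ContextAwareLineClassifier classifier = new ContextAwareLineClassifier();
        String[] lines = {
            "HDR",
            "ABCline1goesonforrecord1",
            "line3goesonForRecord1",
            "ABC_SUBline1goesonforrecord1",
            "line3goesonForRecord1",
            "TRL"
        };
        // Print the mapper key chosen for each sample line.
        for (String line : lines) {
            System.out.println(line + " -> " + classifier.classify(line));
        }
    }
}
```

With the sample lines, the first line3 line resolves to ABC_line3 and the second to ABC_SUB_line3, so each key can point at its own mapper. Because the classifier keeps state between calls, it should only be used by a single reader thread.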

