Regex on an EDI file

Question

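Hi guys, I have an EDI file that contains quantities, delivery dates and similar information. I want to split it with a regular expression so that each resulting piece holds the information I need. The file content is attached below. I tried expressions such as LIN+.* and LIN+.*? but I either get all the LIN segments together in one match, or the LIN segments separately with too little information. I want to separate each LIN element together with all the information that follows it, up to the next LIN. Can someone help me?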

UNB+UNOA:2+094200005561400986LA:ZZ+MTEL+200406:1436+34906++++1'UNH+112490+DELFOR:D:96A:UN'BGM+241+2004060008796+9'DTM+137:202004061436:203'DTM+157:20200406:102'DTM+36:20200206:102'NAD+BY+FRSFA0222838V::92'NAD+SE+000563X::92'UNS+D'NAD+CN+VP1::92++TEST+SK TEST:204 TEST:TEST 22:TEST ST TEST+++37540+FRA'LIN+1+3+441344:IN'PIA+1+7PK1150:VN'IMD+++:::VO-VKMV 7PK1150 VP'LOC+11+999'LOC+159+999'RFF+ON:P092303'QTY+113:100.00:PC'SCC+1'DTM+2:20200116:102'RFF+AAJ:P092303:100'QTY+113:100.00:PC'SCC+1'DTM+2:20200206:102'RFF+AAJ:P092304:100'LIN+2+3+502107:IN'PIA+1+3PK670:VN'IMD+++:::VO-VKMV 3PK670 EDC'LOC+11+999'LOC+159+999'RFF+ON:P088273'QTY+113:300.00:PC'SCC+1'DTM+2:20190503:102'RFF+AAJ:P088273:100'LIN+3+3+502109:IN'PIA+1+6PK970:VN'IMD+++:::VO-VKMV 6PK970 EDC'LOC+11+999'LOC+159+999'RFF+ON:P084470'QTY+113:200.00:PC'SCC+1'DTM+2:20190422:102'RFF+AAJ:P084470:100'LIN+4+3+6DK1215:IN'PIA+1+AVRRV50D1-VKMV 6DK1215:VN'IMD+++:::6DK1215'LOC+11+999'LOC+159+999'RFF+ON:P046369'QTY+48:533.00:PC'RFF+AAK:32299'DTM+171:20181109:102'QTY+113:533.00:PC'SCC+1'DTM+2:20190419:102'RFF+AAJ:P046369:100'LIN+5+3+6DK1320:IN'PIA+1+AVRRV50D1-VKMV 6DK1320?+282:VN'IMD+++:::6DK1320'LOC+11+999'LOC+159+999'RFF+ON:P061903'QTY+48:115.00:PC'RFF+AAK:43146'DTM+171:20181003:102'QTY+113:104.00:PC'SCC+1'DTM+2:20181005:102'RFF+AAJ:P061903:100'QTY+113:104.00:PC'SCC+1'DTM+2:20181102:102'RFF+AAJ:P062034:100'UNS+S'UNT+75+112490'UNZ+1+34906'

Tags: regex, edi

Answer


You can use

LIN(?:(?!LIN).)*

Or a more efficient version (following the unroll-the-loop principle):

LIN[^L]*(?:L(?!IN)[^L]*)*

See regex demo #1 and regex demo #2.

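The (?:(?!LIN).)* part is a tempered greedy token: it matches any char (.) that does not start a LIN char sequence, 0 or more times, as many times as possible. The match therefore begins at one LIN and stops right before the next LIN (or at the end of the input for the last one).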
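As a rough illustration of the tempered greedy token, here is a minimal Python sketch. The `edi` string is a shortened, hand-made sample in the same shape as the file in the question (it is not the full message), and the `re.DOTALL` flag is an assumption added so that `.` would also cross line breaks if the real file has any.

```python
import re

# Shortened sample shaped like the question's data (an assumption, not the full file).
edi = (
    "UNH+112490+DELFOR:D:96A:UN'"
    "LIN+1+3+441344:IN'PIA+1+7PK1150:VN'QTY+113:100.00:PC'"
    "LIN+2+3+502107:IN'PIA+1+3PK670:VN'QTY+113:300.00:PC'"
    "UNS+S'UNT+75+112490'"
)

# "LIN", then every char that is not the start of another "LIN".
tempered = re.compile(r"LIN(?:(?!LIN).)*", re.DOTALL)

groups = tempered.findall(edi)
for group in groups:
    print(group)
```

Note that the last match runs to the end of the input, so trailing segments such as UNS and UNT end up in the final group. Also, in the original attempts `LIN+.*` the `+` is a quantifier on `N`, not a literal plus sign; write `LIN\+` if the literal separator should be part of the pattern.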

The [^L]*(?:L(?!IN)[^L]*)* pattern matches 0 or more chars other than L, and then 0 or more sequences of an L that is not followed by IN, followed by 0 or more chars other than L.
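The two patterns can be compared side by side with a small Python sketch. The `sample` string is made up for illustration (one segment per line, with LOC segments so the L(?!IN) branch is exercised). One difference worth knowing: `[^L]` matches line breaks, while `.` does not unless a dot-all flag is set.

```python
import re

tempered = re.compile(r"LIN(?:(?!LIN).)*")
tempered_dotall = re.compile(r"LIN(?:(?!LIN).)*", re.DOTALL)
unrolled = re.compile(r"LIN[^L]*(?:L(?!IN)[^L]*)*")

# Made-up sample with one segment per line (an assumption about the file layout).
sample = "LIN+1+3+441344:IN'\nLOC+11+999'\nLIN+2+3+502107:IN'\nLOC+159+999'\n"

unrolled_groups = unrolled.findall(sample)
dotall_groups = tempered_dotall.findall(sample)
plain_groups = tempered.findall(sample)

print(unrolled_groups == dotall_groups)  # same groups once "." may cross newlines
print(plain_groups)                      # without DOTALL each match stops at the line break
```

If the file is a single line, as in the question, both patterns should return the same groups; the unrolled one simply avoids running a lookahead before every single char.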

