I want to read a file and store some variables using AWK

Problem description

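I have a file with the following content. It is the output of queries run against a device, so some of the queried inputs are expected to be missing from the database. The sample below shows one successful and one unsuccessful query. The second one does not contain all the information I want to capture into variables, so I want to ignore that result and leave the variables empty/null.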

<INTLPO:ISV=PORTAB NTL="6130290095" VEM=NAO;
VECTURA - SS            BSA002             2020-09-12            09-32
INTLPO:ISV=PORTAB NTL="6130290095" VEM=NAO;
INTERROGACAO DE NUMERO TELEFONICO PARA PORTABILIDADE NUMERICA                   

  TIPO DE ENCAMINHAMENTO POR ASSINANTE
  NTL = 6130290095           OPC = S_INF    RNP = 551      CSP = 25
  EIP = S_INF
  CDO = 00961
  CNL = 61000                NUF = S_INF                   TPB = PREST
  CPT = NAO                  CRE = 125      NUE = S_INF
  DAT = 2014-04-16           HOR = 10:30:20.798609
  TBR = 25
  RST              MAN      RST              MAN      RST              MAN
  2%               934      3%               934      4%               934
  5%               934      6%               934      7%               934
  8%               934      9%               934      9090%            934
  0??%             934      90??%            934      0?0%             934


  TOTAL DE NUMEROS ASSOCIADOS AO SERVICO: 1
<INTLPO:ISV=PORTAB NTL="6160150178" VEM=NAO;
VECTURA - SS            BSA002             2020-09-12            09-32
INTLPO:ISV=PORTAB NTL="6160150178" VEM=NAO;
INTERROGACAO DE NUMERO TELEFONICO PARA PORTABILIDADE NUMERICA                   

  ME:  NENHUM NUMERO CADASTRADO ATENDE AS ESPECIFICACOES

I have the following code, which partly works. The result is still somewhat messy (repeated lines and even wrong values).

awk -F ' ' 'BEGIN { OFS="," }
            /^VECTURA/ { equipment = $4; data = $5 }
            /^INTLPO/ { numero = $2}
            /^\s*NTL/ { ntl = $3 ; opc = $6; rnp = $9; csp = $12}
            /^\s*EIP/ { eip = $3}
            /^\s*CDO/ { cdo = $3}
            /^\s*CNL/ { cnl = $3; nuf = $6; tpb = $9}
            /^\s*CPT/ { cpt = $3; cre = $6; nue = $9}
            /^\s*DAT/ { dat = $3; hor = $6}
            /^\s*TBR/ { tbr = $3}
            /^\s*RST/ { man = $2; next}
            { print data, equipment, numero, ntl, opc, rnp, csp, eip, cdo, cnl, nuf, tpb, cpt, cre, nue, dat, hor, tbr, man}' input.tx >> output.txt

Result

2020-09-12,BSA002,6160150536,,,,,,,,,,,,,,,,
2020-09-12,BSA002,6160150536,,,,,,,,,,,,,,,,
2020-09-12,BSA002,6130290095,,,,,,,,,,,,,,,,
2020-09-12,BSA002,6130290095,,,,,,,,,,,,,,,,
2020-09-12,BSA002,6130290095,,,,,,,,,,,,,,,,
2020-09-12,BSA002,6130290095,,,,,,,,,,,,,,,,
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,,,,,,,,,,,,
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,,,,,,,,,,,
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,,,,,,,,,,
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,,,,,,,
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,,,,
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,,
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6160150178,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6160150178,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6160150178,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6160150178,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6160150178,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6160150178,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6160150178,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN

Note that record 6130290095 (the ntl variable) is wrongly associated with a different numero (see the last lines of the output above).

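How can I overcome this? I tried some AWK conditional statements, without success. As output I only want one line per record, as some lines of the output sample illustrate. Thanks a lot.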

Tags: regex, shell, awk, readfile, text-processing

Solution


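If you only want to change the values while numero is not set, add a test such as numero ||.
After reading your comments I changed my solution. As I understand it, you do not want one record that combines the results of all blocks; you want one result line for each processed block. Every new block starts with <INTLPO.
This solution empties all values whenever a new block starts (not needed for the first block, but harmless).
The result of a block is printed when the next block is found, and once more when the end of the file is reached.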

awk 'function newrecord() {
        # Start a new block: count it and clear every per-block variable.
        recordnumber++;
        data=equipment=numero=ntl=opc=rnp=csp=eip=cdo="";
        cnl=nuf=tpb=cpt=cre=nue=dat=hor=tbr=man="";
     }
     function printrecord() {
         # Emit one comma-separated line for the current block.
         print data, equipment, numero, ntl, opc, rnp, csp, eip,
               cdo, cnl, nuf, tpb, cpt, cre, nue, dat, hor, tbr, man;
     }

     BEGIN { OFS="," }
            # A line starting with <INTLPO opens a block: flush the previous one first.
            /^<INTLPO/ { if (recordnumber) printrecord(); newrecord(); }
            /^VECTURA/ { equipment = $4; data = $5 }
            /^INTLPO/ { numero = $2}
            # \s is a GNU awk regex operator, so the rules below need gawk.
            /^\s*NTL/ { ntl = $3 ; opc = $6; rnp = $9; csp = $12}
            /^\s*EIP/ { eip = $3}
            /^\s*CDO/ { cdo = $3}
            /^\s*CNL/ { cnl = $3; nuf = $6; tpb = $9}
            /^\s*CPT/ { cpt = $3; cre = $6; nue = $9}
            /^\s*DAT/ { dat = $3; hor = $6}
            /^\s*TBR/ { tbr = $3}
            /^\s*RST/ { man = $2; next}
            # Flush the last block, unless the input contained no block at all.
            END { if (recordnumber) printrecord(); }
      ' input.tx
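
To try the reset-and-flush approach without the real device output, the following is a minimal self-contained sketch that runs on a shortened copy of the sample from the question. Several things in it are assumptions, not part of the original answer: the function name parse_intlpo, the flag variable seen, and the temporary sample file are hypothetical; [ \t]* replaces \s so the program should also run on awk implementations other than gawk; and numero is taken from between the double quotes of the second field, on the assumption that this field looks like NTL="6130290095" as in the sample shown.

```shell
#!/bin/sh
# Hypothetical helper: prints one CSV line per <INTLPO block found in the input.
parse_intlpo() {
    awk '
    # Clear every per-block variable so nothing leaks into the next block.
    function newrecord() {
        seen = 1
        data = equipment = numero = ntl = opc = rnp = csp = eip = cdo = ""
        cnl = nuf = tpb = cpt = cre = nue = dat = hor = tbr = man = ""
    }
    function printrecord() {
        print data, equipment, numero, ntl, opc, rnp, csp, eip,
              cdo, cnl, nuf, tpb, cpt, cre, nue, dat, hor, tbr, man
    }
    BEGIN { OFS = "," }
    # A new block begins: flush the previous block (if any), then reset.
    /^<INTLPO/    { if (seen) printrecord(); newrecord() }
    /^VECTURA/    { equipment = $4; data = $5 }
    # Assumption: field 2 looks like NTL="6130290095"; keep the quoted part only.
    /^INTLPO/     { split($2, q, "\""); numero = q[2] }
    /^[ \t]*NTL/  { ntl = $3; opc = $6; rnp = $9; csp = $12 }
    /^[ \t]*EIP/  { eip = $3 }
    /^[ \t]*CDO/  { cdo = $3 }
    /^[ \t]*CNL/  { cnl = $3; nuf = $6; tpb = $9 }
    /^[ \t]*CPT/  { cpt = $3; cre = $6; nue = $9 }
    /^[ \t]*DAT/  { dat = $3; hor = $6 }
    /^[ \t]*TBR/  { tbr = $3 }
    /^[ \t]*RST/  { man = $2 }
    # Flush the final block, unless no block was seen at all.
    END { if (seen) printrecord() }
    ' "$@"
}

# Shortened copy of the sample: one successful and one unsuccessful query.
sample_file=$(mktemp)
cat > "$sample_file" <<'EOF'
<INTLPO:ISV=PORTAB NTL="6130290095" VEM=NAO;
VECTURA - SS            BSA002             2020-09-12            09-32
INTLPO:ISV=PORTAB NTL="6130290095" VEM=NAO;
INTERROGACAO DE NUMERO TELEFONICO PARA PORTABILIDADE NUMERICA

  TIPO DE ENCAMINHAMENTO POR ASSINANTE
  NTL = 6130290095           OPC = S_INF    RNP = 551      CSP = 25
  EIP = S_INF
  CDO = 00961
  CNL = 61000                NUF = S_INF                   TPB = PREST
  CPT = NAO                  CRE = 125      NUE = S_INF
  DAT = 2014-04-16           HOR = 10:30:20.798609
  TBR = 25
  RST              MAN      RST              MAN      RST              MAN
  2%               934      3%               934      4%               934

  TOTAL DE NUMEROS ASSOCIADOS AO SERVICO: 1
<INTLPO:ISV=PORTAB NTL="6160150178" VEM=NAO;
VECTURA - SS            BSA002             2020-09-12            09-32
INTLPO:ISV=PORTAB NTL="6160150178" VEM=NAO;
INTERROGACAO DE NUMERO TELEFONICO PARA PORTABILIDADE NUMERICA

  ME:  NENHUM NUMERO CADASTRADO ATENDE AS ESPECIFICACOES
EOF

parse_intlpo "$sample_file"
```

On this sample the sketch should print two lines: a fully populated one for 6130290095, and one for 6160150178 in which only the date, equipment and numero fields are filled, because every other variable was cleared when the second block started. The temporary file is left in place for inspection; remove it with rm "$sample_file" when done.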
