PLSQL - REGEXP_SUBSTR to remove XML tags

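Problem description

I currently have a payload generated by the Oracle XML Gateway, and I need to extract some specific information from it. The payload is stored in a database table, which means I am trying to use regexp_substr to do the job.

This is the tag, located in the middle of the XML document: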

<IDCODE>S2200</IDCODE>

    "<?xml version="1.0" encoding="UTF-8" standalone='no'?>
<!DOCTYPE PROCESS_INVOICE_002 SYSTEM "asfasdf.dtd">
<!-- Oracle eXtensible Markup Language Gateway Server  -->
<PROCESS_INVOICE_002>
  <CNTROLAREA>
    <BSR>
      <VERB value="PROCESS"/>
      <NOUN value="INVOICE"/>
      <REVISION value="002"/>
    </BSR>
    <SENDER>
      <LOGICALID/>
      <COMPONENT/>
      <TASK/>
      <REFERENCEID/>
      <CONFIRMATION/>
      <LANGUAGE/>
      <CODEPAGE/>
      <AUTHID/>
    </SENDER>
    <DATETIME qualifier="CREATION">
      <YEAR/>
      <MONTH/>
      <DAY/>
      <HOUR/>
      <MINUTE/>
      <SECOND/>
      <SUBSECOND/>
      <TIMEZONE/>
    </DATETIME>
  </CNTROLAREA>
  <DATAAREA>
    <PROCESS_INVOICE>
      <INVHEADER>
        <AMOUNT qualifier="DOCUMENT" type="T" index="1">
          <VALUE>78538</VALUE>
          <NUMOFDEC>8</NUMOFDEC>
          <SIGN>+</SIGN>
          <CURRENCY>USD</CURRENCY>
          <DRCR>D</DRCR>
        </AMOUNT>
        <DATETIME qualifier="DOCUMENT" index="1">
          <YEAR>2020</YEAR>
          <MONTH>11</MONTH>
          <DAY>28</DAY>
          <HOUR>00</HOUR>
          <MINUTE>00</MINUTE>
          <SECOND>00</SECOND>
          <SUBSECOND>0000</SUBSECOND>
          <TIMEZONE>+0000</TIMEZONE>
        </DATETIME>
        <DOCUMENTID>81989184</DOCUMENTID>
        <DESCRIPTN/>
        <DOCTYPE>INV</DOCTYPE>
        <PAYMETHOD/>
        <REASONCODE/>
        <USERAREA>
          <NOTEREFCODE/>
          <NOTESREF/>
          <VENDNUMQUAL>IA</VENDNUMQUAL>
          <VENDNUM>98181</VENDNUM>
          <DEPTNUMQUAL>DP</DEPTNUMQUAL>
          <DEPTNUM>85</DEPTNUM>
          <ORDNUMQUAL/>
          <ORDNUM>0</ORDNUM>
          <CUSTCODEQUAL/>
          <CUSTCODE/>
          <NETDAYS/>
          <DATETIMEQUAL/>
          <FOBCODE/>
          <UOM/>
          <TOTALQUANTITY/>
        </USERAREA>
        <PARTNER>
          <NAME index="1">COMPANY NAME</NAME>
          <ONETIME/>
          <PARTNRID/>
          <PARTNRTYPE>Supplier</PARTNRTYPE>
          <SYNCIND/>
          <ACTIVE/>
          <CURRENCY/>
          <DESCRIPTN/>
          <DUNSNUMBER/>
          <GLENTITYS/>
          <PARENTID/>
          <PARTNRIDX/>
          <PARTNRRATG/>
          <PARTNRROLE/>
          <PAYMETHOD/>
          <TAXEXEMPT/>
          <TAXID/>
          <TERMID/>
          <USERAREA>
            <IDQUAL/>
            <IDCODE/>
          </USERAREA>
          <CONTACT>
            <NAME index="1">PROFILE</NAME>
            <CONTCTTYPE/>
            <DESCRIPTN/>
            <EMAIL/>
            <FAX index="1"/>
            <TELEPHONE index="1"/>
            <USERAREA/>
          </CONTACT>
        </PARTNER>
        <PARTNER>
          <NAME index="1">CUSTOMER NAME</NAME>
          <ONETIME/>
          <PARTNRID>981698198</PARTNRID>
          <PARTNRTYPE>ShipTo</PARTNRTYPE>
          <SYNCIND/>
          <ACTIVE/>
          <CURRENCY/>
          <DESCRIPTN/>
          <DUNSNUMBER/>
          <GLENTITYS/>
          <PARENTID/>
          <PARTNRIDX/>
          <PARTNRRATG/>
          <PARTNRROLE/>
          <PAYMETHOD/>
          <TAXEXEMPT/>
          <TAXID/>
          <TERMID/>
          <USERAREA>
            <IDQUAL>ZZ</IDQUAL>
            <IDCODE>S2200</IDCODE>
          </USERAREA>
          <ADDRESS>
            <ADDRLINE index="1">123 MAIN STREET</ADDRLINE>
            <ADDRTYPE/>
            <CITY>HAM CITY</CITY>
            <COUNTRY>United States</COUNTRY>
            <COUNTY>NEW YORK</COUNTY>
            <DESCRIPTN/>
            <FAX index="1"/>
            <POSTALCODE>18080</POSTALCODE>
            <REGION/>
            <STATEPROVN>NY</STATEPROVN>
            <TAXJRSDCTN/>
            <TELEPHONE index="1"/>
            <URL/>
            <USERAREA/>
          </ADDRESS>

The regular expressions I am using in my query:

TRIM(regexp_substr(ed.payload, '?.+(</IDCODE>)')) Store_NUM,
TRIM(regexp_substr(ed.payload, '(^IDCODE)?.+(</IDCODE>)')) Store_Number

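Below is the result I get from the regexp_substr calls above. The problem is that I am matching the correct tag, but I don't know how to strip <IDCODE> and </IDCODE> from the output.

- The field can have 4 or 5 characters, letters or digits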

<IDCODE>S2200</IDCODE> Store_NUM

<IDCODE>S2200</IDCODE> Store_Number

Tags: sql, xml, oracle, plsql, payload

Solution


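If I understand you correctly, I believe this is what you are looking for. The sixth argument (subexpression 1) makes REGEXP_SUBSTR return only what the parenthesized group between the tags captures, rather than the whole match.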

SELECT REGEXP_SUBSTR('<IDCODE>S2200</IDCODE>', '<IDCODE>(.*)</IDCODE>', 1, 1, NULL, 1) Store_Number
from dual;


STORE_NUMBER
------------
S2200       
1 row selected.
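
The query above runs against a string literal. As a minimal sketch of how the same capture-group call might be applied to the stored payload, the example below builds a one-row stand-in for the table with a WITH clause. The alias ed and the column payload come from the question; the inline sample row is an assumption made only so the query is self-contained. The pattern also narrows (.*) to [[:alnum:]]{4,5}, reflecting the stated rule that the field holds 4 or 5 letters or digits.

```sql
-- Hypothetical one-row stand-in for the real table behind the alias "ed".
WITH ed AS (
  SELECT '<USERAREA><IDQUAL/><IDCODE/></USERAREA>' || CHR(10) ||
         '<USERAREA><IDQUAL>ZZ</IDQUAL><IDCODE>S2200</IDCODE></USERAREA>' AS payload
  FROM dual
)
SELECT REGEXP_SUBSTR(
         ed.payload,
         '<IDCODE>([[:alnum:]]{4,5})</IDCODE>', -- group 1: 4 or 5 letters/digits
         1,     -- start searching at the first character
         1,     -- take the first occurrence
         NULL,  -- default match parameters
         1      -- return capture group 1 instead of the whole match
       ) AS store_number
FROM ed;
```

With the sample row this should return S2200: the empty <IDCODE/> element in the first partner block does not match because the pattern requires an opening <IDCODE> tag followed by content. The subexpression argument is available from Oracle 11g onward.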
