Python regex for the INSERT INTO SELECT pattern

Problem description

I am reading a file in Python using "import re". My code is:

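>import re
>
># Read the whole file into one string
>fh = open(r"ABC.PACKAGE", "r").read()
>
>pattern_string = "INSERT INTO .*"
>
># Print every match; "." does not cross newlines, so each match is one line
>for line in re.findall(pattern_string, fh):
>    print(line)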

My file contains three types of INSERT statements, as shown below:

>2.  INSERT INTO CMS.TMP_SEARCH_TOKEN_CODES (
        SELECT TOKEN.CODE   
        FROM CMS.TOKEN 
        INNER JOIN CMS.VEHICLE_INFO ON TOKEN.VEHICLE_INFO_ID = VEHICLE_INFO.VEHICLE_INFO_ID
        AND VEHICLE_INFO.PLATE_NUMBER = P_VEHICLE_NO AND TOKEN.IS_ACTIVE = 1 AND VEHICLE_INFO.IS_ACTIVE = 1
        UNION ALL
        SELECT TOKEN.CODE   
        FROM CMS.TOKEN 
        INNER JOIN CMS.VEHICLE_LINK_DATA  ON VEHICLE_LINK_DATA.TOKEN_ID = TOKEN.TOKEN_ID
        AND PLATE_NUMBER = P_VEHICLE_NO AND TOKEN.IS_ACTIVE = 1 AND TOKEN.VEHICLE_LINK_TYPE_ID = 2
        AND VEHICLE_LINK_DATA.IS_ACTIVE = 1
        );                

>3. INSERT INTO CMS.MAPSEARCH_TABLE
                    SELECT DISTINCT
                            DENSE_RANK() OVER (PARTITION BY CUSTOMER.CUSTOMER_ID ORDER BY  CUSTOMER.CUSTOMER_ID, ,
                            CUSTOMER_TYPE.AR_CUSTOMER_TYPE AS CUSTOMER_TYPE_NAME_AR  
                        FROM  
                            CUSTOMER
                            INNER JOIN CUSTOMER_STATUS ON CUSTOMER_STATUS.CUSTOMER_STATUS_ID=CUSTOMER.STATUS_ID 
                            AND  (CUSTOMER.CUSTOMER_ID=P_CUSTOMERID  OR P_CUSTOMERID IS NULL)
                            LEFT OUTER JOIN CUSTOMER_TYPE ON CUSTOMER.CUSTOMER_TYPE_ID = CUSTOMER_TYPE.CUSTOMER_TYPE_ID
                            LEFT OUTER JOIN BENEFICIARY ON
                            (BENEFICIARY.CUSTOMER_ID = P_CUSTOMERID   OR P_CUSTOMERID IS NULL)
                            AND  CUSTOMER.CUSTOMER_ID = BENEFICIARY.CUSTOMER_ID
                        WHERE
                           (P_BENEFICIARYCODE IS NULL OR UPPER(BENEFICIARY.CODE) LIKE '%' ||UPPER(P_BENEFICIARYCODE)||'%')
                           AND(P_BENEFICIARYNAME IS NULL OR UPPER(BENEFICIARY.NAME) LIKE '%' ||UPPER(P_BENEFICIARYNAME)||'%')
                                       ORDER BY BENEFICIARY.BENEFICIARY_ID;

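I want to print the queries that follow patterns like 2 and 3, but I cannot come up with the regular expression. It should be something like: starts with INSERT INTO .*, is followed by (SELECT( or SELECT(, and ends with a semicolon.

I am not using a SQL parser; I want to do this with a regular expression.

Tags: python, regex, expression, sql-insert

Solution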


Assuming that by "pattern like 2 and 3" you mean SELECT-based insert statements, you can try this: r"INSERT INTO[^;]*SELECT[^;]*;"
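To make the pieces of that pattern easier to read, here is a minimal sketch that writes the same expression with re.VERBOSE and a comment on each part. The name INSERT_SELECT and the sample string are made up for illustration and do not come from the original file.

```python
import re

# Same pattern as above, spelled out piece by piece with re.VERBOSE.
# In verbose mode unescaped spaces are ignored, so the space inside
# "INSERT INTO" is escaped with a backslash.
INSERT_SELECT = re.compile(
    r"""
    INSERT\ INTO   # literal start of the statement
    [^;]*          # anything except a semicolon, newlines included
    SELECT         # the statement must contain the word SELECT
    [^;]*          # rest of the query, again up to the next semicolon
    ;              # terminating semicolon
    """,
    re.VERBOSE,
)

# Hypothetical sample input: one VALUES insert, one SELECT-based insert.
sample = "INSERT INTO T1 (id) VALUES (1);\nINSERT INTO T2 (\n  SELECT id FROM T1\n);"
matches = INSERT_SELECT.findall(sample)
print(matches)
```

Because [^;] is a negated character class, it matches line breaks too, which is why the multi-line statement is found without any extra flags.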

Update, following the edit to the problem description:

import re

data = """
1. INSERT INTO TABLE1 (id,name)VALUES('123','ABS',);
2. INSERT INTO TABLE1 (SELECT(..blah..blah));
5. UPDATE ...;
3. INSERT INTO TABLE1 SELECT(..blah..blah);
4. DELETE FROM ...;
5. INSERT INTO CMS.TMP_SEARCH_TOKEN_CODES ( 
       SELECT TOKEN.CODE 
         FROM CMS.TOKEN INNER JOIN CMS.VEHICLE_INFO ON TOKEN.VEHICLE_INFO_ID = VEHICLE_INFO.VEHICLE_INFO_ID UNION ALL SELECT TOKEN.CODE FROM CMS.TOKEN INNER JOIN CMS.VEHICLE_LINK_DATA ON VEHICLE_LINK_DATA.TOKEN_ID = TOKEN.TOKEN_ID );
6. UPDATE ....;
"""
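# [^;]* matches any character except ';', including newlines, so each match
# runs from INSERT INTO through SELECT up to the terminating semicolon.
# re.MULTILINE and re.DOTALL are harmless here: the pattern has no ^, $ or "."
pattern_string = r"INSERT INTO[^;]*SELECT[^;]*;"
q = re.findall(pattern_string, data, re.MULTILINE | re.DOTALL)
print(q)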
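One caveat with the pattern above is that it accepts any INSERT with the word SELECT anywhere before the semicolon, for example a VALUES insert whose string literal happens to contain SELECT. The following is a sketch of a tighter variant, closer to the "INSERT INTO, then (SELECT or SELECT" wording in the question. It rests on assumptions that may not hold for every file: the table name consists only of word characters and dots, nothing but whitespace and an optional opening parenthesis sits between the table name and SELECT, and keywords may appear in any letter case. The name STRICT_INSERT_SELECT and the sample data are made up for illustration.

```python
import re

# Assumed shape: INSERT INTO <schema.table> [(] SELECT ... ;
STRICT_INSERT_SELECT = re.compile(
    r"INSERT\s+INTO\s+[\w.]+\s*\(?\s*SELECT\b[^;]*;",
    re.IGNORECASE,
)

# The looser pattern from the answer, for comparison.
LOOSE_INSERT_SELECT = re.compile(r"INSERT INTO[^;]*SELECT[^;]*;")

# Hypothetical sample statements.
sample = """
1. INSERT INTO TABLE1 (id,name) VALUES ('123','SELECT');
2. INSERT INTO TABLE1 (SELECT a FROM t);
3. insert into CMS.TABLE2
       SELECT b FROM u;
"""

strict = STRICT_INSERT_SELECT.findall(sample)
loose = LOOSE_INSERT_SELECT.findall(sample)

for statement in strict:
    print(statement)
```

Here the strict pattern skips statement 1, which the looser pattern would return, and it also picks up the lowercase statement 3. The trade-off is that an insert with an explicit column list before the SELECT would no longer match, and neither pattern copes with a semicolon inside a string literal.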

Update: reading from and writing to a file, with the line breaks removed:

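import re

with open("ABC.PACKAGE", "r") as fin, open("OUT.SQL", "w") as fout:
    # Find every INSERT ... SELECT statement up to its terminating semicolon
    statements = re.findall(r"INSERT INTO[^;]*SELECT[^;]*;", fin.read())
    # Collapse each run of whitespace (line breaks included) into one space
    statements = [re.sub(r"\s+", " ", s) for s in statements]
    # Text mode already translates "\n" to the platform line separator
    fout.write("\n".join(statements))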
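If it helps to locate each statement in the source file, the same pattern can be used with re.finditer, which exposes the position of every match. The sketch below is only an illustration: find_insert_selects is a hypothetical helper, and the sample string is a made-up stand-in for the contents of ABC.PACKAGE.

```python
import re

PATTERN = re.compile(r"INSERT INTO[^;]*SELECT[^;]*;")


def find_insert_selects(text):
    """Return (line_number, statement) pairs for SELECT-based inserts.

    Hypothetical helper for illustration; not part of the original answer.
    """
    results = []
    for match in PATTERN.finditer(text):
        # 1-based number of the line on which the match starts
        line_number = text.count("\n", 0, match.start()) + 1
        # Collapse every run of whitespace into a single space
        statement = re.sub(r"\s+", " ", match.group(0))
        results.append((line_number, statement))
    return results


# Made-up package body standing in for the real file contents.
sample = "BEGIN\n  DELETE FROM T0;\n  INSERT INTO T1 (\n    SELECT id FROM T2\n  );\nEND;"
for line_number, statement in find_insert_selects(sample):
    print(line_number, statement)
```

Working on a string rather than a file object keeps the helper easy to try out; for the real file, the text passed in would be the result of reading ABC.PACKAGE.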
