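regex - Extract all rows based on whether two columns have the same meaning, using a regular expression in PostgreSQL

Problem description

I want to extract all rows in which two columns have, or do not have, the same meaning, using a regular expression in PostgreSQL.

Example: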

col1                    | col2         | result
------------------------+--------------+-----------------------
teste 1452-251-99 azert | 1425-251-99  | Same meaning
teste 1225-71-45        | 1225--71.45  | Same meaning
teste 1288-91-75        | 1225--71.45  | Not the same meaning

The value in col2 must have the format \d{3,6}-\d{3}-\d{2}; I treat that column as the correct value.

I cannot find the right query. This is my attempt:

update my_table 
set result = 'Not the same meaning' 
where id in (select t.id from my_table t
             where col1 ~'\d{3,6}-\d{3}-\d{2}' 
               and col1 not like format('%%%s%%', col2)
);

But this marks every row in which the two columns are not literally the same:

col1                    | col2         | result
------------------------+--------------+-----------------------
teste 1452-251-99 azert | 1425-251-99  | Not the same meaning
teste 1225-71-45        | 1225--71.45  | Not the same meaning
teste 1288-91-75        | 1225--71.45  | Not the same meaning
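
A quick way to see why the LIKE comparison rejects the second sample row is to evaluate it on literal values. This is only a sketch using the values from the table above: LIKE treats the `--` and `.` in col2 as ordinary characters, so the substring has to appear in col1 exactly as written.

```sql
-- Values copied from the second sample row above.
select format('%%%s%%', '1225--71.45')                         as like_pattern,   -- %1225--71.45%
       'teste 1225-71-45' like format('%%%s%%', '1225--71.45') as literal_match;  -- false
```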

Tags: regex, postgresql

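Solution

In col2, replace every run of special characters between digits with a single hyphen, then use a regular expression to check whether col1 contains a match for the resulting pattern within word boundaries: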

where col1 ~'\d{3,6}-\d{3}-\d{2}' 
and 
col1 !~ CONCAT('\y', REGEXP_REPLACE(col2, '(?<=\d)[^[:space:]0-9]+(?=\d)', '-', 'g'), '\y')

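For the value 1225--71.45 in col2, the CONCAT('\y', REGEXP_REPLACE(col2, '(?<=\d)[^[:space:]0-9]+(?=\d)', '-', 'g'), '\y') part yields \y1225-71-45\y. That pattern matches 1225-71-45 as a "whole word", that is, only when the number is not immediately preceded or followed by a word character. The inner regular expression (?<=\d)[^[:space:]0-9]+(?=\d) matches one or more characters that are neither whitespace nor digits and that sit between two digits.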
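
The two steps can be tried end to end with a small script. This is a minimal sketch, not the original poster's schema: the temporary table and its rows are hypothetical, adapted from the question so that every number fits the stated \d{3,6}-\d{3}-\d{2} format. It also assumes a PostgreSQL version whose regex engine supports lookbehind constraints and the default standard_conforming_strings = on, so that backslashes in the literals are taken as written. The CASE expression is a variation on the WHERE clause above that labels both outcomes in one statement.

```sql
-- Hypothetical sample data; the values are assumptions, not the question's rows.
create temporary table my_table (
    id     serial primary key,
    col1   text,
    col2   text,
    result text
);

insert into my_table (col1, col2) values
    ('teste 1452-251-99 azert', '1452-251-99'),
    ('teste 1225-714-45',       '1225--714.45'),
    ('teste 1288-915-75',       '1225--714.45');

-- Step 1: inspect the pattern built from col2.
-- For '1225--714.45' this is \y1225-714-45\y.
select col2,
       concat('\y',
              regexp_replace(col2, '(?<=\d)[^[:space:]0-9]+(?=\d)', '-', 'g'),
              '\y') as pattern
from my_table;

-- Step 2: label each row whose col1 contains a number in the expected format.
update my_table
set result = case
        when col1 ~ concat('\y',
                           regexp_replace(col2, '(?<=\d)[^[:space:]0-9]+(?=\d)', '-', 'g'),
                           '\y')
        then 'Same meaning'
        else 'Not the same meaning'
    end
where col1 ~ '\d{3,6}-\d{3}-\d{2}';

-- Rows 1 and 2 end up as 'Same meaning', row 3 as 'Not the same meaning'.
select col1, col2, result from my_table order by id;
```

One caveat: the normalized col2 value is used as a regular expression, so any regex metacharacter that survives the replacement (for example a leading or trailing `.`) is interpreted as a metacharacter rather than as literal text.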
