首页 > 解决方案 > 通过分隔符拆分电子邮件列表返回超过预期的行数

问题描述

我有以下架构和 SQL 脚本:http ://www.sqlfiddle.com/#!4/9c7bb/9

create table a(
 cond varchar2(400),
 email varchar2(400) 
);
insert into a values('CPU_consumption', 'name_1@email.com;name_2@email.com;name_3@email.com');
insert into a values('number of processes', 'name_1@email.com');
insert into a values('memory consumption', 'name_1@email.com;name_2@email.com;');

SELECT 
 regexp_substr(email,'((.*?)*?)(;|$)',1,level,NULL,1) AS result
FROM a
WHERE cond = 'memory consumption'
CONNECT BY level < regexp_count(email,'((.*?)*?)(;|$)');

结果:

name_2@email.com
name_2@email.com
name_1@email.com
name_2@email.com

谁能解释为什么结果不仅是 2 行的电子邮件地址?为什么显示第二封电子邮件;地址两次,然后是所需的列表?

标签: sqlregexoracle

解决方案


因为与其他值connect by混合。memory consumption如果您在查询中添加路径和根目录,则可以检查它:

select level, regexp_substr(email,'((.*?)*?)(;|$)', 1, level, null, 1) as result, 
       connect_by_root(cond) root, sys_connect_by_path(cond, ' -> ') path
  from a 
  where cond = 'memory consumption' 
  connect by level < regexp_count(email,'((.*?)*?)(;|$)');

您应该从正确的值开始,并在接下来的步骤中检查它:

select level, regexp_substr(email,'((.*?)*?)(;|$)', 1, level, null, 1) as result
  from a 
  connect by level < regexp_count(email,'((.*?)*?)(;|$)') and cond = 'memory consumption'
  start with cond = 'memory consumption';

小提琴


推荐阅读