regex - Split column values into multiple rows based on the presence of a special character and a string match in Postgres

Problem description

I have the following table in Postgres 11:

col1            col2                       col3
NCT00065442 APC-Placebo                    apc-placebo
NCT00135226 Placebo                        placebo
NCT00146640 MR Prednisone                  mr prednisone
NCT00146640 Placebo - IR Prednisone        placebo - ir prednisone

I want to split col3 when the string contains "placebo" and also contains the special character "-".

The desired output is:

col1            col2                       col3
NCT00065442 APC-Placebo                    apc
NCT00065442 APC-Placebo                    placebo
NCT00135226 Placebo                        placebo
NCT00146640 MR Prednisone                  mr prednisone
NCT00146640 Placebo - IR Prednisone        placebo
NCT00146640 Placebo - IR Prednisone        ir prednisone

So far I have tried the following query:

select *, 
case when col3 ilike '%placebo%' and col3 ~* '-'
        then unnest(string_to_array(col3, '-'))
     else col3
end
from table 
order by col1;

I also tried replacing the unnest(string_to_array(...)) call with:

UNNEST(REGEXP_SPLIT_TO_ARRAY(t.name, '\s*[-]\s*'))

Tags: regex, postgresql, split

Solution

Set-returning functions such as unnest() are not allowed inside a CASE expression. You can do this with UNION ALL instead:

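-- Rows that mention placebo: split col3 on '-' and trim each part
select col1, col2, trim(unnest(string_to_array(col3, '-'))) col3
from tablename
where col3 like '%placebo%'
union all
-- All other rows are returned unchanged
select col1, col2, col3
from tablename
where col3 not like '%placebo%'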

See the demo.
Result:

| col1        | col2                    | col3          |
| ----------- | ----------------------- | ------------- |
| NCT00065442 | APC-Placebo             | apc           |
| NCT00065442 | APC-Placebo             | placebo       |
| NCT00135226 | Placebo                 | placebo       |
| NCT00146640 | Placebo - IR Prednisone | placebo       |
| NCT00146640 | Placebo - IR Prednisone | ir prednisone |
| NCT00146640 | MR Prednisone           | mr prednisone |
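
The UNION ALL query above filters only on "placebo", so it would also split a placebo row on any hyphen it happens to contain, and it scans the table twice. As a possible alternative, here is a minimal sketch that applies both conditions from the question (contains "placebo" and contains "-") in a single pass, by moving unnest() into the FROM clause with a LATERAL join. The temporary table, its text column types, and the alias names (t, s, part) are assumptions made for illustration; they are not part of the original question or answer.

```sql
-- Assumed setup: a temporary copy of the sample data, all columns as text.
CREATE TEMP TABLE tablename (col1 text, col2 text, col3 text);

INSERT INTO tablename (col1, col2, col3) VALUES
    ('NCT00065442', 'APC-Placebo',             'apc-placebo'),
    ('NCT00135226', 'Placebo',                 'placebo'),
    ('NCT00146640', 'MR Prednisone',           'mr prednisone'),
    ('NCT00146640', 'Placebo - IR Prednisone', 'placebo - ir prednisone');

-- The CASE expression only chooses which array to expand; unnest() itself
-- sits in FROM, where set-returning functions are allowed.
SELECT t.col1, t.col2, s.part AS col3
FROM tablename AS t
CROSS JOIN LATERAL unnest(
    CASE
        -- Split only when both conditions from the question hold.
        WHEN t.col3 ILIKE '%placebo%' AND t.col3 LIKE '%-%'
            THEN regexp_split_to_array(t.col3, '\s*-\s*')
        -- Otherwise wrap the value in a one-element array so the row is kept.
        ELSE ARRAY[t.col3]
    END
) AS s(part)
ORDER BY t.col1;
```

The regular expression '\s*-\s*' removes whitespace around the hyphen, so no separate trim() is needed for the sample data. The order of rows that share the same col1 is not guaranteed by this ORDER BY.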
