Map two arrays by index in Snowflake

Problem description

I have a table with two columns of comma-separated values. I want to map them to each other by index.

item_list             item_type
'400,500,600,700'    'st1,st2,st2'

Expected output

item    type
400     st1
500     st2
600     st2
700     NULL

I tried the following, but it just repeats every value for each item.

select distinct a.value::string as a1
       ,b.value::string as a2
from (select '400,500,600,700' as c1
         ,'st1,st2,st2' as c2
     ) as x,
     lateral flatten(input=>split(c1, ',')) a,
     lateral flatten(input=>split(c2, ',')) b
order by a1;

Tags: snowflake-cloud-data-platform, snowflake-schema

Solution


One approach is to flatten each column separately and join the results on their index:

with 
x as (select '400,500,600,700'as c1,'st1,st2,st2' as c2),
a as (select value::string a1, index from x, lateral flatten(input=>split(x.c1, ','))),
b as (select value::string a2, index from x, lateral flatten(input=>split(x.c2, ',')))
select a1,a2 from a full outer join b on a.index=b.index;
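
The query above uses a single-row sample. Against a real table with many rows, joining on index alone would pair elements that come from different rows, so the join likely also needs a row key. A minimal sketch, assuming a hypothetical table items with a unique id column (neither name comes from the question):

```sql
-- Assumption: items(id, item_list, item_type) is a hypothetical table;
-- id uniquely identifies each source row.
with
a as (select t.id, f.index, f.value::string as item
      from items t, lateral flatten(input => split(t.item_list, ',')) f),
b as (select t.id, f.index, f.value::string as item_type
      from items t, lateral flatten(input => split(t.item_type, ',')) f)
select coalesce(a.id, b.id) as row_id, a.item, b.item_type
from a full outer join b
  on a.id = b.id and a.index = b.index          -- same row, same position
order by row_id, coalesce(a.index, b.index);    -- keep the original list order
```

The coalesce calls are there because either side of a full outer join can be NULL when one list is shorter than the other.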

Similarly, using SPLIT_TO_TABLE():

with 
x as (select '400,500,600,700'as c1,'st1,st2,st2' as c2),
a as (select value::string a1, index from x, lateral split_to_table(x.c1, ',')),
b as (select value::string a2, index from x, lateral split_to_table(x.c2, ','))
select a1,a2 from a full outer join b on a.index=b.index;
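
One detail worth knowing if the two functions are ever mixed: as far as I know from the Snowflake documentation, FLATTEN numbers array elements from 0, while SPLIT_TO_TABLE numbers them from 1. Each query above uses the same function on both sides, so the indexes line up. The sketch below, using the sample string from the question, shows the offset that would be needed otherwise:

```sql
-- FLATTEN index starts at 0; SPLIT_TO_TABLE index starts at 1.
select f.index as flatten_index,
       s.index as split_to_table_index,
       f.value::string as value
from (select 'st1,st2,st2' as c2) as x,
     lateral flatten(input => split(x.c2, ',')) f,
     lateral split_to_table(x.c2, ',') s
where f.index = s.index - 1   -- align the two numbering schemes
order by f.index;
```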

@Andrii Soldatenko added a further clarification (thanks):

To explain the idea, I suggest running this first:

select * from (select 'st1,st2' as c2) as x2,
     lateral flatten(input=>split(c2, ',')) b;
+---------+-----+------+------+-------+-------+----------+
| C2      | SEQ | KEY  | PATH | INDEX | VALUE | THIS     |
|---------+-----+------+------+-------+-------+----------|
| st1,st2 |   1 | NULL | [0]  |     0 | "st1" | [        |
|         |     |      |      |       |       |   "st1", |
|         |     |      |      |       |       |   "st2"  |
|         |     |      |      |       |       | ]        |
| st1,st2 |   1 | NULL | [1]  |     1 | "st2" | [        |
|         |     |      |      |       |       |   "st1", |
|         |     |      |      |       |       |   "st2"  |
|         |     |      |      |       |       | ]        |
+---------+-----+------+------+-------+-------+----------+
2 Row(s) produced. Time Elapsed: 1.357s
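
The INDEX column in this output is what ties the two lists together. The attempt in the question flattens both columns without any condition on INDEX, so every item is combined with every type. Adding that condition to the original query, as in the sketch below, removes the repetition, but it behaves like an inner join: an item with no matching type (700 in the sample) is dropped, which is why a full outer join is the safer choice when the lists can differ in length.

```sql
-- Same shape as the query in the question, plus a filter on INDEX.
-- Only positions present in both lists survive.
select a.value::string as a1,
       b.value::string as a2
from (select '400,500,600,700' as c1, 'st1,st2,st2' as c2) as x,
     lateral flatten(input => split(c1, ',')) a,
     lateral flatten(input => split(c2, ',')) b
where a.index = b.index
order by a.index;
```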
