首页 > 解决方案 > 输入雪花中缺失类别的数据

问题描述

我有一张像

在此处输入图像描述

对于每个关键字,有 2 种设备 - 移动设备和桌面设备。如果只找到一个设备的条目,那么它应该自动为其他设备创建条目,保持其余列中的数据相同。我目前正在做一个完整的外部连接,它适用于缺少一个设备类别但在两个设备都存在的情况下生成重复项的情况。例如,我当前的查询给出的结果为

select a.keyword, b.device, a.rating 
from kw a full outer join kw b 
on a.keyword=b.keyword and a.rating=b.rating 

在此处输入图像描述

我如何得到结果

在此处输入图像描述

标签: sqlsnowflake-cloud-data-platform

解决方案


The first step will be to identify records that don't have a paired record. There's a couple of ways to do this, but the easiest is probably just a quick GROUP BY/HAVING:

  SELECT keyword
  FROM kw
  GROUP BY keyword
  HAVING COUNT(*) = 1

You can those join those results back into the original table to generate the new records that are needed:

SELECT sk.keyword, 
    CASE WHEN kw.device = 'mobile' THEN 'desktop' ELSE 'mobile' END as device,
    kw.rating
FROM 
    (
      SELECT keyword
      FROM kw
      GROUP BY keyword
      HAVING COUNT(*) = 1
    )sk
    INNER JOIN kw ON kw.keyword = sk.keyword

Then you can UNION back in the original table to bring your new records and existing records into a single result set:

SELECT sk.keyword, 
    CASE WHEN kw.device = 'mobile' THEN 'desktop' ELSE 'mobile' END as device,
    kw.rating
FROM 
    (
      SELECT keyword
      FROM kw
      GROUP BY keyword
      HAVING COUNT(*) = 1
    )sk
    INNER JOIN kw ON kw.keyword = sk.keyword
UNION ALL
SELECT * FROM kw;

As another option that will scale if you add in more 'devices' is to cross join all the potential device/keyword combinations and then left join to your original table:

SELECT
    fe.keyword,
    fe.device,
    CASE WHEN kw.rating IS NULL THEN max(rating) OVER (PARTITION BY fe.keyword) ELSE kw.rating END AS rating
FROM 
    (
      SELECT DISTINCT kw.keyword, kw2.device
      FROM kw, kw kw2
    ) fe
    LEFT OUTER JOIN kw ON kw.keyword = fe.keyword
        AND kw.device = fe.device;

推荐阅读