首页 > 解决方案 > BigQuery SQL 'NOT IN' 不排除结果

问题描述

我正在尝试使用 过滤掉 BigQuery 查询中的一些记录WHERE NOT IN,但是,无论是否使用此代码,我都会得到相同数量的结果,并且仍然包含我所需的排除项。

这是我的查询:

SELECT 
    EmailHash,
    DATE(MAX(OrderDate)) AS max_ord_date
    FROM `project.dataset.Header`
    WHERE Date(OrderDate) = '2019-05-15'
    # AND mb_company = 'CORE'
    AND CancelledFlag IS NOT True
    AND AmountPaid > 0
    # Exclude these order numbers
    AND ordernumber NOT IN (
        SELECT ordernumber FROM(
            SELECT 
            ordernumber,
            SUM(SAFE_CAST(StockOutQuantity AS INT64)) AS so,
            SUM(QuantityCancelled) AS cx,
            SUM(Quantity) AS ord
            FROM `project.dataset.Detail`
            WHERE Date(OrderDate) = '2019-05-15'
            GROUP BY ordernumber
            HAVING so = ord)
            )
    # Exclude these order numbers
    AND ordernumber NOT IN (
        SELECT ordernumber FROM(
            SELECT 
            ordernumber,
            SUM(AmountPaid) ap,
            ROUND(SUM(AmountPaid) - (SUM(PnPOrder) + SUM(PnPOrderTax)),2) ap_ntx,
            SUM(RefundedValue) rv,
            SUM(GoodsNet) gn
            FROM `project.dataset.Header`
            WHERE Date(OrderDate) = '2019-05-15'
            GROUP BY ordernumber
            HAVING (ap = rv) OR (ap_ntx = rv) OR (gn = rv)
            )
    )
    GROUP BY EmailHash

运行此查询将返回 10802 个结果。

当我运行下面的 SQL 时,我看到 17 个订单应该从总结果中排除。

SELECT ordernumber FROM(
                SELECT 
                ordernumber,
                SUM(SAFE_CAST(StockOutQuantity AS INT64)) AS so,
                SUM(QuantityCancelled) AS cx,
                SUM(Quantity) AS ord
                FROM `project.dataset.Detail`
                WHERE Date(OrderDate) = '2019-05-15'
                GROUP BY ordernumber
                HAVING so = ord)

我的第二个 NO 子句应该从总数中再删除 293 个订单,但这些订单号仍然存在。如果我注释掉这两个 WHERE 子句,我会得到相同数量的结果,所以很明显它们返回的订单没有被排除,我的代码中缺少什么来排除这些记录?

标签: sqlgoogle-bigquery

解决方案


您是否正在寻找订单号的队列之间的 NOT IN the UNION

SELECT 
    EmailHash,
    DATE(MAX(OrderDate)) AS max_ord_date
    FROM `project.dataset.Header`
    WHERE Date(OrderDate) = '2019-05-15'
    # AND mb_company = 'CORE'
    AND CancelledFlag IS NOT True
    AND AmountPaid > 0
    # Exclude these order numbers
    AND ordernumber NOT IN (
        SELECT ordernumber FROM(
            SELECT 
            ordernumber,
            SUM(SAFE_CAST(StockOutQuantity AS INT64)) AS so,
            SUM(QuantityCancelled) AS cx,
            SUM(Quantity) AS ord
            FROM `project.dataset.Detail`
            WHERE Date(OrderDate) = '2019-05-15'
            GROUP BY ordernumber
            HAVING so = ord)
        UNION 
        SELECT ordernumber FROM(
            SELECT 
            ordernumber,
            SUM(AmountPaid) ap,
            ROUND(SUM(AmountPaid) - (SUM(PnPOrder) + SUM(PnPOrderTax)),2) ap_ntx,
            SUM(RefundedValue) rv,
            SUM(GoodsNet) gn
            FROM `project.dataset.Header`
            WHERE Date(OrderDate) = '2019-05-15'
            GROUP BY ordernumber
            HAVING (ap = rv) OR (ap_ntx = rv) OR (gn = rv)
            )
    )
    GROUP BY EmailHash

推荐阅读