Hive query: subquery in a filter expression

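Problem description

I am building a query with Spark SQL. The actual query is fairly large, but the problem I am running into is the following error: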

Error in SQL statement: AnalysisException: IN/EXISTS predicate sub-queries can only be used in a Filter: Aggregate

The following part of the query is what causes the trouble:

SELECT col1, col2,
       SUM(CASE
               WHEN snoozed_until IS NULL
                AND hired IS NULL
                AND lower(profile_archive_status) = 'true'
                AND profile_id NOT IN (SELECT profile_id
                                       FROM candidates_feedback f
                                       WHERE lower(f.a1) LIKE '%no%')
               THEN 1
               ELSE 0
           END) Archived,
       SUM(CASE
               WHEN snoozed_until IS NULL
                AND hired IS NULL
                AND lower(profile_archive_status) = 'true'
                AND profile_id IN (SELECT profile_id
                                   FROM candidates_feedback f
                                   WHERE lower(f.a1) LIKE '%no%')
               THEN 1
               ELSE 0
           END) Rejected
FROM table;

Tags: sql, apache-spark, hive, apache-spark-sql

Solution


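As the error message says, the Spark version that raises it only accepts IN/EXISTS subqueries in a filter, not inside an aggregate expression such as SUM(CASE ...). Move the subquery into a LEFT JOIN and test the joined column for NULL instead: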

SELECT col1, col2,
       SUM(CASE
               WHEN snoozed_until IS NULL
                AND hired IS NULL
                AND lower(profile_archive_status) = 'true'
                AND p.profile_id IS NULL      -- no matching '%no%' feedback
               THEN 1
               ELSE 0
           END) Archived,
       SUM(CASE
               WHEN snoozed_until IS NULL
                AND hired IS NULL
                AND lower(profile_archive_status) = 'true'
                AND p.profile_id IS NOT NULL  -- at least one '%no%' feedback
               THEN 1
               ELSE 0
           END) Rejected
FROM table t
     LEFT JOIN
     -- DISTINCT keeps one row per profile, so the join cannot duplicate rows of t
     (SELECT DISTINCT profile_id
      FROM candidates_feedback f
      WHERE lower(f.a1) LIKE '%no%') p ON t.profile_id = p.profile_id
-- col1 and col2 are not aggregated, so they have to be grouped
GROUP BY col1, col2;
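
To check that the LEFT JOIN form counts the same rows as the IN / NOT IN form, here is a minimal sketch in Python. It rests on several assumptions: SQLite (via the standard `sqlite3` module) stands in for Spark SQL purely to compare the logic, so it does not reproduce the AnalysisException; the table name `profiles` replaces `table`; the sample rows are invented for illustration; and a `GROUP BY col1, col2` is added so that both queries run.

```python
import sqlite3

# Hypothetical sample data. SQLite is only a stand-in for Spark SQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE profiles (
    col1 TEXT, col2 TEXT, profile_id INTEGER,
    snoozed_until TEXT, hired TEXT, profile_archive_status TEXT
);
CREATE TABLE candidates_feedback (profile_id INTEGER, a1 TEXT);

INSERT INTO profiles VALUES
    ('a', 'x', 1, NULL, NULL,  'TRUE'),
    ('a', 'x', 2, NULL, NULL,  'true'),
    ('a', 'x', 3, NULL, 'yes', 'true'),
    ('b', 'y', 4, NULL, NULL,  'true'),
    ('b', 'y', 5, NULL, NULL,  'false');

INSERT INTO candidates_feedback VALUES
    (1, 'No fit'), (1, 'no, thanks'), (4, 'yes');
""")

# Conditions shared by both CASE expressions in both queries.
base = """snoozed_until IS NULL AND hired IS NULL
          AND lower(profile_archive_status) = 'true'"""
sub = """SELECT DISTINCT profile_id FROM candidates_feedback f
         WHERE lower(f.a1) LIKE '%no%'"""

# Form from the question: IN / NOT IN inside the aggregate.
in_query = f"""
SELECT col1, col2,
       SUM(CASE WHEN {base} AND profile_id NOT IN ({sub}) THEN 1 ELSE 0 END),
       SUM(CASE WHEN {base} AND profile_id IN ({sub}) THEN 1 ELSE 0 END)
FROM profiles
GROUP BY col1, col2
ORDER BY col1, col2
"""

# Form from the answer: LEFT JOIN plus a NULL test on the joined column.
join_query = f"""
SELECT col1, col2,
       SUM(CASE WHEN {base} AND p.profile_id IS NULL THEN 1 ELSE 0 END),
       SUM(CASE WHEN {base} AND p.profile_id IS NOT NULL THEN 1 ELSE 0 END)
FROM profiles t
LEFT JOIN ({sub}) p ON t.profile_id = p.profile_id
GROUP BY col1, col2
ORDER BY col1, col2
"""

in_rows = conn.execute(in_query).fetchall()
join_rows = conn.execute(join_query).fetchall()

print(join_rows)             # [('a', 'x', 1, 1), ('b', 'y', 1, 0)]
print(in_rows == join_rows)  # True
```

Profile 1 has two matching feedback rows but is counted once as Rejected, which is what the DISTINCT in the subquery is for. The two forms can differ when NULLs are involved: if the subquery returns a NULL profile_id, NOT IN never evaluates to true and the Archived count drops to 0, while the LEFT JOIN form ignores that NULL. A row whose own profile_id is NULL is likewise counted as Archived by the join form but not by NOT IN (when the subquery returns at least one row).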
