首页 > 解决方案 > Hive:失败:在多个表/子查询中发现 SemanticException 列

问题描述

我正在尝试按如下方式运行查询

hive -e "set hive.execution.engine=mr;set hive.strict.checks.cartesian.product=false;
set hive.mapred.mode=nonstrict;use db1; select col1,col2 from tb1 where col_date='2020-08-15' and col3='Y' 
and col4='val4' and col1 not in 
( select distinct col1 from db2.tb2 where col_date='2020-08-15' and 
col5='val5' and col6='val6' and col3='Y' and col4='val4') " 

但我一直在

FAILED: SemanticException Column col1 Found in more than One Tables/Subqueries

我究竟做错了什么 ?我怎样才能解决这个问题 ?

中的列db1.tb1

col1
col2
col_date
col3
col4

中的列db2.tb2

col1
col2
col_date
col3
col4
col5
col6

标签: sqlhivehiveql

解决方案


为表添加别名并将它们用于所有列:

hive -e "set hive.execution.engine=mr;set hive.strict.checks.cartesian.product=false;
set hive.mapred.mode=nonstrict;
use db1; 
select t1.col1, t1.col2 
  from tb1 t1
 where t1.col_date='2020-08-15' and t1.col3='Y' and t1.col4='val4' 
   and t1.col1 not in 
( select distinct t2.col1 from db2.tb2 t2 
   where t2.col_date='2020-08-15' and t2.col5='val5' and t2.col6='val6' and t2.col3='Y' and t2.col4='val4' ) " 

或者,如果您的 Hive 版本不支持 NOT IN 子查询,您可以使用 LEFT JOIN + filter

hive -e "set hive.execution.engine=mr;set hive.strict.checks.cartesian.product=false;
set hive.mapred.mode=nonstrict;
use db1; 
select t1.col1, t1.col2 
  from tb1 t1
       left join 
        ( select distinct t2.col1 from db2.tb2 t2 
           where t2.col_date='2020-08-15' 
             and t2.col5='val5' 
             and t2.col6='val6' 
             and t2.col3='Y' 
             and t2.col4='val4' 
        ) s on t1.col1 = s.col1 
 where t1.col_date='2020-08-15' and t1.col3='Y' and t1.col4='val4' 
   and s.col1 is null --filter out joined records
"

推荐阅读