Spark throws AnalysisException (org.apache.hadoop.hive.ql.metadata.HiveException: Invalid partition for table XXX) when querying data from Hive

Problem description

venv/lib/python3.6/site-packages/pyspark/sql/dataframe.py", line 438, in collect
[2021-07-25 19:02:38,600] INFO - Job:     port = self._jdf.collectToPython()
[2021-07-25 19:02:38,600] INFO - Job:   File "venv/lib/python3.6/site-packages/py4j/java_gateway.py", line 1143, in __call__
[2021-07-25 19:02:38,600] INFO - Job:     answer, self.gateway_client, self.target_id, self.name)
[2021-07-25 19:02:38,600] INFO - Job:   File "venv/lib/python3.6/site-packages/pyspark/sql/utils.py", line 69, in deco
[2021-07-25 19:02:38,600] INFO - Job:     raise AnalysisException(s.split(': ', 1)[1], stackTrace)
[2021-07-25 19:02:38,600] INFO - Job: pyspark.sql.utils.AnalysisException: 'org.apache.hadoop.hive.ql.metadata.HiveException: Invalid partition for table digest_scheduler;'

In any case, the error goes away if the same job is re-run after a while. But I would like to know what causes this problem in the first place.
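
Since the question itself only notes that a retry fixes the problem, the sketch below is a minimal PySpark retry wrapper around that observation, not a confirmed fix. It additionally calls spark.catalog.refreshTable between attempts on the assumption that stale cached partition metadata is involved; the query text, the partition column dt, and the retry parameters are all hypothetical.

import time

from pyspark.sql import SparkSession
from pyspark.sql.utils import AnalysisException

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

def collect_with_retry(query, table, retries=3, delay_s=60):
    # Retry the query when the Hive metastore reports an invalid partition,
    # refreshing Spark's cached metadata for the table between attempts.
    for attempt in range(1, retries + 1):
        try:
            return spark.sql(query).collect()
        except AnalysisException as e:
            if "Invalid partition for table" not in str(e) or attempt == retries:
                raise
            # Assumption: the failure is caused by stale partition metadata,
            # so refresh the table before waiting and retrying.
            spark.catalog.refreshTable(table)
            time.sleep(delay_s)

rows = collect_with_retry(
    "SELECT * FROM digest_scheduler WHERE dt = '2021-07-25'",  # hypothetical query
    "digest_scheduler",
)

If the refresh alone makes the retry succeed immediately, that would point at metastore/catalog cache staleness rather than a genuine bad partition.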

Tags: apache-spark, pyspark, hive

Solution

