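apache-spark - Spark-SQL error with an external Hive metastore
Problem description
The company I work for has an in-house metadata system that manages metadata (databases, tables, partitions, and so on) for data stored on S3. They recently added an API layer on top of the system so that it can be used as an external Hive metastore.
When I start a spark-sql session pointed at the service endpoint, I can see all the databases and tables. But when I query a table, I get the following error: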
spark-sql> select * from insertions limit 1;
2018-10-22 15:20:22 ERROR SparkSQLDriver:91 - Failed in [select * from insertions limit 1]
java.lang.AssertionError: assertion failed
at scala.Predef$.assert(Predef.scala:156)
at org.apache.spark.sql.catalyst.catalog.UnresolvedCatalogRelation.<init>(interface.scala:426)
at org.apache.spark.sql.catalyst.catalog.SessionCatalog.lookupRelation(SessionCatalog.scala:681)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.org$apache$spark$sql$catalyst$analysis$Analyzer$ResolveRelations$$lookupTableFromCatalog(Analyzer.scala:662)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.resolveRelation(Analyzer.scala:617)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$8.applyOrElse(Analyzer.scala:647)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$8.applyOrElse(Analyzer.scala:640)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:289)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:289)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:288)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:286)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:286)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:306)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:304)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:286)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:286)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:286)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:306)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:304)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:286)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:286)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:286)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:306)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187)
I suspect this is because some Hive Metastore API is not implemented.
Does anyone know what might be missing here?
Thanks
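A note for readers hitting the same trace: the top frames are an `assert` inside the constructor of `UnresolvedCatalogRelation`, called from `SessionCatalog.lookupRelation`. So the failure happens while Spark builds the relation from the table metadata it received, not while reading data from S3. As far as I can tell from the Spark 2.3 source, that assert requires the table identifier to carry a database name. This is an assumption, because the question does not state the Spark version. The Python sketch below only models that check. `TableIdentifier`, `database_is_defined`, `build_relation` and the database name `default` are hypothetical stand-ins for illustration, not Spark APIs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TableIdentifier:
    """Hypothetical stand-in for a catalog table identifier."""
    table: str
    database: Optional[str] = None  # None models metadata with no database name


def database_is_defined(identifier: TableIdentifier) -> bool:
    """Model of the assumed check: the identifier must name its database."""
    return identifier.database is not None


def build_relation(identifier: TableIdentifier) -> str:
    """Fail with a bare assertion, like the trace, when the database is missing."""
    assert database_is_defined(identifier), "assertion failed"
    return f"{identifier.database}.{identifier.table}"


if __name__ == "__main__":
    # A fully qualified identifier passes the check.
    print(build_relation(TableIdentifier("insertions", "default")))
    # An identifier without a database reproduces the assertion failure.
    try:
        build_relation(TableIdentifier("insertions"))
    except AssertionError as err:
        print(f"AssertionError: {err}")
```

If this is what is going on, one thing worth checking is whether the custom API's get-table response fills in the database name for each table (the `dbName` field of Hive's Thrift `Table` struct), rather than looking for a missing API call.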
Solution