HBase-Spark connector issue in Cloudera: java.lang.AbstractMethodError

Problem Description

I am trying to write a Spark DataFrame to HBase, but any action on the DataFrame, and likewise its write/save methods, fails with the following exception:

    java.lang.AbstractMethodError
        at org.apache.spark.Logging$class.log(Logging.scala:50)
        at org.apache.spark.sql.execution.datasources.hbase.HBaseFilter$.log(HBaseFilter.scala:121)
        at org.apache.spark.sql.execution.datasources.hbase.HBaseFilter$.buildFilters(HBaseFilter.scala:124)
        at org.apache.spark.sql.execution.datasources.hbase.HBaseTableScanRDD.getPartitions(HBaseTableScan.scala:60)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)

Here is my code:

    import org.apache.spark.sql.{DataFrame, SQLContext}
    import org.apache.spark.sql.execution.datasources.hbase._
    import org.apache.spark.{SparkConf, SparkContext}

    // HBase table catalog: maps DataFrame columns to HBase column families/qualifiers
    def catalog = s"""{
         |"table":{"namespace":"default", "name":"Contacts"},
         |"rowkey":"key",
         |"columns":{
         |"rowkey":{"cf":"rowkey", "col":"key", "type":"string"},
         |"officeAddress":{"cf":"Office", "col":"Address", "type":"string"},
         |"officePhone":{"cf":"Office", "col":"Phone", "type":"string"},
         |"personalName":{"cf":"Personal", "col":"Name", "type":"string"},
         |"personalPhone":{"cf":"Personal", "col":"Phone", "type":"string"}
         |}
         |}""".stripMargin

    def withCatalog(cat: String): DataFrame = {
      // sqlContext is the SQLContext provided by the Spark 1.6 shell
      sqlContext
        .read
        .options(Map(HBaseTableCatalog.tableCatalog -> cat))
        .format("org.apache.spark.sql.execution.datasources.hbase")
        .load()
    }

    val df = withCatalog(catalog)

I was able to create the DataFrame, but as soon as I perform

    df.show()

it gives me the error:

    java.lang.AbstractMethodError
        at org.apache.spark.Logging$class.log(Logging.scala:50)
        at org.apache.spark.sql.execution.datasources.hbase.HBaseFilter$.log(HBaseFilter.scala:121)
        at org.apache.spark.sql.execution.datasources.hbase.HBaseFilter$.buildFilters(HBaseFilter.scala:124)
        at org.apache.spark.sql.execution.datasources.hbase.HBaseTableScanRDD.getPartitions(HBaseTableScan.scala:60)

Please advise. I am importing a table from HBase, creating the catalog, and building a DataFrame on top of it, using Spark 1.6 and HBase 1.2.0-cdh5.13.3 (Cloudera).

Tags: apache-spark-sql

Solution


I ran into the same problem; I was using hbase-spark 1.2.0-cdh5.8.4. An AbstractMethodError like this one generally means the connector jar was compiled against a different Spark binary than the one on the runtime classpath: the connector classes mix in Spark's org.apache.spark.Logging trait, and when the compiled form of that trait differs between the build-time and runtime Spark, the trait's log method is missing at runtime.

I recompiled the connector against 1.2.0-cdh5.13.0, and after that the error no longer occurred. You should try recompiling the source against your cluster's CDH release, or use a newer version of the connector.
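Before rebuilding anything, it can help to confirm the mismatch: check which Spark the driver actually runs and where the conflicting Logging class is loaded from. A minimal diagnostic sketch for the Spark 1.6 shell (uses only standard JDK reflection, no connector API):

    // Runtime Spark version -- should match the cluster's CDH parcel
    println(sc.version)

    // Jar that actually provides org.apache.spark.Logging; if a connector jar
    // bundles its own copy of Spark classes, this points at the wrong jar
    println(Class.forName("org.apache.spark.Logging")
      .getProtectionDomain.getCodeSource.getLocation)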

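If you build with sbt, recompiling amounts to pinning the connector and the Spark artifacts to the same CDH release and resolving them from the Cloudera repository. A sketch under the assumption that the cluster runs CDH 5.13.3 (adjust the version strings to your parcel, and verify the exact artifact coordinates against the Cloudera repo):

    // build.sbt -- sketch: align hbase-spark with the cluster's Spark/CDH build
    scalaVersion := "2.10.6"  // Spark 1.6 on CDH 5 is built against Scala 2.10

    resolvers += "cloudera" at "https://repository.cloudera.com/artifactory/cloudera-repos/"

    libraryDependencies ++= Seq(
      // "provided": the cluster supplies Spark at runtime; compile against the same build
      "org.apache.spark" %% "spark-core" % "1.6.0-cdh5.13.3" % "provided",
      "org.apache.spark" %% "spark-sql"  % "1.6.0-cdh5.13.3" % "provided",
      // connector from the same CDH release line as the cluster's HBase
      "org.apache.hbase" % "hbase-spark" % "1.2.0-cdh5.13.3"
    )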

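Once the versions line up, the write path the question started from goes through the same connector code as the read. Here is a sketch of the save call reusing the catalog above; HBaseTableCatalog.newTable (the SHC-style option telling the connector to create the table with the given number of regions) is an assumption to verify against your connector build:

    // Sketch: write the DataFrame back to HBase through the same connector
    df.write
      .options(Map(
        HBaseTableCatalog.tableCatalog -> catalog,
        HBaseTableCatalog.newTable     -> "5"))  // create table with 5 regions if absent
      .format("org.apache.spark.sql.execution.datasources.hbase")
      .save()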