scala - Spark read from HBase fails with java.lang.NoSuchMethodError: org.apache.hadoop.mapreduce.InputSplit.getLocationInfo
Question
I want to read from HBase through Spark using Scala, but I get this error:
Exception in thread "dag-scheduler-event-loop" java.lang.NoSuchMethodError: org.apache.hadoop.mapreduce.InputSplit.getLocationInfo()[Lorg/apache/hadoop/mapred/SplitLocationInfo;
I have already added the dependencies, and this problem has me stuck. My environment is:
- Scala: 2.11.12
- Spark: 2.3.1
- HBase: probably 2.1.0 (I am not sure)
- Hadoop: 2.7.2.4
My build.sbt is:
libraryDependencies ++= Seq(
"org.apache.spark" % "spark-core_2.11" % "2.3.1",
"org.apache.spark" % "spark-sql_2.11" % "2.3.1",
"org.apache.spark" % "spark-streaming_2.11" % "2.3.1",
"org.apache.spark" % "spark-streaming-kafka-0-10_2.11" % "2.3.1",
"org.apache.spark" % "spark-yarn_2.11" % "2.3.1",
"org.apache.hadoop" % "hadoop-core" % "2.6.0-mr1-cdh5.15.1",
"org.apache.hadoop" % "hadoop-common" % "2.7.2",
"org.apache.hadoop" % "hadoop-client" % "2.7.2",
"org.apache.hadoop" % "hadoop-mapred" % "0.22.0",
"org.apache.hadoop" % "hadoop-nfs" % "2.7.2",
"org.apache.hadoop" % "hadoop-hdfs" % "2.7.2",
"org.apache.hadoop" % "hadoop-hdfs-nfs" % "2.7.2",
"org.apache.hadoop" % "hadoop-mapreduce-client-core" % "2.7.2",
"org.apache.hadoop" % "hadoop-mapreduce" % "2.7.2",
"org.apache.hadoop" % "hadoop-mapreduce-client" % "2.7.2",
"org.apache.hadoop" % "hadoop-mapreduce-client-common" % "2.7.2",
"org.apache.hbase" % "hbase" % "2.1.0",
"org.apache.hbase" % "hbase-server" % "2.1.0",
"org.apache.hbase" % "hbase-common" % "2.1.0",
"org.apache.hbase" % "hbase-client" % "2.1.0",
"org.apache.hbase" % "hbase-protocol" % "2.1.0",
"org.apache.hbase" % "hbase-metrics" % "2.1.0",
"org.apache.hbase" % "hbase-metrics-api" % "2.1.0",
"org.apache.hbase" % "hbase-mapreduce" % "2.1.0",
"org.apache.hbase" % "hbase-zookeeper" % "2.1.0",
"org.apache.hbase" % "hbase-hadoop-compat" % "2.1.0",
"org.apache.hbase" % "hbase-hadoop2-compat" % "2.1.0",
"org.apache.hbase" % "hbase-spark" % "2.1.0-cdh6.1.0"
)
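I really don't know where I went wrong. If I added a wrong dependency, or need to add some new ones, please tell me where I can download them, for example: resolvers += "Apache HBase" at "https://repository.apache.org/content/repositories/releases"
Please help me, thanks!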
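Solution
You need to fix these versions so that they match the Hadoop version you are actually running; otherwise you can run into classpath and missing-method problems. Specifically, your error comes from the mapreduce packages, and these two artifacts are not at 2.7.2 like the rest of your Hadoop dependencies: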
"org.apache.hadoop" % "hadoop-core" % "2.6.0-mr1-cdh5.15.1",
"org.apache.hadoop" % "hadoop-mapred" % "0.22.0",
Spark itself already includes most of Hadoop, so it is unclear why you are specifying these yourself; at the very least, add % "provided" to some of them.
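To check which jar is actually supplying the class named in the stack trace, a small reflection probe can help. This is only a diagnostic sketch, assuming it is run in spark-shell (or a small main method) with the same classpath as the failing job:

```scala
// Diagnostic sketch: run with the same classpath as the failing job.
val cls = Class.forName("org.apache.hadoop.mapreduce.InputSplit")

// Which jar was the class loaded from? getCodeSource may be null, hence the Option.
val source = Option(cls.getProtectionDomain.getCodeSource).map(_.getLocation.toString)
println(s"InputSplit loaded from: ${source.getOrElse("<unknown>")}")

// Does this copy of the class expose getLocationInfo()?
val hasMethod = cls.getMethods.exists(_.getName == "getLocationInfo")
println(s"getLocationInfo present: $hasMethod")
```

If the printed location points at one of the mismatched artifacts and the method is reported as absent, that copy of the class is shadowing the one Spark expects.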
As for hbase-spark, I doubt you need the cdh6 dependency, because CDH 6 is based on Hadoop 3 libraries, not 2.7.2.
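A trimmed build.sbt along these lines might look as follows. It is a sketch, not a verified configuration. The versions are taken from the question. The set of HBase modules is an assumption about what a plain TableInputFormat read needs. hbase-spark is left out on the assumption that the connector API is not used.

```scala
// build.sbt (sketch; versions come from the question, adjust them to your cluster)
scalaVersion := "2.11.12"

val sparkVersion = "2.3.1"
val hbaseVersion = "2.1.0"

libraryDependencies ++= Seq(
  // Spark is supplied by the cluster at runtime and brings its own Hadoop client jars,
  // so it is marked "provided" and no Hadoop artifacts are listed explicitly.
  "org.apache.spark" %% "spark-core" % sparkVersion % "provided",
  "org.apache.spark" %% "spark-sql"  % sparkVersion % "provided",
  // HBase client-side modules (assumed sufficient for a TableInputFormat-based read)
  "org.apache.hbase" % "hbase-client"    % hbaseVersion,
  "org.apache.hbase" % "hbase-common"    % hbaseVersion,
  "org.apache.hbase" % "hbase-mapreduce" % hbaseVersion
)
```

Dropping hadoop-core and hadoop-mapred removes the two artifacts whose versions do not match Hadoop 2.7.2, which is the mismatch the answer points to.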