Configuring Spark and Hadoop Separately (Which Versions of Hadoop and Spark to Use)

Problem description

I am trying to configure Spark 2.4.4 with Hadoop 3.1.2. I have successfully installed hadoop-3.1.2.tar.gz and spark-2.4.4-bin-without-hadoop.tgz, and I have created the conf/spark-env.sh file:

export JAVA_HOME=/opt/jdk8u202-b08
export HADOOP_HOME=/home/hadoop/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export SPARK_HOME=/usr/local/spark
export SPARK_DIST_CLASSPATH=$HADOOP_HOME/etc/hadoop
export SPARK_DIST_CLASSPATH=$($HADOOP_HOME/bin/hadoop classpath)

But when I start spark-shell, I get the following:

2019-11-27 11:53:07,051 WARN util.Utils: Your hostname, xxxxxxx resolves to a loopback address: 127.0.1.1; using 172.20.20.145 instead (on interface wlp2s0)
2019-11-27 11:53:07,052 WARN util.Utils: Set SPARK_LOCAL_IP if you need to bind to another address
2019-11-27 11:53:07,327 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://ashish-mittal:4040
Spark context available as 'sc' (master = local[*], app id = local-1574835792826).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.4.4
      /_/

Using Scala version 2.11.12 (OpenJDK 64-Bit Server VM, Java 1.8.0_202)
Type in expressions to have them evaluated.
Type :help for more information.

scala> 

How can I check which version of Hadoop is being used with Spark?

Tags: apache-spark, hadoop

Solution


Spark uses HADOOP_HOME and builds its classpath from there, so whichever Hadoop version you downloaded and pointed HADOOP_HOME at is the version it will use.
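
To verify this from a running spark-shell, one quick check (a sketch; VersionInfo ships with Hadoop itself) is to ask the Hadoop classes that Spark actually loaded for their version, and for the jar they were loaded from:

// Hadoop version compiled into the classes on Spark's classpath
org.apache.hadoop.util.VersionInfo.getVersion()   // should report 3.1.2 if the classpath came from the 3.1.2 install

// Location of the jar the Hadoop classes were loaded from; with SPARK_DIST_CLASSPATH
// set via `hadoop classpath`, this should resolve to a path under HADOOP_HOME
classOf[org.apache.hadoop.conf.Configuration].getProtectionDomain.getCodeSource.getLocation

If the printed location points somewhere other than your HADOOP_HOME, the SPARK_DIST_CLASSPATH setting in spark-env.sh is not being picked up.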

Note that, as of Spark 2.4.4, Hadoop 3 is not yet fully supported.

