apache-spark - Unable to build Spark 2.4.3 against Hadoop 3.2.0
Problem description
I am building Spark 2.4.3 to work with the latest Hadoop release, 3.2.0.
The source code was downloaded from https://www.apache.org/dyn/closer.lua/spark/spark-2.4.3/spark-2.4.3.tgz
The build command is ./build/mvn -Pyarn -Phadoop-3.2 -Dhadoop.version=3.2.0 -DskipTests clean package
The build output is:
[INFO] Spark Project Parent POM ........................... SUCCESS [ 1.761 s]
[INFO] Spark Project Tags ................................. SUCCESS [ 1.221 s]
[INFO] Spark Project Sketch ............................... SUCCESS [ 0.551 s]
[INFO] Spark Project Local DB ............................. SUCCESS [ 0.608 s]
[INFO] Spark Project Networking ........................... SUCCESS [ 1.558 s]
[INFO] Spark Project Shuffle Streaming Service ............ SUCCESS [ 0.631 s]
[INFO] Spark Project Unsafe ............................... SUCCESS [ 0.444 s]
[INFO] Spark Project Launcher ............................. SUCCESS [ 2.501 s]
[INFO] Spark Project Core ................................. SUCCESS [ 13.536 s]
[INFO] Spark Project ML Local Library ..................... SUCCESS [ 0.549 s]
[INFO] Spark Project GraphX ............................... SUCCESS [ 1.614 s]
[INFO] Spark Project Streaming ............................ SUCCESS [ 3.332 s]
[INFO] Spark Project Catalyst ............................. SUCCESS [ 14.271 s]
[INFO] Spark Project SQL .................................. SUCCESS [ 13.008 s]
[INFO] Spark Project ML Library ........................... SUCCESS [ 7.923 s]
[INFO] Spark Project Tools ................................ SUCCESS [ 0.187 s]
[INFO] Spark Project Hive ................................. SUCCESS [ 6.664 s]
[INFO] Spark Project REPL ................................. SUCCESS [ 1.285 s]
[INFO] Spark Project YARN Shuffle Service ................. SUCCESS [ 4.824 s]
[INFO] Spark Project YARN ................................. SUCCESS [ 3.020 s]
[INFO] Spark Project Assembly ............................. SUCCESS [ 1.558 s]
[INFO] Spark Integration for Kafka 0.10 ................... SUCCESS [ 1.411 s]
[INFO] Kafka 0.10+ Source for Structured Streaming ........ SUCCESS [ 1.573 s]
[INFO] Spark Project Examples ............................. SUCCESS [ 1.702 s]
[INFO] Spark Integration for Kafka 0.10 Assembly .......... SUCCESS [ 5.969 s]
[INFO] Spark Avro ......................................... SUCCESS [ 0.702 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:32 min
[INFO] Finished at: 2019-07-31T18:56:24+08:00
[INFO] ------------------------------------------------------------------------
[WARNING] The requested profile "hadoop-3.2" could not be activated because it does not exist.
I expected the build to produce a single bundled archive, spark-2.4.3-bin-hadoop3.2.tgz, under the build directory, just like the binary releases downloadable from the official site, e.g. https://www.apache.org/dyn/closer.lua/spark/spark-2.4.3/spark-2.4.3-bin-hadoop2.7.tgz.
How do I get rid of the warning The requested profile "hadoop-3.2" could not be activated because it does not exist, and what does it mean?
Solution
A word of caution: if you don't know what you are doing, what you are attempting can leave you with a very unstable environment.
That said, the stable Spark 2.4.x release line has no hadoop-3.2 profile; it only has hadoop-3.1. That is exactly what the Maven warning means: the profile name you passed with -P is not defined in the project's pom.xml, so Maven ignores it and builds against the default Hadoop profile instead.
To achieve what you want, you would need to pull the code from master, which does define a hadoop-3.2 profile. If your only goal is to make Spark 2.4.3 work with Hadoop 3.2, you can look at that profile and the related changes on master and cherry-pick them into your own workspace.
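As a concrete sketch (assuming the commands are run from the root of the unpacked spark-2.4.3 source tree), you can first check which Hadoop profiles the 2.4.3 pom actually defines, then build with the closest available profile. Also note that a plain mvn package never produces a -bin-*.tgz archive; the bundled distribution tarball comes from the dev/make-distribution.sh script. The --name value below is an arbitrary label, and overriding hadoop.version on top of the hadoop-3.1 profile is an untested combination:

```shell
# List the Hadoop-related profiles defined in the pom
# (on 2.4.x this should show hadoop-3.1 but no hadoop-3.2).
grep -o '<id>hadoop-[0-9.]*</id>' pom.xml | sort -u

# Build with the closest existing profile; forcing hadoop.version=3.2.0
# may compile, but it is not a combination the 2.4.x line was tested with.
./build/mvn -Pyarn -Phadoop-3.1 -Dhadoop.version=3.2.0 -DskipTests clean package

# To get a bundled spark-2.4.3-bin-<name>.tgz like the official downloads,
# use the distribution script rather than a bare mvn package:
./dev/make-distribution.sh --name hadoop3.1 --tgz -Pyarn -Phadoop-3.1
```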