apache-spark - How do I submit a job to YARN on another cluster?
Question
I have a Docker container with Spark installed, and I'm trying to use Marathon to submit a job to YARN on a different cluster. The container exports values for the YARN and Hadoop conf dirs, and the yarn-site.xml file contains the correct address of the EMR master IP, but I'm not sure where it's picking up localhost from:
ENV YARN_CONF_DIR="/opt/yarn-site.xml"
ENV HADOOP_CONF_DIR="/opt/spark-2.2.0-bin-hadoop2.6"
yarn-site.xml:
<property>
<name>yarn.resourcemanager.hostname</name>
<value>xx.xxx.x.xx</value>
</property>
Command:
"cmd": "/opt/spark-2.2.0-bin-hadoop2.6/bin/spark-submit --verbose \\\n --name emr_external_mpv_streaming \\\n --deploy-mode client \\\n --master yarn\\\n --conf spark.executor.instances=4 \\\n --conf spark.executor.cores=1 \\\n --conf spark.executor.memory=1g \\\n --conf spark.driver.memory=1g \\\n --conf spark.cores.max=4 \\\n --conf spark.executorEnv.EXT_WH_HOST=$EXT_WH_HOST \\\n --conf spark.executorEnv.EXT_WH_PASSWORD=$EXT_WH_PASSWORD \\\n --conf spark.executorEnv.KAFKA_BROKER_LIST=$_KAFKA_BROKER_LIST \\\n --conf spark.executorEnv.SCHEMA_REGISTRY_URL=$SCHEMA_REGISTRY_URL \\\n --conf spark.executorEnv.AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID \\\n --conf spark.executorEnv.AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY \\\n --conf spark.executorEnv.STAGING_S3_BUCKET=$STAGING_S3_BUCKET \\\n --conf spark.executorEnv.KAFKA_GROUP_ID=$KAFKA_GROUP_ID \\\n --conf spark.executorEnv.MAX_RATE=$MAX_RATE \\\n --conf spark.executorEnv.KAFKA_MAX_POLL_MS=$KAFKA_MAX_POLL_MS \\\n --conf spark.executorEnv.KAFKA_MAX_POLL_RECORDS=$KAFKA_MAX_POLL_RECORDS \\\n --class com.ticketnetwork.edwstream.external.MapPageView \\\n /opt/edw-stream-external-mpv_2.11-2-SNAPSHOT.jar",
I also tried specifying --deploy-mode cluster \\n --master yarn\\n -- same error.
Error:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
18/09/10 20:41:24 INFO SparkContext: Running Spark version 2.2.0
18/09/10 20:41:25 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/09/10 20:41:25 INFO SparkContext: Submitted application: edw-stream-ext-mpv-emr-prod
18/09/10 20:41:25 INFO SecurityManager: Changing view acls to: root
18/09/10 20:41:25 INFO SecurityManager: Changing modify acls to: root
18/09/10 20:41:25 INFO SecurityManager: Changing view acls groups to:
18/09/10 20:41:25 INFO SecurityManager: Changing modify acls groups to:
18/09/10 20:41:25 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
18/09/10 20:41:25 INFO Utils: Successfully started service 'sparkDriver' on port 35868.
18/09/10 20:41:25 INFO SparkEnv: Registering MapOutputTracker
18/09/10 20:41:25 INFO SparkEnv: Registering BlockManagerMaster
18/09/10 20:41:25 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
18/09/10 20:41:25 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
18/09/10 20:41:25 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-5526b967-2be9-44bf-a86f-79ef72f2ac0f
18/09/10 20:41:25 INFO MemoryStore: MemoryStore started with capacity 366.3 MB
18/09/10 20:41:26 INFO SparkEnv: Registering OutputCommitCoordinator
18/09/10 20:41:26 INFO Utils: Successfully started service 'SparkUI' on port 4040.
18/09/10 20:41:26 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://10.150.4.45:4040
18/09/10 20:41:26 INFO SparkContext: Added JAR file:/opt/edw-stream-external-mpv_2.11-2-SNAPSHOT.jar at spark://10.150.4.45:35868/jars/edw-stream-external-mpv_2.11-2-SNAPSHOT.jar with timestamp 1536612086416
18/09/10 20:41:26 INFO RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
18/09/10 20:41:27 INFO Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
18/09/10 20:41:28 INFO Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
18/09/10 20:41:29 INFO Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
Solution
0.0.0.0 is the default value of the ResourceManager hostname property, and 8032 is the default port.
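These defaults come from Hadoop's yarn-default.xml, which applies whenever no yarn-site.xml is found on the client's classpath:

```xml
<!-- Defaults from yarn-default.xml: with no overriding yarn-site.xml visible,
     the client connects to ${yarn.resourcemanager.hostname}:8032, i.e. 0.0.0.0:8032 -->
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>0.0.0.0</value>
</property>
<property>
  <name>yarn.resourcemanager.address</name>
  <value>${yarn.resourcemanager.hostname}:8032</value>
</property>
```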
One reason you're getting the defaults is that the Hadoop environment variables are not set correctly. HADOOP_CONF_DIR needs to be the conf folder of Spark (or Hadoop), not the base folder that Spark was extracted into. That directory must contain core-site.xml, yarn-site.xml, hdfs-site.xml, and hive-site.xml if you're using a HiveContext.
If yarn-site.xml is in that location, you then don't need YARN_CONF_DIR at all; but if you do set it, it must be an actual directory, not a path pointing directly at a file.
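As a concrete sketch of the fix (the /opt/hadoop-conf path is a hypothetical placeholder, not from the question), the Dockerfile lines could be corrected to point at a directory that holds the Hadoop XML files, rather than at a single file or the Spark base folder:

```dockerfile
# HADOOP_CONF_DIR must be a directory containing core-site.xml, yarn-site.xml,
# hdfs-site.xml (and hive-site.xml if HiveContext is used).
# /opt/hadoop-conf is an assumed path -- adjust to where the files actually live.
ENV HADOOP_CONF_DIR=/opt/hadoop-conf

# YARN_CONF_DIR becomes redundant once yarn-site.xml is inside HADOOP_CONF_DIR;
# if you do set it, it must also be a directory, never a file:
# ENV YARN_CONF_DIR=/opt/hadoop-conf
```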
Also, you may need to set more than one hostname. For example, a production-grade YARN cluster will have two ResourceManagers for fault tolerance. You may additionally need to set Kerberos keytabs and principals if Kerberos is enabled.
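For reference, a two-ResourceManager HA setup is configured with per-RM hostnames in yarn-site.xml; a minimal sketch (the rm1/rm2 IDs and emr-master-* hostnames are placeholders):

```xml
<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm1</name>
  <value>emr-master-1</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm2</name>
  <value>emr-master-2</value>
</property>
```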
That said, if you already have Mesos/Marathon, I'm not sure why you would want YARN.