Is there a specific order for the parameters used with spark-submit when submitting a job?

Problem description

I am trying to submit a Spark job using spark-submit as shown below:

    SPARK_MAJOR_VERSION=2 spark-submit --conf spark.ui.port=4090 \
        --driver-class-path /home/devusr/jars/greenplum-spark_2.11-1.3.0.jar \
        --jars /home/devusr/jars/greenplum-spark_2.11-1.3.0.jar \
        --executor-cores 3 --executor-memory 13G \
        --class com.partition.source.YearPartition splinter_2.11-0.1.jar \
        --master=yarn --keytab /home/devusr/devusr.keytab --principal devusr@DEV.COM \
        --files /usr/hdp/current/spark2-client/conf/hive-site.xml,testconnection.properties \
        --name Splinter \
        --conf spark.executor.extraClassPath=/home/devusr/jars/greenplum-spark_2.11-1.3.0.jar \
        --conf spark.executor.instances=10 --conf spark.dynamicAllocation.enabled=false \
        --conf spark.files.maxPartitionBytes=256M

However, the job does not run; it only prints:

SPARK_MAJOR_VERSION is set to 2, using Spark2 

Can anyone tell me whether the parameters used with spark-submit must follow a specific order?

Tags: scala, apache-spark

Solution


The format for spark-submit in yarn cluster mode is `$ ./bin/spark-submit --class path.to.your.Class --master yarn --deploy-mode cluster [options] <app jar> [app options]`, as documented at https://spark.apache.org/docs/2.1.0/running-on-yarn.html. Note that spark-submit stops parsing its own options at the application jar: everything after `<app jar>` is passed to your application as `[app options]`. In your command, splinter_2.11-0.1.jar appears before --master=yarn, so that flag and every option after it were handed to your application instead of being consumed by spark-submit.
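A minimal sketch demonstrating this rule (the ArgsDemo class and argsdemo.jar names are hypothetical):

    // Hypothetical minimal app illustrating the ordering rule: spark-submit
    // treats the first non-option argument as the application jar and delivers
    // everything after it to main(args) untouched, rather than interpreting it
    // as spark-submit flags.
    object ArgsDemo {
      def main(args: Array[String]): Unit = {
        // Submitted as:
        //   spark-submit --class ArgsDemo argsdemo.jar --master=yarn --name Splinter
        // this prints "--master=yarn", "--name" and "Splinter" -- none of them
        // ever reach spark-submit's own option parser.
        args.foreach(println)
      }
    }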

If splinter_2.11-0.1.jar is the jar that contains your class com.partition.source.YearPartition, you can try this:

spark-submit \
        --class com.partition.source.YearPartition                                              \
        --master=yarn                                                                           \
        --conf spark.ui.port=4090                                                               \
        --driver-class-path /home/devusr/jars/greenplum-spark_2.11-1.3.0.jar                    \
        --jars /home/devusr/jars/greenplum-spark_2.11-1.3.0.jar                                 \
        --executor-cores 3                                                                      \
        --executor-memory 13G                                                                   \
        --keytab /home/devusr/devusr.keytab                                                     \
        --principal devusr@DEV.COM                                                              \
        --files /usr/hdp/current/spark2-client/conf/hive-site.xml,testconnection.properties     \
        --name Splinter                                                                         \
        --conf spark.executor.extraClassPath=/home/devusr/jars/greenplum-spark_2.11-1.3.0.jar   \
        --conf spark.executor.instances=10                                                      \
        --conf spark.dynamicAllocation.enabled=false                                            \
        --conf spark.files.maxPartitionBytes=256M                                               \
        splinter_2.11-0.1.jar
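As a quick sanity check, running spark-submit with the --verbose flag makes it print the arguments and Spark properties it actually parsed, so any option accidentally placed after the application jar will show up among the application arguments instead of the spark-submit options.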
