apache-spark - How to get the driver ID from spark-submit
Question
Spark cluster information:
- Spark version: 2.2.0
- The cluster consists of one master node and 2 worker nodes
- Cluster manager type: standalone
I submit a jar to the Spark cluster from one of the workers, and I want to get the driver ID back from the submission so that I can use it later to check the application's status. The problem is that I get no output in the console. I submit on port 6066 with the deploy mode set to cluster, by running
spark-submit --deploy-mode cluster --supervise --class "path/to/class" --master "spark://spark-master-headless:6066" path/to/app.jar
In the Spark log file I can see the JSON response for the submission below, which is exactly what I want:
[INFO] 2018-07-18 12:48:40,030 org.apache.spark.deploy.rest.RestSubmissionClient logInfo - Submitting a request to launch an application in spark://spark-master-headless:6066.
[INFO] 2018-07-18 12:48:41,074 org.apache.spark.deploy.rest.RestSubmissionClient logInfo - Submission successfully created as driver-20180718124840-0023. Polling submission state...
[INFO] 2018-07-18 12:48:41,077 org.apache.spark.deploy.rest.RestSubmissionClient logInfo - Submitting a request for the status of submission driver-20180718124840-0023 in spark://spark-master-headless:6066.
[INFO] 2018-07-18 12:48:41,092 org.apache.spark.deploy.rest.RestSubmissionClient logInfo - State of driver driver-20180718124840-0023 is now RUNNING.
[INFO] 2018-07-18 12:48:41,093 org.apache.spark.deploy.rest.RestSubmissionClient logInfo - Driver is running on worker worker-20180707104934-<some-ip-was-here>-7078 at <some-ip-was-here>:7078.
[INFO] 2018-07-18 12:48:41,114 org.apache.spark.deploy.rest.RestSubmissionClient logInfo - Server responded with CreateSubmissionResponse:
{
"action" : "CreateSubmissionResponse",
"message" : "Driver successfully submitted as driver-20180718124840-0023",
"serverSparkVersion" : "2.2.0",
"submissionId" : "driver-20180718124840-0023",
"success" : true
}
[INFO] 2018-07-18 12:48:46,572 org.apache.spark.executor.CoarseGrainedExecutorBackend initDaemon - Started daemon with process name: 31605@spark-worker-662224983-4qpfw
[INFO] 2018-07-18 12:48:46,580 org.apache.spark.util.SignalUtils logInfo - Registered signal handler for TERM
[INFO] 2018-07-18 12:48:46,583 org.apache.spark.util.SignalUtils logInfo - Registered signal handler for HUP
[INFO] 2018-07-18 12:48:46,583 org.apache.spark.util.SignalUtils logInfo - Registered signal handler for INT
[WARN] 2018-07-18 12:48:47,293 org.apache.hadoop.util.NativeCodeLoader <clinit> - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[INFO] 2018-07-18 12:48:47,607 org.apache.spark.SecurityManager logInfo - Changing view acls to: root
[INFO] 2018-07-18 12:48:47,608 org.apache.spark.SecurityManager logInfo - Changing modify acls to: root
...
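Since the submission ID does show up in the Spark log file, one workaround is to pull it out of the log with a regular expression. A minimal sketch (the function name and the idea of reading the rolling log file are my own; the ID format is taken from the log lines above):

```python
import re

# Matches IDs like "driver-20180718124840-0023" in the
# RestSubmissionClient log line shown above.
DRIVER_ID_RE = re.compile(
    r"Submission successfully created as (driver-\d{14}-\d{4})"
)

def extract_driver_id(log_text):
    """Return the first submission/driver ID found in the log text, or None."""
    match = DRIVER_ID_RE.search(log_text)
    return match.group(1) if match else None

# Example against the log line shown above:
line = ("[INFO] 2018-07-18 12:48:41,074 org.apache.spark.deploy.rest."
        "RestSubmissionClient logInfo - Submission successfully created as "
        "driver-20180718124840-0023. Polling submission state...")
print(extract_driver_id(line))  # driver-20180718124840-0023
```

This only parses text that is already in the log, so it works regardless of why console output is suppressed.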
But I want this information in the console, so that I can redirect it to a separate file instead of the Spark log. I assumed some messages would be printed when running the above command. I even added the --verbose flag to the command in the hope that it would help, but the console output is still empty. The only thing printed to the console is
Running Spark using the REST application submission protocol.
whereas the user in the question section of this page sees much more output.
I even tried changing the logger level in my application code, but that did not help either (based on some ideas from here).
So the question is: why do I get no output in the console, and what can I do to get the information I want printed there? P.S. I have developed and tweaked the cluster and the jar file quite a bit, so maybe something I changed is preventing the output from being printed. What are the possible places I can check to fix this?
Update:
I found out that Spark's default log4j.properties has been edited. Here are the contents:
# Set everything to be logged to the console
log4j.rootCategory=INFO, RollingAppender
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
log4j.appender.RollingAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.RollingAppender.File=/var/log/spark.log
log4j.appender.RollingAppender.DatePattern='.'yyyy-MM-dd
log4j.appender.RollingAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.RollingAppender.layout.ConversionPattern=[%p] %d %c %M - %m%n
# Set the default spark-shell log level to WARN. When running the spark-shell, the
# log level for this class is used to overwrite the root logger's log level, so that
# the user can have different defaults for the shell and regular Spark apps.
log4j.logger.org.apache.spark.repl.Main=INFO
# Settings to quiet third party logs that are too verbose
log4j.logger.org.spark_project.jetty=INFO
log4j.logger.org.spark_project.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
log4j.logger.org.apache.parquet=ERROR
log4j.logger.parquet=ERROR
# SPARK-9183: Settings to avoid annoying messages when looking up nonexistent UDFs in SparkSQL with Hive support
log4j.logger.org.apache.hadoop.hive.metastore.RetryingHMSHandler=FATAL
log4j.logger.org.apache.hadoop.hive.ql.exec.FunctionRegistry=ERROR
I suspect this is why the --verbose flag produces nothing. How can I change it to get some output from --verbose?
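For reference, note that in the properties file above the console appender is defined but never attached to the root logger: rootCategory lists only RollingAppender, so everything goes to /var/log/spark.log and nothing reaches stderr. If console output is wanted alongside the rolling file, a minimal change (a sketch of the standard log4j 1.x syntax, not necessarily the exact fix for this cluster) would be:

# Log to both the console and the rolling file
log4j.rootCategory=INFO, console, RollingAppender

With that change, the --verbose output of spark-submit should appear on the console again and can be redirected like any other process output.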
Solution
When you run a job in cluster mode, the driver can be launched on any node in the cluster, so any printing/console redirection you do may not come back to the client/edge/worker node where your console is open.
Try submitting the application in client mode.
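Independently of console logging, the CreateSubmissionResponse shown in the question comes from the standalone master's REST endpoint on port 6066, and that endpoint can also be polled for driver status once the ID is known. A minimal sketch (the hostname is the one from the question; the helper names are mine, and the /v1/submissions/status/<id> path is the one RestSubmissionClient itself talks to, so verify it against your Spark version):

```python
import json
from urllib.request import urlopen

def parse_status(response_text):
    """Extract the driver state from a SubmissionStatusResponse JSON body."""
    body = json.loads(response_text)
    return body.get("driverState")

def poll_driver_state(master_rest_url, driver_id):
    """Query the standalone master's REST API for a driver's current state.

    master_rest_url is e.g. "http://spark-master-headless:6066"
    (hypothetical host from the question).
    """
    url = "{}/v1/submissions/status/{}".format(master_rest_url, driver_id)
    with urlopen(url) as resp:
        return parse_status(resp.read().decode("utf-8"))
```

For example, a response body of `{"action": "SubmissionStatusResponse", "driverState": "RUNNING", "success": true}` parses to the state "RUNNING", matching the "State of driver ... is now RUNNING" log line in the question.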