Unable to start the Spark History Server on a Windows 10 machine

Problem description

I am trying to start the Spark History Server from a PowerShell terminal (in my SPARK_HOME/sbin) with:

.\start-history-server.sh 

A Windows terminal window opens with the following messages and then closes.

ps: unknown option -- o
Try `ps --help' for more information.
starting org.apache.spark.deploy.history.HistoryServer, logging to C:\Spark/logs/spark--org.apache.spark.deploy.history.HistoryServer-1-<my-machine>.out
ps: unknown option -- o
Try `ps --help' for more information.
ps: unknown option -- o
Try `ps --help' for more information.
ps: unknown option -- o
Try `ps --help' for more information.
ps: unknown option -- o
Try `ps --help' for more information.

This is the output written to spark--org.apache.spark.deploy.history.HistoryServer-1-<my-machine>.out in C:\Spark\logs:

Spark Command: C:\Program Files (x86)\Java\jre1.8.0_161\bin\java -cp C:\Spark/conf\;C:\Spark\jars\* -Xmx1g org.apache.spark.deploy.history.HistoryServer C:\Spark\logs
========================================
"C:\Program Files (x86)\Java\jre1.8.0_161\bin\java" -cp "C:\Spark/conf\;C:\Spark\jars\*" -Xmx1g org.apache.spark.deploy.history.HistoryServer C:\Spark\logs 
C:\Spark/bin/spark-class: line 96: CMD: bad array subscript

What I have already tried:

I updated 'spark-defaults.conf' as follows:

spark.eventLog.enabled           true
spark.eventLog.dir               file:///C:\Spark\logs
spark.history.fs.logDirectory    file:///C:\Spark\logs

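Following the discussion in the related question "Unable to start the Spark history server", I also tried running the following command (from SPARK_HOME/sbin):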

spark-class org.apache.spark.deploy.history.HistoryServer

But it fails with the FileNotFoundException shown below. (This is odd, because it somehow looks for C:Sparklogs instead of C:\Spark\logs.)

PS C:\Spark\sbin> spark-class org.apache.spark.deploy.history.HistoryServer
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
20/08/26 12:18:03 INFO HistoryServer: Started daemon with process name: 24364@<my-machine>
20/08/26 12:18:03 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/08/26 12:18:03 INFO SecurityManager: Changing view acls to: <USER>
20/08/26 12:18:03 INFO SecurityManager: Changing modify acls to: <USER>
20/08/26 12:18:03 INFO SecurityManager: Changing view acls groups to:
20/08/26 12:18:03 INFO SecurityManager: Changing modify acls groups to:
20/08/26 12:18:03 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(<USER>); groups with view permissions: Set(); users  with modify permissions: Set(<USER>); groups with modify permissions: Set()
20/08/26 12:18:04 INFO FsHistoryProvider: History server ui acls disabled; users with admin permissions: ; groups with admin permissions
20/08/26 12:18:05 INFO Utils: Successfully started service on port 18080.
20/08/26 12:18:05 INFO HistoryServer: Bound HistoryServer to 0.0.0.0, and started at http://my-machine:18080
Exception in thread "main" java.io.FileNotFoundException: Log directory specified does not exist: file:///C:Sparklogs
        at org.apache.spark.deploy.history.FsHistoryProvider.startPolling(FsHistoryProvider.scala:279)
        at org.apache.spark.deploy.history.FsHistoryProvider.initialize(FsHistoryProvider.scala:227)
        at org.apache.spark.deploy.history.FsHistoryProvider.start(FsHistoryProvider.scala:409)
        at org.apache.spark.deploy.history.HistoryServer$.main(HistoryServer.scala:303)
        at org.apache.spark.deploy.history.HistoryServer.main(HistoryServer.scala)
Caused by: java.io.FileNotFoundException: File file:/C:Sparklogs does not exist
        at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:611)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:824)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:601)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:428)
        at org.apache.spark.deploy.history.FsHistoryProvider.startPolling(FsHistoryProvider.scala:269)
    

Can anyone suggest what I could try to resolve this and get the Spark History Server started?

Thank you.

Tags: windows, powershell, apache-spark

Solution


Update: the following worked

  1. Changed both "spark.eventLog.dir" and "spark.history.fs.logDirectory" to "file:///C:/Spark/eventlog" (forward slashes instead of backslashes).
  2. Ran spark-class org.apache.spark.deploy.history.HistoryServer from SPARK_HOME/sbin.
  3. The History Server web UI is now reachable at http://localhost:18080
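
A likely explanation for why the forward slashes matter, assuming Spark reads spark-defaults.conf with Java's standard properties parser: in that format a backslash is an escape character, so `\S` and `\l` collapse to `S` and `l`, which would turn `C:\Spark\logs` into the `C:Sparklogs` seen in the exception. The sketch below only demonstrates the parsing behaviour of `java.util.Properties`; the class name `PropertiesBackslashDemo` and the helper `parse` are made up for illustration and are not part of Spark.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class PropertiesBackslashDemo {

    // Hypothetical helper: parse a single properties-style line and
    // return the value that ends up stored under the given key.
    static String parse(String line, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(line));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        String key = "spark.history.fs.logDirectory";

        // The Java literal below is the text: file:///C:\Spark\logs
        // The parser treats each backslash as an escape and drops it.
        System.out.println(parse(key + "    file:///C:\\Spark\\logs", key));
        // prints file:///C:Sparklogs

        // Forward slashes are ordinary characters and survive unchanged.
        System.out.println(parse(key + "    file:///C:/Spark/eventlog", key));
        // prints file:///C:/Spark/eventlog
    }
}
```

If backslashes are preferred, doubling them (`C:\\Spark\\logs`) should also survive this parser, though the forward-slash form from the steps above is the one confirmed to work here.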
