Spark History - Log timestamps have wrong time zone

Problem description

When I submit a job on a set of machines located in the London time zone, the Spark Master dashboard shows the correct time, but the history server dashboard shows a time that is 1 hour ahead, which is GMT. Is there a way to fix this in Apache Spark?

Tags: apache-spark, pyspark, jvm, apache-spark-sql

Solution


Your log timestamps most likely do not have a "wrong" time zone. More likely, your Spark cluster itself runs in GMT, or the conf is set to:

spark.conf.set("spark.sql.session.timeZone", "GMT")

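Explicitly change this line to the London time zone. A region ID such as "Europe/London" covers both GMT and BST, whereas a bare abbreviation like BST can be ambiguous.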

Alternatively, use the from_utc_timestamp function, which lets you specify the target time zone when converting a timestamp.
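To make the conversion concrete without needing a cluster, here is a minimal plain-Python sketch of what from_utc_timestamp is documented to do: treat a zone-less timestamp as UTC and return the wall-clock time in the requested zone. The helper name from_utc_sketch is hypothetical and this is not Spark code; it assumes Python 3.9+ with the system time zone database available.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+, relies on the system tz database


def from_utc_sketch(naive_ts: datetime, zone_id: str) -> datetime:
    """Treat a zone-less timestamp as UTC and return wall-clock time in zone_id.

    Hypothetical helper that mirrors the documented behaviour of Spark's
    from_utc_timestamp; it is an illustration, not Spark's implementation.
    """
    as_utc = naive_ts.replace(tzinfo=timezone.utc)   # interpret the input as UTC
    local = as_utc.astimezone(ZoneInfo(zone_id))     # shift to the target zone
    return local.replace(tzinfo=None)                # drop the zone again


summer = datetime(2023, 7, 1, 12, 0, 0)   # London observes BST on this date
winter = datetime(2023, 1, 15, 12, 0, 0)  # London observes GMT on this date

print(from_utc_sketch(summer, "Europe/London"))  # shifted by one hour
print(from_utc_sketch(winter, "Europe/London"))  # unchanged
print(from_utc_sketch(summer, "GMT"))            # unchanged, GMT has no DST
```

The summer case is the one-hour gap described in the question: the same instant reads differently depending on whether it is rendered as GMT or as London local time.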

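Also check whether your timestamps are in milliseconds rather than seconds, and set -Duser.timezone (for example -Duser.timezone=Europe/London) in the JVM configuration spark.executor.extraJavaOptions.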
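The milliseconds check can be sketched in plain Python as below. The threshold and the helper name epoch_to_utc are assumptions made for illustration, not anything Spark provides; the idea is only that a millisecond value read as seconds lands tens of thousands of years in the future, so a magnitude check catches the mix-up.

```python
from datetime import datetime, timezone

# Assumed cutoff: values at or above 1e11 are treated as milliseconds.
# 1e11 seconds is more than three thousand years after 1970, so realistic
# second-based timestamps stay below it. Millisecond values from before
# early 1973 would be misclassified by this heuristic.
MILLIS_THRESHOLD = 100_000_000_000


def epoch_to_utc(value: int) -> datetime:
    """Convert an epoch value in seconds or milliseconds to an aware UTC datetime."""
    seconds = value / 1000 if abs(value) >= MILLIS_THRESHOLD else value
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# The same instant expressed in seconds and in milliseconds.
print(epoch_to_utc(1688212800))
print(epoch_to_utc(1688212800000))
```

Both calls resolve to the same instant, so if your values only look right after dividing by 1000, the unit rather than the time zone is the problem.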

