Spark Scala: getting epoch milliseconds

Problem description

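For a unit test case, I am trying to create a DataFrame with a column that holds the current date-time as epoch milliseconds. Getting epoch seconds with the unix_timestamp function works fine, but when I multiply the epoch seconds by 1000 to get milliseconds, I get a seemingly random number, usually about 0.87 times the epoch seconds.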

import spark.implicits._
import org.apache.spark.sql.functions.{col, lit, unix_timestamp}
import java.time.ZonedDateTime
import java.time.temporal.ChronoUnit
import java.time.format.DateTimeFormatter

val testDataDate = ZonedDateTime.now()
val unixDateTimeFormatString = "yyyy-MM-dd HH:mm:ss"
val unixDateTimeFormat = DateTimeFormatter.ofPattern(unixDateTimeFormatString)
val currentDateTimeHourString = unixDateTimeFormat.format(testDataDate)

val DATE_TIME = "date_time"
val DATE_TIME_EPOCH_SECONDS = "date_time_epoch_seconds"
val DATE_TIME_EPOCH_MILISECONDS = "date_time_epoch_miliseconds"

val testDf = Seq(
  (currentDateTimeHourString)
  ).toDF(DATE_TIME)

val testDf2 = testDf
    .withColumn(DATE_TIME_EPOCH_SECONDS,(unix_timestamp(lit(currentDateTimeHourString))).cast("int"))
    .withColumn(DATE_TIME_EPOCH_MILISECONDS,col(DATE_TIME_EPOCH_SECONDS)*lit(1000))
    
testDf2.printSchema()
testDf2.show(1, false)

Result:

root
 |-- date_time: string (nullable = true)
 |-- date_time_epoch_seconds: integer (nullable = true)
 |-- date_time_epoch_miliseconds: integer (nullable = true)
+-------------------+-----------------------+---------------------------+
|date_time          |date_time_epoch_seconds|date_time_epoch_miliseconds|
+-------------------+-----------------------+---------------------------+
|2021-08-17 16:57:34|1629219454             |1426848816                 |
+-------------------+-----------------------+---------------------------+

Does anyone know how to get epoch milliseconds, or why the *lit(1000) multiplier does not work?

Tags: scala, apache-spark

Solution
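
The multiplier itself is fine; the column type is the likely culprit. date_time_epoch_seconds was cast to int, so the multiplication by lit(1000) is evaluated as 32-bit integer arithmetic, and 1629219454 * 1000 does not fit in 32 bits. When ANSI mode is off (which the output suggests was the case here), the product silently wraps modulo 2^32, and that gives exactly the 1426848816 shown in the result. A minimal plain-Scala sketch of the same arithmetic, with no Spark involved (the object name is only illustrative):

```scala
// Plain Scala: reproduce the wrapped value from the question's output.
object EpochOverflowDemo {
  // Value shown in the date_time_epoch_seconds column.
  val epochSeconds: Int = 1629219454

  // Int * Int stays a 32-bit Int and silently wraps past Int.MaxValue (2147483647).
  val wrappedMillis: Int = epochSeconds * 1000

  // Widening to Long before multiplying keeps the full 64-bit product.
  val correctMillis: Long = epochSeconds.toLong * 1000L

  def main(args: Array[String]): Unit = {
    println(wrappedMillis) // 1426848816, the "random" number in the question
    println(correctMillis) // 1629219454000
  }
}
```

The ratio 1426848816 / 1629219454 is roughly 0.876, which matches the "about 0.87 times" observation; the factor changes with the timestamp because it is only the remainder after wrapping.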

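One way to get epoch milliseconds is to keep the seconds column as a long. unix_timestamp already returns a bigint, so dropping the .cast("int") (or using .cast("long") instead) lets the multiplication run in 64 bits. A sketch under stated assumptions: it builds its own local SparkSession, the helper withEpochColumns is hypothetical, and the column names follow the question with the spelling of "milliseconds" corrected.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit, unix_timestamp}

object EpochMillisFix {
  // Hypothetical helper: expects a string column "date_time" in yyyy-MM-dd HH:mm:ss format.
  def withEpochColumns(df: DataFrame): DataFrame =
    df
      // unix_timestamp returns a long (bigint); it is deliberately not cast down to int.
      .withColumn("date_time_epoch_seconds", unix_timestamp(col("date_time")))
      // long column * long literal is computed as long, so the product does not wrap.
      .withColumn("date_time_epoch_milliseconds", col("date_time_epoch_seconds") * lit(1000L))

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; a test suite would reuse its existing session.
    val spark = SparkSession.builder().master("local[1]").appName("epoch-millis").getOrCreate()
    import spark.implicits._

    val result = withEpochColumns(Seq("2021-08-17 16:57:34").toDF("date_time"))
    result.printSchema()
    result.show(1, false)
    spark.stop()
  }
}
```

Because the input string only has second precision, the millisecond value produced this way is always a multiple of 1000.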
