How to parse milliseconds with the 'MM/dd/yyyy HH:mm:ss.SSS' format in Spark SQL

Problem Description

spark.sql("select case when trim(map('OPT OUT',1,'OPT IN',0,'',0)[coalesce(upper(program_1),'')]) == trim(num_fg) then trim(from_unixtime(unix_timestamp(upd_dt,'MM/dd/yyyy HH:mm:ss.SSS'), 'yyyy-MM-dd HH:mm:ss.sss')) else Now() end as upd_dt from input").show(false)

Input

val sample = Seq(("OPT OUT","1","07/21/2020 09:09:09.382")).toDF("program_1","num_fg", "upd_dt")

In the query above, the fractional-seconds part ('sss' in the output pattern) does not reproduce the milliseconds given in the input.

If the input is 07/21/2020 09:09:09.382, the query returns 07/21/2020 09:09:09.009, but the expected result is 07/21/2020 09:09:09.382 [whatever milliseconds we give in the input should appear in the output].
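The root cause: `unix_timestamp` returns whole seconds since the epoch (a long), so the millisecond component is discarded before `from_unixtime` ever runs. The `.009` is not a mangled millisecond value at all: in the legacy `SimpleDateFormat` patterns, lowercase `s` means second-of-minute, so `sss` just zero-pads the seconds value 9 to three digits. A plain-Scala sketch (using `java.time`, with UTC assumed for illustration; Spark actually uses the session time zone) mimics the truncation:

```scala
import java.time.{LocalDateTime, ZoneOffset}
import java.time.format.DateTimeFormatter

val in  = "07/21/2020 09:09:09.382"
val fmt = DateTimeFormatter.ofPattern("MM/dd/yyyy HH:mm:ss.SSS")
val parsed = LocalDateTime.parse(in, fmt)

// unix_timestamp keeps only whole seconds, so the fraction is lost here
val epochSeconds = parsed.toEpochSecond(ZoneOffset.UTC)

// from_unixtime then rebuilds a time that has no sub-second part at all
val rebuilt = LocalDateTime.ofEpochSecond(epochSeconds, 0, ZoneOffset.UTC)
println(rebuilt.getNano)  // 0 -- the .382 is gone
```

So no output pattern can bring the milliseconds back; they must be preserved during parsing, which is what the UDF in the solution does.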

Tags: sql, dataframe, apache-spark, apache-spark-sql

Solution


This will help:

import java.sql.Timestamp
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

// Parse with the millisecond pattern and keep the fraction in the Timestamp
def dateWithMilliSeconds(rawDate: String): Timestamp = {
  val format = DateTimeFormatter.ofPattern("MM/dd/yyyy HH:mm:ss.SSS")
  Timestamp.valueOf(LocalDateTime.parse(rawDate.trim, format))
}

spark.udf.register("dateWithMilliSeconds",dateWithMilliSeconds _)
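The helper can be sanity-checked outside Spark as well; a quick standalone sketch of the same function:

```scala
import java.sql.Timestamp
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

def dateWithMilliSeconds(rawDate: String): Timestamp = {
  val format = DateTimeFormatter.ofPattern("MM/dd/yyyy HH:mm:ss.SSS")
  Timestamp.valueOf(LocalDateTime.parse(rawDate.trim, format))
}

// trim handles stray whitespace, and the .382 survives the round trip
println(dateWithMilliSeconds(" 07/21/2020 09:09:09.382 "))
// 2020-07-21 09:09:09.382
```

Because the UDF returns a `java.sql.Timestamp`, Spark infers `TimestampType` for the column, so no output formatting step is needed.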

spark.sql("""select
             case when trim(map('OPT OUT',1,'OPT IN',0,'',0)[coalesce(upper(program_1),'')]) == trim(num_fg)
             then dateWithMilliSeconds(upd_dt)
             else now() end as upd_dt
           from input
           """).show(false)

OUTPUT:
+-----------------------+
|upd_dt                 |
+-----------------------+
|2020-07-21 09:09:09.382|
+-----------------------+
