How to align the year and week number of a date

Problem description

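When trying to get the year and week number for a series of dates spanning multiple years, I ran into some problems around the start and end of the year.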

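I understand how the weeknumber logic and the year logic each work on their own. When they are combined, however, they give inconsistent results in some cases. I would like to know the best way in Spark to handle these scenarios so that the year is consistent with the given week number.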

For example, running:

spark.sql("select year('2017-01-01') as year, weekofyear('2017-01-01') as weeknumber").show(false)

outputs:

+----+----------+
|year|weeknumber|
+----+----------+
|2017|52        |
+----+----------+

but the desired output is:

+----+----------+
|year|weeknumber|
+----+----------+
|2016|52        |
+----+----------+

And running:

spark.sql("select year('2018-12-31') as year, weekofyear('2018-12-31') as weeknumber").show(false)

produces:

+----+----------+
|year|weeknumber|
+----+----------+
|2018|1         |
+----+----------+

but the expected output is:

+----+----------+
|year|weeknumber|
+----+----------+
|2019|1         |
+----+----------+

The code runs on Spark 2.4.2.

Tags: scala, apache-spark, apache-spark-sql

Solution


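This Spark behavior is consistent with the ISO 8601 definition: weeks start on Monday, and week 1 is the first week with more than three days in the new year, so the first or last few days of a calendar year can fall into a week that belongs to the adjacent year. You cannot change it, but there is a workaround I can think of.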
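Before the workaround, here is a quick check of the ISO 8601 rule itself outside Spark. This is a minimal sketch using the JDK's java.time API. The object name IsoWeekYear and the helper isoYearAndWeek are made up for illustration and are not part of Spark or of the original answer.

```scala
import java.time.LocalDate
import java.time.temporal.IsoFields

object IsoWeekYear {
  // Hypothetical helper: returns (ISO week-based year, ISO week number) for a date.
  def isoYearAndWeek(date: LocalDate): (Int, Int) =
    (date.get(IsoFields.WEEK_BASED_YEAR), date.get(IsoFields.WEEK_OF_WEEK_BASED_YEAR))

  def main(args: Array[String]): Unit = {
    // The two dates from the question.
    Seq("2017-01-01", "2018-12-31").foreach { s =>
      val d = LocalDate.parse(s)
      val (isoYear, isoWeek) = isoYearAndWeek(d)
      println(s"$s -> calendar year ${d.getYear}, ISO week-based year $isoYear, week $isoWeek")
    }
  }
}
```

For the two dates in the question this pairs week 52 with 2016 and week 1 with 2019, which is the alignment the question asks for. The week number Spark returns is the ISO one. Only year() reports the calendar year.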

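First determine dayOfWeek (1 = Monday through 7 = Sunday). If it is less than 4, add one to the year; if it equals 4, keep the year unchanged; otherwise subtract one from the year. Note that this rule only holds for dates at a year boundary whose week number belongs to the adjacent year, as in the two examples from the question. Applied to any other date it shifts the year incorrectly.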

Example: 2017-01-01

sql("select case when date_format('2017-01-01', 'u') < 4 then year('2017-01-01')+1 when date_format('2017-01-01', 'u') = 4 then year('2017-01-01') else year('2017-01-01')- 1 end as year, weekofyear('2017-01-01') as weeknumber, date_format('2017-01-01', 'u') as dayOfWeek").show(false)

+----+----------+---------+
|year|weeknumber|dayOfWeek|
+----+----------+---------+
|2016|52        |7        |
+----+----------+---------+

Example: 2018-12-31

sql("select case when date_format('2018-12-31', 'u') < 4 then year('2018-12-31')+1 when date_format('2018-12-31', 'u') = 4 then year('2018-12-31') else year('2018-12-31')- 1 end as year, weekofyear('2018-12-31') as weeknumber, date_format('2018-12-31', 'u') as dayOfWeek").show(false)

+----+----------+---------+
|year|weeknumber|dayOfWeek|
+----+----------+---------+
|2019|1         |1        |
+----+----------+---------+
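
The queries above adjust the year for every date, so they are only safe on boundary dates like the two shown. For a column of arbitrary dates, one possible variant guards the adjustment with the month and the week number instead of the day of week. This is a sketch rather than the original answer's method. The object name AlignedWeekYear, the helper weekYearExpr, and the temp view name dates are assumptions for illustration. It uses only the built-in functions year, month, and weekofyear, and assumes a local SparkSession is available.

```scala
import org.apache.spark.sql.SparkSession

object AlignedWeekYear {
  // Hypothetical helper: builds a SQL expression for the year that matches
  // weekofyear(d). `d` is a date column name or a quoted date literal.
  def weekYearExpr(d: String): String =
    s"""case
       |  when month($d) = 1  and weekofyear($d) >= 52 then year($d) - 1
       |  when month($d) = 12 and weekofyear($d) = 1   then year($d) + 1
       |  else year($d)
       |end""".stripMargin

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("aligned-week-year")
      .getOrCreate()

    // Two boundary dates from the question plus two dates that need no adjustment.
    spark.sql(
      "select explode(array('2017-01-01', '2018-01-01', '2018-06-15', '2018-12-31')) as d"
    ).createOrReplaceTempView("dates")

    spark.sql(
      s"select d, ${weekYearExpr("d")} as year, weekofyear(d) as weeknumber from dates"
    ).show(false)

    spark.stop()
  }
}
```

The guard relies on the fact that, under ISO 8601, a January date can only have week 52 or 53 when it belongs to the previous year's last week, and a December date can only have week 1 when it belongs to the next year's first week. Dates such as 2018-01-01 (a Monday in week 1 of 2018) are therefore left unchanged, while 2017-01-01 and 2018-12-31 should come out as 2016 and 2019.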
