首页 > 解决方案 > 在 Snowflake 中获取相当于 MONTHS_BETWEEN() 的 Impala

问题描述

将查询从 Impala 迁移到 Snowflake 时遇到问题:

黑斑羚

SELECT period
    , now() as dt_today
    , MONTHS_BETWEEN(now(), period) as mb
FROM my_table

产量

period                  dt_today                        mb
--------------------------------------------------------------------------
2018-10-30T21:43:57Z    2020-02-21 10:21:12.827383000   15.709677419354838

雪花

SELECT period
    , CURRENT_TIMESTAMP() as dt_today
    , DATEDIFF('month', CURRENT_TIMESTAMP(), period) as mb
FROM my_table

产量

period                  dt_today                        mb
--------------------------------------------------------------------------
2018-10-30T21:43:57Z    2020-02-21 10:21:12.827383000   16

现在,从 Snowflake文档中我了解到,在指定monthDATEDIFF,Snowflake 只会“使用从日期开始的月份和年份”,这意味着差异不如 Impala 准确。 我已经尝试实现一个代理,比如取月差,然后应用一些计算来获得浮点部分,但我仍然得到错误的月数:

DATEDIFF('month', period, CURRENT_TIMESTAMP()) + (GREATEST(DAY(period), DAY(CURRENT_TIMESTAMP())) -  LEAST(DAY(period), DAY(CURRENT_TIMESTAMP()))) / 31

以及以下内容以获得更精确的信息,但仍然不正确:

DATEDIFF('day', period, CURRENT_TIMESTAMP())/31 + (GREATEST(DAY(period), DAY(CURRENT_TIMESTAMP())) -  LEAST(DAY(period), DAY(CURRENT_TIMESTAMP()))) / 31

问题:我怎样才能MONTHS_BETWEEN()在 Snowflake 中准确复制 Impala?

标签: datediffsnowflake-cloud-data-platform

解决方案


TL;博士

IFF(DAY(DATE1) >= DAY(DATE2), DATEDIFF('month', DATE2, DATE1), DATEDIFF('month', DATE2, DATE1) - 1)
+
IFF(DAY(DATE1) >= DAY(DATE2), (GREATEST(DAY(DATE1), DAY(DATE2)) - LEAST(DAY(DATE1), day(DATE2))) / 31, 1 - (GREATEST(DAY(DATE1), DAY(DATE2)) - LEAST(DAY(DATE1), DAY(DATE2))) / 31)

ImpalaMONTHS_BETWEEN(DATE1, DATE2)函数的工作原理如下:

  • MONTHS_BETWEEN('2019-04-13', '2019-02-10')收益率2.09672整月 + 3/31=0967
  • MONTHS_BETWEEN('2019-04-13', '2019-02-03')收益率1.77411整月 + 1-(7/31)=0967

现在我们知道 SnowflakeDATEDIFF(DATE1, DATE3)应用了一个简单的逐月操作:

  • DATEDIFF('month', '2019-04-13', '2019-02-10')产量2( 04 - 02)
  • DATEDIFF('month', '2019-04-13', '2019-02-03')产量2( 04 - 02)

为了MONTHS_BETWEEN使用 Snowflake 函数获得 Impala 的整数部分,我们应用以下逻辑:

IFF(DAY(DATE1) >= DAY(DATE2), DATEDIFF('month', DATE2, DATE1), DATEDIFF('month', DATE2, DATE1) - 1)

为了MONTHS_BETWEEN使用 Snowflake 函数获得 Impala 的小数部分,我们应用以下逻辑:

IFF(DAY(DATE1) >= DAY(DATE2), (GREATEST(DAY(DATE1), DAY(DATE2)) - LEAST(DAY(DATE1), day(DATE2))) / 31, 1 - (GREATEST(DAY(DATE1), DAY(DATE2)) - LEAST(DAY(DATE1), DAY(DATE2))) / 31)

我们只需将它们加在一起即可得到 Impala 的确切值:

IFF(DAY(DATE1) >= DAY(DATE2), DATEDIFF('month', DATE2, DATE1), DATEDIFF('month', DATE2, DATE1) - 1)
+
IFF(DAY(DATE1) >= DAY(DATE2), (GREATEST(DAY(DATE1), DAY(DATE2)) - LEAST(DAY(DATE1), day(DATE2))) / 31, 1 - (GREATEST(DAY(DATE1), DAY(DATE2)) - LEAST(DAY(DATE1), DAY(DATE2))) / 31)

推荐阅读