How can I adjust this query to produce a result that shows the average on a month-by-month basis over time?

Problem description

I'm having a hard time producing the desired result with one of my queries.

I'd like to be able to display the average revenue generated per user on a rolling month by month basis, based on the following criteria:

The below query is what I have now, which pulls the average revenue per user for the January cohort:

WITH bookings as (SELECT u.id as user_id, count(*) as bookings_last_90, sum(total)/100 as revenue_last_90
            FROM revenue r
            JOIN users u on r.user_id = u.id
            WHERE (CAST(r.created_at AS date) BETWEEN CAST((NOW() + INTERVAL '-90 day') AS date)
                AND CAST(now() AS date))
            GROUP BY u.id 
            HAVING COUNT(*) >= 20)
SELECT avg(b.revenue_last_90)
FROM bookings b;

I essentially need to adapt the above query to pull the average revenue per cohort user on a rolling month-by-month basis, keeping intact the past-90-day timeframe used to define the cohort.

Tags: sql, postgresql

Solution


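When you have a query that works against a single timestamp, the general approach is:

  1. Generate the list of dates or timestamps you want to use, in a table, view, CTE, etc.
  2. Join to that list of timestamps
  3. Replace the timestamp you were using with the timestamp from the list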
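
For step 1, one option is to build the list with PostgreSQL's generate_series instead of deriving it from the data. The sketch below is only an illustration: the 12-month range is an assumption, not something stated in the question. Unlike a list derived from revenue.created_at, it also yields months that have no revenue rows.

```sql
-- Hypothetical calendar: one row per month start for the last 12 months.
-- The 12-month span is an assumption; adjust it to the range you need.
SELECT m
FROM generate_series(
         date_trunc('month', now()) - interval '11 months',  -- first month start
         date_trunc('month', now()),                         -- current month start
         interval '1 month'                                  -- step
     ) AS g(m);
```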

Without the schema I can't test this, but the result might look something like:

WITH --first generate list of dates from the created_at field in revenue
month_list as (select date_trunc('month' , r.created_at) as m from revenue r group by 1 )

--then use that in the bookings query
, bookings as (SELECT u.id as user_id, m.m as cohort_month, count(*) as bookings_last_90, sum(total)/100 as revenue_last_90
            FROM revenue r
            JOIN users u on r.user_id = u.id
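            -- window: from 60 days before the month start up to (not including) the next month start, roughly 90 days
            JOIN month_list m ON r.created_at >= m.m - interval '60 day'
                             AND r.created_at <  m.m + interval '1 month'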
            GROUP BY u.id , m.m
            HAVING COUNT(*) >= 20)

--finally, use the date in the result query
SELECT avg(b.revenue_last_90), cohort_month
FROM bookings b group by cohort_month;
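
If the cohort window needs to be exactly 90 days rather than roughly 90, a variant along the following lines may work. It is an untested sketch under several assumptions: that "rolling" means a 90-day window ending at the end of each calendar month, that revenue.created_at is a timestamp column, and that total is stored in hundredths of the currency unit (the original query divides by 100). The alias avg_revenue_per_user is made up for illustration.

```sql
WITH month_list AS (
    -- one row per calendar month that has any revenue
    SELECT DISTINCT date_trunc('month', created_at) AS m
    FROM revenue
),
bookings AS (
    SELECT u.id         AS user_id,
           ml.m         AS cohort_month,
           count(*)     AS bookings_last_90,
           -- 100.0 avoids integer division if total is an integer column (assumption)
           sum(r.total) / 100.0 AS revenue_last_90
    FROM month_list ml
    JOIN revenue r
      -- half-open window: the 90 days ending at the start of the following month
      ON r.created_at >= ml.m + interval '1 month' - interval '90 days'
     AND r.created_at <  ml.m + interval '1 month'
    JOIN users u ON r.user_id = u.id
    GROUP BY u.id, ml.m
    HAVING count(*) >= 20          -- same cohort threshold as the original query
)
SELECT cohort_month,
       avg(revenue_last_90) AS avg_revenue_per_user
FROM bookings
GROUP BY cohort_month
ORDER BY cohort_month;
```

Because each revenue row can fall into the window of up to three or four consecutive months, it is counted once per month it qualifies for. That is what makes the result rolling rather than a plain monthly aggregate.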
