首页 > 解决方案 > 使用上个月的首次用户 ID 加入唯一的用户 ID

问题描述

我想识别在 a 月(此处:1 月)发生“first_open”事件并在 b 月(此处:2 月)返回“user_engagement”事件的用户。

我的想法: 1. 创建一个包含所有具有“first_open”事件的用户的表 2. 创建一个包含所有具有“user_engagement”事件的用户的表 3. 加入两个表的 userID 4. 计算都具有“first_open”的用户" a 月和 b 月的事件,并从 1 月开始使用 "first_open" 事件计算所有用户

通过以下查询,我目前多算了 a 月和 b 月的用户,因为我没有计算这两种事件类型的所有 unqiue 用户。


    With
    users_first_open as (select 
    user_pseudo_id,
    EXTRACT (Month FROM(DATE(TIMESTAMP_MICROS(user_first_touch_timestamp)))) AS install_month,
    event_name as firstopen
    FROM
        `table.events_*`
    where _TABLE_SUFFIX BETWEEN '20190101'
        AND '20190108' and event_name = "first_open" and 
        EXTRACT (Month FROM(DATE(TIMESTAMP_MICROS(user_first_touch_timestamp)))) = 1
    ),

    user_enagement_next_month as (select 
    user_pseudo_id,
    EXTRACT (Month FROM(DATE(TIMESTAMP_MICROS(user_first_touch_timestamp)))) AS engagement_month,
    event_name as engagament_next_month
    FROM
        `table.events_*`
    where _TABLE_SUFFIX BETWEEN '20190109'
        AND '20190116' and event_name = "user_engagement"
        and EXTRACT (Month FROM(DATE(TIMESTAMP_MICROS(user_first_touch_timestamp)))) = 1), 

    cohort_raw as(
    select 
    user_pseudo_id,
    install_month,
    engagement_month, 
    case when firstopen = "first_open" then 1 else 0 end as cohort_count_first_open, 
    case when engagament_next_month = "user_engagement" then 1 else 0 end as cohort_count_engagement
    from 
    user_enagement_next_month
    full join 
    users_first_open using (user_pseudo_id))--, 


    select
    sum(case when cohort_count_first_open is not null then 1 else 0 end) as users_first_open,
    (select sum(case when cohort_count_engagement is not null then 1 else 0 end) as u_engagement_open from cohort_raw where cohort_count_first_open = 1) as users_engagement_open
    from cohort_raw

我接下来尝试的是:按用户 ID 等在表 2“user_enagement_next_month”中分组,并在结果时创建“first_open”案例和“参与”案例的总和。然后我包含查询以仅计算这两个计数为 = 2 的用户


    With
    users_first_open as (select 
    user_pseudo_id,
    EXTRACT (Month FROM(DATE(TIMESTAMP_MICROS(user_first_touch_timestamp)))) AS install_month,
    event_name as firstopen
    FROM
        `table.events_*`
    where _TABLE_SUFFIX BETWEEN '20190101'
        AND '20190131' and event_name = "first_open" and 
        EXTRACT (Month FROM(DATE(TIMESTAMP_MICROS(user_first_touch_timestamp)))) = 1
    ),

    user_enagement_next_month as (select 
    user_pseudo_id,
    EXTRACT (Month FROM(DATE(TIMESTAMP_MICROS(user_first_touch_timestamp)))) AS engagement_month,
    event_name as engagament_next_month
    FROM
        `table.events_*`
    where _TABLE_SUFFIX BETWEEN '20190201'
        AND '20190228' and event_name = "session_start"
        and EXTRACT (Month FROM(DATE(TIMESTAMP_MICROS(user_first_touch_timestamp)))) = 2
        group by 1,2,3)--,

    --cohort_raw as(
    select 
    user_pseudo_id,
    install_month,
    engagement_month, 
    case when firstopen = "first_open" then 1 else 0 end as cohort_count_first_open, 
    case when engagament_next_month = "session_start" then 1 else 0 end as cohort_count_engagement
    --case when user_pseudo_id is not null then 1 else 0 end as cohort_count_engagement
    from 
    user_enagement_next_month 
    full join 
    users_first_open using (user_pseudo_id)), 

    cohort_agg as (
    select *, cohort_count_first_open+cohort_count_engagement as cohort_sum
    from cohort_raw
    group by 1,2,3,4,5
    order by 6 desc)

    select
    (select count(*) from users_first_open) as cohort_jan,
    (select Sum(cohort_sum) from cohort_agg where cohort_sum = 2) as ret, 
    sum(case when cohort_count_first_open is not null then 1 else 0 end) as users_first_open,
    (select sum(case when cohort_count_engagement is not null then 1 else 0 end) as u_engagement_open from cohort_raw where cohort_count_first_open = 1) as users_engagement_open
    from cohort_agg 

我预计回报率在 20% 左右。我目前的输出是 54%,因为在我的查询中,我要么多计要么少计,因为我假设我的连接不起作用。

标签: firebasegoogle-bigquery

解决方案


也许我不清楚你想要什么,但试试这个

with

users_first_open as (
    select distinct  -- is there duplicates for one user_id?
        user_pseudo_id,

        extract(
            month from
            timestamp_micros(user_first_touch_timestamp)
        ) as install_month
    from
        `table.events_201901*`  -- longer prefixes generally perform better
    where
        _table_suffix between '01' and '31'
        and event_name = 'first_open'
        and extract(
                month from
                timestamp_micros(user_first_touch_timestamp)
            ) = 1
),

user_enagement_next_month as (
    select distinct
        user_pseudo_id,

        extract(
            month from
            timestamp_micros(user_first_touch_timestamp)
        ) as engagement_month
    from
        `table.events_201902*`  -- longer prefixes generally perform better
    where
        _table_suffix between '01' and '28'
        and event_name = 'user_engagement'
        and extract(
                month from
                timestamp_micros(user_first_touch_timestamp)
            ) = 2
)

select 
    ufo.install_month,
    uenm.engagement_month,
    count(*) as first_open_event_users_cnt,
    count(uenm.user_pseudo_id) as user_engagement_event_users_cnt
from 
    users_first_open as ufo
    left join user_enagement_next_month as uenm
        on ufo.user_pseudo_id = uenm.user_pseudo_id
group by
    1, 2


推荐阅读