首页 > 解决方案 > 用户在网站上访问且从未返回的最后一页

问题描述

我试图找出用户访问的最后一页并且在特定时间内没有返回网站:

例如:

Date          visitstarttime    ldapid     page_1 page_2 Page_3 Page_4 
2018-08-01    1590805941       1000123         1      0       0     0
2018-07-30    1590200345       1000123         0      1       0     0
2018-07-20    1580100098       1000100         0      1       0     0
2018-07-18    1570000987       1000100         0      0       1     0
2018-07-12    1550200678       1000007         0      1       0     0
2018-07-09    1530287323       1000007         0      0       0     1

因为我试图仅查找访问页面并且在特定时间段内从未返回的用户。我期待输出如下:

Date          visitstarttime    ldapid     page_1 page_2 Page_3 Page_4 
2018-08-01    1590805941       1000123         1      0       0     0
2018-07-20    1580100098       1000100         0      1       0     0
2018-07-12    1550200678       1000007         0      1       0     0

由于我们不能在 GBQ 中同时使用 Aggregate 和 group by 函数,有没有办法解决它?

询问:

SELECT 
MAX(date) as Max_date,
Max(visitStartTime) as Max_Time,
ldapid

FROM
(

SELECT
CAST(CONCAT(SUBSTR(date,0,4), '-', SUBSTR(date,5,2),'-',SUBSTR(date,7,2)) AS date ) AS date,
        visitStartTime,
    --  fullVisitorId,
   -- hit.page.pagePath AS pagepath,
    (
            SELECT
            x.value
            FROM
            UNNEST(hit.customDimensions) x
            WHERE
            x.index = 9) 
        AS ldapid,
    (
    SELECT
    MAX(
    IF
    (page.pagePath LIKE '%/applicant-center/#interview/recommendations-and-references%', 
    1, 
    0))
    FROM 
    UNNEST(hits))
    AS Recommendation_References,
    (
    SELECT
    MAX(
    IF
    (page.pagePath LIKE '%/applicant-center/#interview/interview-sign-up%',
    1,
    0))
    FROM 
    UNNEST(hits))
    AS Interview_Sign_Up,
    (
    SELECT
    MAX(
    IF
    (page.pagePath LIKE '%/applicant-center/#interview/transcript-setup%',
    1,
    0))
    FROM 
    UNNEST(hits))
    AS Transcript_Setup,
    (
    SELECT
    MAX(
    IF
    (page.pagePath LIKE '%/applicant-center/#interview/transcript-upload%',
    1,
    0))
    FROM 
    UNNEST(hits))
    AS Transcript_Upload,
    (
    SELECT
    MAX(
    IF
    (page.pagePath LIKE '%/applicant-center/#interview/pre-interview-questions%',
    1,
    0))
    FROM
    UNNEST(hits))
    AS Pre_Interview_Questions,
    (
    SELECT
    MAX(
    IF
    (page.pagePath LIKE '%/applicant-center/#interview/interview-prep%',
    1,
    0))
    FROM 
    UNNEST(hits))
    AS Interview_Prep,
    (
    SELECT
    MAX(
    IF
    (page.pagePath LIKE '%/applicant-center/#interview/interview-day-details%',
    1,
    0))
    FROM
    UNNEST(hits))
    AS Interview_day_details
    FROM
    `tfa-big-query.74006564.ga_sessions_*`,
    UNNEST(hits) AS hit
    WHERE
  _TABLE_SUFFIX BETWEEN '20190801' AND '20200529'

  GROUP BY
    date,
        visitStartTime,
        --fullVisitorId,
   -- pagepath,
    ldapid,
    Recommendation_References,
    Interview_Sign_Up,
    Transcript_Setup,
    Transcript_Upload,
    Pre_Interview_Questions,
    Interview_Prep,
    Interview_day_details
    ORDER BY visitStartTime DESC, date DESC

  )
  WHERE 
 ( Recommendation_References >= 1
  OR
  Interview_Sign_Up >= 1
  OR 
  Transcript_Setup >= 1
  OR
  Transcript_Upload>= 1
  OR
  Pre_Interview_Questions >= 1
  OR
  Interview_Prep >= 1
  OR
  Interview_day_details >= 1) and ldapid IS NOT NULL

  GROUP BY
  Max_date,
  Max_Time,
  ldapid

由于我们不能在 GBQ 中同时使用 Aggregate 和 group by 函数,有没有办法解决它?

标签: sqlgoogle-analyticsgoogle-bigquery

解决方案


我认为您的问题与问题中的查询之间没有关系。但根据您的问题和示例数据,请使用lead()

select t.*
from (select t.*,
             lead(visitstarttime) over (partition by ldapid order by visitstarttime) as next_visitstarttime
      from t
     ) t
where next_visitstarttime is null or next_visitstarttime > visitstarttime + <whatever>

推荐阅读