首页 > 解决方案 > 如何更快地使用自联接进行查询?

问题描述

我最初尝试创建两个不同的查询,然后将它们合并到 R 中以获得累积时间图,但我试图在单个查询中获取我想要的信息。

原始代码:

users <- dbGetQuery(pool, "select id, name
                    from schema.table
                    where (name like '%t%' and name like '%2018%') or
                    (name like '%t%' and name like '%2017%')")
opts <- dbGetQuery(pool, "select id, name, ts
                    from schema.table
                    where name = 'qr_optin'")

all <- merge(users, opts, by = "id")

all <- all %>% 
  mutate(date =  as.Date(all$ts),
         name.x = gsub("t", "", name.x)) %>% 
  group_by(name.x, date) %>% 
  summarise(n = n()) 

输出如下:

name          date         n 
x          2018-09-09      12
x          2018-09-08      5
y          2018-09-08      4
xy         2018-09-06      8
xy         2018-09-04      9

我正在尝试通过至少加入两个查询来获取信息,但我只做到了这一点,而且速度非常慢。

select f1.id, f1.name, f2.ts
from schema.table f1
left join schema.table f2 on f2.id = f1.id
where f2.name = ' qr_optin' and
(f1.name like '%t%' and f1.name like '%2018%') or
(f1.name like '%t%' and f1.name like '%2017%')

标签: sqlrpostgresqlself-join

解决方案


只需在 Postgres 中运行纯 SQL 即可进行合并(即连接)或汇总(即聚合)

加入级别查询

select usrs.id, usrs.name, opts.ts
from schema.table as usrs
inner join rvv.fbm as opts 
        on opts.id = usrs.id and opts.name = 'qr_optin'
where (name like '%t%' and name like '%2018%') or
      (name like '%t%' and name like '%2017%')

聚合查询(使用 CTE)

with cte as 
  ( 
    select usrs.id, Replace(usrs.name, "t", "") as usr_name, opts.ts
    from schema.table as usrs
    inner join rvv.fbm as opts 
            on opts.id = usrs.id and opts.name = 'qr_optin'
    where (name like '%t%' and name like '%2018%') or
          (name like '%t%' and name like '%2017%')
  )

select cte.usr_name as name, cte.ts as date, count(*) as n
from cte
group by cte.name, cte.ts

将 R 中的任一查询传递给DBI::dbGetQuery调用。

all <- dbGetQuery(pool, "...myquery...")

推荐阅读