sql - 如何更快地使用自联接进行查询?
问题描述
我最初尝试创建两个不同的查询,然后将它们合并到 R 中以获得累积时间图,但我试图在单个查询中获取我想要的信息。
原始代码:
users <- dbGetQuery(pool, "select id, name
from schema.table
where (name like '%t%' and name like '%2018%') or
(name like '%t%' and name like '%2017%')")
opts <- dbGetQuery(pool, "select id, name, ts
from schema.table
where name = 'qr_optin'")
all <- merge(users, opts, by = "id")
all <- all %>%
mutate(date = as.Date(all$ts),
name.x = gsub("t", "", name.x)) %>%
group_by(name.x, date) %>%
summarise(n = n())
输出如下:
name date n
x 2018-09-09 12
x 2018-09-08 5
y 2018-09-08 4
xy 2018-09-06 8
xy 2018-09-04 9
我正在尝试通过至少加入两个查询来获取信息,但我只做到了这一点,而且速度非常慢。
select f1.id, f1.name, f2.ts
from schema.table f1
left join schema.table f2 on f2.id = f1.id
where f2.name = ' qr_optin' and
(f1.name like '%t%' and f1.name like '%2018%') or
(f1.name like '%t%' and f1.name like '%2017%')
解决方案
只需在 Postgres 中运行纯 SQL 即可进行合并(即连接)或汇总(即聚合)
加入级别查询
select usrs.id, usrs.name, opts.ts
from schema.table as usrs
inner join rvv.fbm as opts
on opts.id = usrs.id and opts.name = 'qr_optin'
where (name like '%t%' and name like '%2018%') or
(name like '%t%' and name like '%2017%')
聚合查询(使用 CTE)
with cte as
(
select usrs.id, Replace(usrs.name, "t", "") as usr_name, opts.ts
from schema.table as usrs
inner join rvv.fbm as opts
on opts.id = usrs.id and opts.name = 'qr_optin'
where (name like '%t%' and name like '%2018%') or
(name like '%t%' and name like '%2017%')
)
select cte.usr_name as name, cte.ts as date, count(*) as n
from cte
group by cte.name, cte.ts
将 R 中的任一查询传递给DBI::dbGetQuery
调用。
all <- dbGetQuery(pool, "...myquery...")
推荐阅读
- python - How to make dot.product of 2D and 3D matrices (taking it for each dimension separately)
- python - Merge specific column data from multiple csv files
- openssl - 使用 OpenSSL 使用 QCStatements 创建证书
- reactjs - Mock Bing Maps API with Jest in React.js
- c++ - printf函数中的点是什么意思
- linux - How to make system() function unblocking?
- scala - How to print the content of a collection sequence
- sass - SCSS:父添加类时修改子属性
- python - Use Excel sheet to create dictionary in order to replace values
- pandas - How to put a range of columns into one column when reading a csv with pandas?