sql - 自联接和效率与子查询
问题描述
假设我们有一个名为 的表 actions(date, uid, pid, action, description)
。该表的示例如下所示:
Table: actions
date uid pid action description
'2018-10-19' 1234 12 'view'
'2018-10-19' 1234 12 'report' 'SPAM'
'2018-10-19' 5678 23 'reaction' 'LOVE'
有一个表也叫 reviewers(date, rid, pid)
. 审阅者是删除帖子的人。审阅者不是用户。该表的示例如下所示:
Table: reviewers
date rid pid
'2018-10-19' 567 12
'2018-10-19' 890 45
用户查看(采取任何行动)的日常内容中有多少实际上是垃圾邮件?
以下工作是否可行:
案例一:“看”是指任何动作
select u.date, count(distinct r.pid)/count(distinct uu.pid))*100
from actions u join actions uu
on u.pid = uu.pid
inner join reviewers r
on u.pid = r.pid
where u.description = 'SPAM'
group by 1
案例 2:“看”意味着 action = 'view'
select u.date, count(distinct r.pid)/count(distinct uu.pid))*100
from actions u join actions uu
on u.pid = uu.pid
inner join reviewers r
on u.pid = r.pid
where u.description = 'SPAM'
and uu.action = 'VIEW'
group by 1
解决方案
我不确定为什么reviewers
需要考虑,或者是否可以在该表中重复 pid,但我认为这将满足您的需求(样本中的 50.0%)
select
count(distinct (case when description = 'SPAM' and r.pid IS NOT NULL then pid end)) * 100.0
/
count(distinct pid)
from actions a
left join (
select distinct pid from reviewers
) r on r.pid = a.pid
;
推荐阅读
- css - Block not behaving like
- php - 选项 Preflight 导致 POST 请求挂起
- html - 创建“查询参数”时,可以将其指定给容器标签,而不是单独指定给每个“a href”标签
- docker - Problem when running the initial test without freeradius using docker and alpine linux
- scala - 如何检测未跟踪的未来?
- python - Load dictionary items into spaCy NLP
- scala - 在 Scala 中,Inner case 类如何始终覆盖一个方法?
- c# - Web Application download from Azure Storage Blobs to Computer
- c# - Problem Mac autofit EPPLUS / Unity3d installed libgdiplus via Homebrew with no effect
- webpack - Does Webpack provide a global error hook when a chunk load fails?