sql - Selecting random values for a grouped dataset
Problem description
I am not an SQL expert, but I am using the following query:
select count(*) as countis, avclassfamily
from malwarehashesandstrings
where behaviouralbinary IS true and
avclassfamily != 'SINGLETON'
group by avclassfamily
ORDER BY countis desc
LIMIT 50;
For each avclassfamily group, I want to select 3 random hashes from the malwarehashsha256 column.
The following query works, so the question is resolved:
select m.avclassfamily, m.cnt,
array_agg(malwarehashsha256)
from (select malwarehashesandstrings.*,
count(*) over (partition by avclassfamily) as cnt,
row_number() over (partition by avclassfamily order by random()) as seqnum
from malwarehashesandstrings
where behaviouralbinary and
avclassfamily <> 'SINGLETON'
) as m
where seqnum <= 3
group by m.avclassfamily, m.cnt ORDER BY m.cnt DESC LIMIT 50;
Solution
If I understand correctly, you can use row_number():
select m.*
from (select m.*,
row_number() over (partition by m.avclassfamily order by random()) as seqnum
from malwarehashesandstrings m
where m.behaviouralbinary and
m.avclassfamily <> 'SINGLETON'
) m
where seqnum <= 3;
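To try the row_number() idea without a PostgreSQL server, here is a minimal sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). The table and column names mirror the question, but the rows are invented for illustration, behaviouralbinary is stored as 0/1 because SQLite has no boolean type, and the helper names sample_per_family and make_demo_db are hypothetical.

```python
import sqlite3


def sample_per_family(conn, k=3):
    """Return up to k randomly chosen (family, hash) rows per avclassfamily."""
    sql = """
        select avclassfamily, malwarehashsha256
        from (select m.*,
                     row_number() over (partition by m.avclassfamily
                                        order by random()) as seqnum
              from malwarehashesandstrings m
              where m.behaviouralbinary
                and m.avclassfamily <> 'SINGLETON'
             ) m
        where seqnum <= ?
    """
    return conn.execute(sql, (k,)).fetchall()


def make_demo_db():
    """Build an in-memory table with made-up rows (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table malwarehashesandstrings ("
        "malwarehashsha256 text, avclassfamily text, behaviouralbinary integer)"
    )
    rows = [(f"hash_a{i}", "zeus", 1) for i in range(10)]   # large family
    rows += [(f"hash_b{i}", "emotet", 1) for i in range(2)]  # fewer than k rows
    rows += [("hash_s0", "SINGLETON", 1)]                    # excluded by name
    rows += [("hash_x0", "zeus", 0)]                         # excluded by flag
    conn.executemany("insert into malwarehashesandstrings values (?, ?, ?)", rows)
    return conn


if __name__ == "__main__":
    for family, sha in sample_per_family(make_demo_db()):
        print(family, sha)
```

Each run numbers the rows of a family in a fresh random order, so keeping seqnum <= 3 yields a different sample every time. A family with fewer than 3 qualifying rows simply returns all of them.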
If you want this alongside the columns of your existing query, one approach is:
select m.avclassfamily, m.cnt,
array_agg(m.malwarehashsha256)
from (select m.*,
count(*) over (partition by m.avclassfamily) as cnt,
row_number() over (partition by m.avclassfamily order by random()) as seqnum
from malwarehashesandstrings m
where m.behaviouralbinary and
m.avclassfamily <> 'SINGLETON'
) m
where seqnum <= 3
group by m.avclassfamily, m.cnt;
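The point worth checking in the combined query is that cnt still reports the full family size even though only 3 rows per family survive the outer filter, because the window count is computed inside the subquery before seqnum <= 3 is applied. The sketch below tests that with Python's sqlite3 module (SQLite 3.25+). It is an approximation rather than the PostgreSQL query itself: group_concat stands in for array_agg, the ORDER BY and LIMIT 50 are taken from the asker's own version, and the data and helper names (build_demo_db, family_summary) are made up.

```python
import sqlite3

QUERY = """
select m.avclassfamily, m.cnt,
       group_concat(m.malwarehashsha256) as hashes  -- stand-in for array_agg
from (select m.*,
             count(*) over (partition by m.avclassfamily) as cnt,
             row_number() over (partition by m.avclassfamily
                                order by random()) as seqnum
      from malwarehashesandstrings m
      where m.behaviouralbinary
        and m.avclassfamily <> 'SINGLETON'
     ) m
where seqnum <= 3
group by m.avclassfamily, m.cnt
order by m.cnt desc
limit 50
"""


def build_demo_db():
    """Build an in-memory table with made-up rows (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table malwarehashesandstrings ("
        "malwarehashsha256 text, avclassfamily text, behaviouralbinary integer)"
    )
    rows = [(f"hash_a{i}", "zeus", 1) for i in range(10)]
    rows += [(f"hash_b{i}", "emotet", 1) for i in range(2)]
    rows += [("hash_s0", "SINGLETON", 1), ("hash_x0", "zeus", 0)]
    conn.executemany("insert into malwarehashesandstrings values (?, ?, ?)", rows)
    return conn


def family_summary(conn):
    """Return (family, total_count, sampled_hashes) tuples, largest family first."""
    return [
        (family, cnt, hashes.split(","))  # group_concat joins with ',' by default
        for family, cnt, hashes in conn.execute(QUERY)
    ]


if __name__ == "__main__":
    for family, cnt, hashes in family_summary(build_demo_db()):
        print(family, cnt, hashes)
```

With the demo rows, the larger family reports its full count of 10 while listing only 3 sampled hashes, and the family with 2 rows lists both.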