首页 > 解决方案 > postgres查询效率问题及变量声明

问题描述

我对sql不是很熟悉。因此,我创建了一个查询,该查询生成包含特定信息的数据条目与在某一天创建的所有条目相比的百分比值。完成大约需要 20 秒,我认为这不正常,所以我认为我做得很糟糕。

另外,如何将日期“2021-06-29”声明为变量?

SELECT Round((Cast(Count(scanid) as numeric)-(Select Cast(Count(scanid) as numeric) From public.scan where scantimestamp > '2021-06-28' and scantimestamp < '2021-06-29' 
and (code isnull or length(CAST(code AS TEXT)) < 9)))*100/Cast(Count(scanid) as numeric), 2) as all,
(SELECT Round((Select Cast(Count(scanid) as numeric) From public.scan where scantimestamp > '2021-06-28' and scantimestamp < '2021-06-29' 
and length(CAST(code AS TEXT)) = 12)*100/Cast(Count(scanid) as numeric), 2)) as bbb,
(SELECT Round((Select Cast(Count(scanid) as numeric) From public.scan where scantimestamp > '2021-06-28' and scantimestamp < '2021-06-29'
and length(CAST(barcode AS TEXT)) = 9)*100/Cast(Count(scanid) as numeric), 2)) as ccc,
(SELECT Round((Select Cast(Count(scanid) as numeric) From public.scan where scantimestamp > '2021-06-28' and scantimestamp < '2021-06-29' 
and length(CAST(barcode AS TEXT)) > 12)*100/Cast(Count(scanid) as numeric), 2)) as ddd
FROM public.scan where scantimestamp > '2021-06-28' and scantimestamp < '2021-06-29'

标签: sqlpostgresql

解决方案


您的 SQL 相当混乱,但如果我正确理解了您的问题,那么您正在寻找的是根据code特定日期的字符串长度计算扫描百分比。如果是这样,我重新创建了您的示例


create table scan (code varchar, scanid int, scantimestamp timestamp);
insert into scan values ('1234567891',1,'2021-06-28 10:00:00');
insert into scan values ('12345678912345',1,'2021-06-28 09:00:00');
insert into scan values ('1234567891234561',2,'2021-06-28 15:00:00');
insert into scan values ('1234567',4,'2021-06-28 19:00:00');
insert into scan values ('123456789',3,'2021-06-28 19:00:00');
insert into scan values ('12345678',3,'2021-06-29 17:00:00');
insert into scan values ('12345679999999',5,'2021-06-30 20:00:00');

这是一个很好的数据集,以下查询还计算了code列的长度

select *,
length(code)
from public.scan 
order by scantimestamp;

结果

       code       | scanid |    scantimestamp    | length 
------------------+--------+---------------------+--------
 12345678912345   |      1 | 2021-06-28 09:00:00 |     14
 1234567891       |      1 | 2021-06-28 10:00:00 |     10
 1234567891234561 |      2 | 2021-06-28 15:00:00 |     16
 1234567          |      4 | 2021-06-28 19:00:00 |      7
 123456789        |      3 | 2021-06-28 19:00:00 |      9
 12345678         |      3 | 2021-06-29 17:00:00 |      8
 12345679999999   |      5 | 2021-06-30 20:00:00 |     14
(7 rows)

现在,如果我正确阅读了您的查询,以下应该可以解决问题

select 
    count(case when code isnull or length(code) < 9 then scanid end) *100.0 / count(scanid) as all,
    count(case when length(code) = 12 then scanid end)* 100.0 / count(scanid) as bbb,
    count(case when length(code) = 9 then scanid end) * 100.0 / count(scanid) as ccc,
    count(case when length(code) > 12 then scanid end)* 100.0 / count(scanid) as ddd
from public.scan 
where scantimestamp > '2021-06-28' and scantimestamp < '2021-06-29';

有结果

         all         |          bbb           |         ccc         |         ddd         
---------------------+------------------------+---------------------+---------------------
 20.0000000000000000 | 0.00000000000000000000 | 20.0000000000000000 | 40.0000000000000000
(1 row)

如您所见,总和不是100%因为您所做的定义并未涵盖整个数据code长度范围。

如果您想将日期作为参数始终相对于当前日期,您可以CURRENT_DATE在查询中使用

select 
    count(case when code isnull or length(code) < 9 then scanid end) *100.0 / count(scanid) as all,
    count(case when length(code) = 12 then scanid end)* 100.0 / count(scanid) as bbb,
    count(case when length(code) = 9 then scanid end) * 100.0 / count(scanid) as ccc,
    count(case when length(code) > 12 then scanid end)* 100.0 / count(scanid) as ddd
from public.scan 
where scantimestamp > CURRENT_DATE -2 and  scantimestamp < CURRENT_DATE -1;

推荐阅读