sql - oracle中通过添加索引进行性能调优
问题描述
有一个查询我想提高性能。这些表有近 1000 万条记录。这两个表的现有索引如下所示。
DISB_A 表索引:
INDEX_OWNER INDEX_NAME TABLE_OWNER TABLE_NAME COLUMN_NAME COLUMN_POSITION COLUMN_LENGTH CHAR_LENGTH DESCEND COLLATED_COLUMN_ID
-------------------------------------------------------------------------------------------------------------------------------------------
DEVL DISB_A_INDX1 DEVL DISB_A V_ID 1 22 0 ASC
DEVL DISB_A_INDX1 DEVL DISB_A SEQNBR 2 22 0 ASC
DEVL DISB_A_INDX2 DEVL DISB_A ND_ID 1 22 0 ASC
DEVL DISB_A_INDX2 DEVL DISB_A DUE_DATE 2 7 0 ASC
DEVL DISB_A_INDX3 DEVL DISB_A WL_ID 1 22 0 ASC
DEVL DISB_A_INDX4 DEVL DISB_A CODE 1 8 8 ASC
DEVL DISB_A_PRIME DEVL DISB_A ID 1 22 0 ASC
DISB_B 表索引:
INDEX_OWNER INDEX_NAME TABLE_OWNER TABLE_NAME COLUMN_NAME COLUMN_POSITION COLUMN_LENGTH CHAR_LENGTH DESCEND COLLATED_COLUMN_ID
-------------------------------------------------------------------------------------------------------------------------------------
DEVL DISB_B_INDX1 DEVL DISB_B DSBA_ID 1 22 0 ASC
DEVL DISB_B_INDX1 DEVL DISB_B SEQNBR 2 22 0 ASC
DEVL DISB_B_PRIME DEVL DISB_B DSBA_ID 1 22 0 ASC
DEVL DISB_B_PRIME DEVL DISB_B ND_ID 2 22 0 ASC
DEVL DISB_B_PRIME DEVL DISB_B A_ID 3 13 13 ASC
DEVL DISB_B_PRIME DEVL DISB_B DIO_ID 4 6 6 ASC
DEVL DISB_B_PRIME DEVL DISB_B AIO_QUAL 5 22 0 ASC
DEVL DISB_B_PRIME DEVL DISB_B DMT_CODE 6 3 3 ASC
DEVL DISB_B_PRIME DEVL DISB_B DMT_SEQNBR 7 22 0 ASC
DEVL DISB_B_PRIME DEVL DISB_B ORDER_SEQNBR 8 22 0 ASC
DEVL DISB_B_PRIME DEVL DISB_B NBR 9 20 20 ASC
当我检查解释计划时,使用现有索引,成本非常高,并且记录过去需要接近 1 秒到 2 秒才能显示,并且操作曾经是全表扫描。请在下面找到详细信息。
SQL> explain plan
for
SELECT SUM ( NVL ( dd.req_amt , 0 ) )
FROM DISB_A db ,
DISB_B dd
WHERE db.id = dd.dsba_id
AND dd.nd_id = xxxxxxxx
AND dd.a_id = 'xx-xx'
AND DECODE (db.v_id , xxxxxxxxx , 'COMPLETE' , db.code ) = 'COMPLETE'
AND db.effdate BETWEEN TRUNC ( SYSDATE , 'YEAR' ) AND SYSDATE;
Explained.
Elapsed: 00:00:00.04
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
Plan hash value: 4272128008
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 58 | 62271 (1)| 00:00:03 |
| 1 | SORT AGGREGATE | | 1 | 58 | | |
|* 2 | FILTER | | | | | |
| 3 | NESTED LOOPS | | 1 | 58 | 62271 (1)| 00:00:03 |
| 4 | NESTED LOOPS | | 10 | 58 | 62271 (1)| 00:00:03 |
|* 5 | TABLE ACCESS STORAGE FULL | DISB_A | 10 | 300 | 62231 (1)| 00:00:03 |
|* 6 | INDEX RANGE SCAN | DISB_B_PRIME | 1 | | 3 (0)| 00:00:01 |
| 7 | TABLE ACCESS BY INDEX ROWID| DISB_B | 1 | 28 | 4 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(SYSDATE@!>=TRUNC(SYSDATE@!,'fmyear'))
5 - storage(DECODE("DB"."EV_ID",1263421688,'COMPLETE',"DB"."STATUS_CODE")='COMPLETE'
AND "DB"."EFFDATE">=TRUNC(SYSDATE@!,'fmyear') AND "DB"."EFFDATE"<=SYSDATE@!)
filter(DECODE("DB"."EV_ID",1263421688,'COMPLETE',"DB"."STATUS_CODE")='COMPLETE'
AND "DB"."EFFDATE">=TRUNC(SYSDATE@!,'fmyear') AND "DB"."EFFDATE"<=SYSDATE@!)
6 - access("DB"."ID"="DD"."DSBA_ID" AND "DD"."IND_ID"=20972265 AND
"DD"."GA_ID"='150563-01')
25 rows selected.
Elapsed: 00:00:00.02
SQL> SELECT SUM ( NVL ( dd.req_amt , 0 ) )
FROM DISB_A db ,
DISB_B dd
WHERE db.id = dd.dsba_id
AND dd.nd_id = xxxxxxxx
AND dd.a_id = 'xx-xx'
AND DECODE (db.v_id , xxxxxxxxx , 'COMPLETE' , db.code ) = 'COMPLETE'
AND db.effdate BETWEEN TRUNC ( SYSDATE , 'YEAR' ) AND SYSDATE;
SUM(NVL(DD.REQ_AMT,0))
______________________
62500
**1 row selected.
Elapsed: 00:00:01.65**
所以在上一篇文章中,我被要求添加另一个索引,这样它会降低成本并且查询可能会更快。请在下面找到我添加的索引。
**CREATE INDEX DISB_A_INDX5 ON DISB_A (ID,IND_id,EV_ID,status_code,EFFDATE ASC) NOPARALLEL;**
PLAN_TABLE_OUTPUT
Plan hash value: 2535999045
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 58 | 18669 (1)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 58 | | |
|* 2 | FILTER | | | | | |
| 3 | NESTED LOOPS | | 1 | 58 | 18669 (1)| 00:00:01 |
| 4 | NESTED LOOPS | | 35 | 58 | 18669 (1)| 00:00:01 |
|* 5 | INDEX STORAGE FAST FULL SCAN| DISB_A_INDX5 | 35 | 1050 | 18598 (1)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | DISB_B_PRIME | 1 | | 3 (0)| 00:00:01 |
| 7 | TABLE ACCESS BY INDEX ROWID | DISB_B | 1 | 28 | 4 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(SYSDATE@!>=TRUNC(SYSDATE@!,'fmyear'))
5 - storage(DECODE("DB"."EV_ID",1263421688,'COMPLETE',"DB"."STATUS_CODE")='COMPLETE' AND
"DB"."EFFDATE"<=SYSDATE@! AND "DB"."EFFDATE">=TRUNC(SYSDATE@!,'fmyear'))
filter(DECODE("DB"."EV_ID",1263421688,'COMPLETE',"DB"."STATUS_CODE")='COMPLETE' AND
"DB"."EFFDATE"<=SYSDATE@! AND "DB"."EFFDATE">=TRUNC(SYSDATE@!,'fmyear'))
6 - access("DB"."ID"="DD"."DSBA_ID" AND "DD"."IND_ID"=20972265 AND
"DD"."GA_ID"='150563-01')
25 rows selected.
Elapsed: 00:00:00.03
SQL> SELECT SUM ( NVL ( dd.req_amt , 0 ) )
FROM DISB_A db ,
DISB_B dd
WHERE db.id = dd.dsba_id
AND dd.nd_id = xxxxxxxx
AND dd.a_id = 'xx-xx'
AND DECODE (db.v_id , xxxxxxxxx , 'COMPLETE' , db.code ) = 'COMPLETE'
AND db.effdate BETWEEN TRUNC ( SYSDATE , 'YEAR' ) AND SYSDATE;
SUM(NVL(DD.REQ_AMT,0))
______________________
62500
**1 row selected.
Elapsed: 00:00:05.45**
问题 :
当成本实际上降低时,查询应该来得更快。但是查询需要 6 秒才能得到结果。您能否解释一下我是否做错了什么/有没有其他方法可以提高性能。
在添加附加索引之前,在 FULL TABLE SCAN 期间选取的行数为 10 行,但在添加附加索引后,行数从 10 跃升至 35。
解决方案
基于这样的假设,即扩展可能不会通过仅替换解码而自行发生。并假设驱动表应该是DISB_A
(驱动查询会容易得多,DISB_B
所以让我们知道独立过滤器DISB_B
是否能有效减少访问的行数)。我会将查询重写为(可能有错别字):
select sum(req_amt)
from (
SELECT SUM ( dd.req_amt ) req_amt
FROM DISB_A db ,
DISB_B dd
WHERE db.id = dd.dsba_id
AND dd.nd_id = :xxxxxxxx
AND dd.a_id = 'xx-xx'
AND db.effdate BETWEEN TRUNC ( SYSDATE , 'YEAR' ) AND SYSDATE
and db.v_id = :xxxxxxxxx
union all
SELECT SUM ( dd.req_amt ) req_amt
FROM DISB_A db ,
DISB_B dd
WHERE db.id = dd.dsba_id
AND dd.nd_id = :xxxxxxxx
AND dd.a_id = 'xx-xx'
AND db.effdate BETWEEN TRUNC ( SYSDATE , 'YEAR' ) AND SYSDATE
and (db.v_id is null and :xxxxxxxxx is null)
union all
SELECT SUM ( dd.req_amt ) req_amt
FROM DISB_A db ,
DISB_B dd
WHERE db.id = dd.dsba_id
AND dd.nd_id = :xxxxxxxx
AND dd.a_id = 'xx-xx'
AND db.effdate BETWEEN TRUNC ( SYSDATE , 'YEAR' ) AND SYSDATE
and db.code = 'COMPLETE'
and lnnvl(db.v_id = :xxxxxxxxx )
)
并创建以下索引:
create index DISB_A_idx_v on DISB_A (v_id, effdate);
create index DISB_A_idx_code on DISB_A (code, effdate);
create index DISB_B_idx on DISB_B (nd_id,a_id,dsba_id);
如果这仍然相对较慢,那么就这些过滤器的实际选择性提供一些反馈会很有帮助。看看如何在计划中获取行源执行统计信息(https://ctandrewsayer.wordpress.com/2017/03/21/4-easy-lessons-to-enhance-your-performance-diagnostics/的第 4 点) 并将该信息包含在您的原始问题中。
推荐阅读
- javascript - 在学习教程时无法将前端连接到后端
- android - React Native npm run android 正在手机中安装旧的 apk 并寻找用于生成离线 apk 的自动化工具
- octobercms - 根据类别 1 设置唯一验证
- android - Android 无法使用签名的 url 将文件上传到谷歌云存储
- c - 关于猜词游戏制作高分表的问题
- .net-core - 如何将 net core 项目发布到 AnyCPU?
- php - 如何在我的 php 代码上添加路径目录,因为当我运行代码并上传文件时,我在第 29 行收到错误
- c++ - c++ 三元运算符期望表达式C/C++(29)
- python - NameError:在封闭范围内赋值之前引用了自由变量“mywidget”
- javascript - Javascript - 从两个数组创建嵌套的对象数组