首页 > 解决方案 > 同一张表上的 SQL 三重连接很慢

问题描述

我有一个如下所示的表格

 CREATE TABLE `xcpRush2_SandraTriplets` (
   `id` int(11) NOT NULL AUTO_INCREMENT,
  `idConceptStart` int(11) NOT NULL,
  `idConceptLink` int(11) NOT NULL,
  `idConceptTarget` int(11) NOT NULL,
  `flag` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `idx_name` (`idConceptStart`,`idConceptLink`,`idConceptTarget`),
  KEY `idConceptStart` (`idConceptStart`,`idConceptLink`,`idConceptTarget`),
  KEY `idConceptStart_4` (`idConceptStart`),
  KEY `idConceptTarget` (`idConceptTarget`),
  KEY `idConceptLink` (`idConceptLink`,`idConceptTarget`)
) ENGINE=InnoDB AUTO_INCREMENT=0 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

数据看起来像 db fiddle :https://www.db-fiddle.com/f/ejXP7qgvwNqAZeuaN3DFNz/3

如您所见,它在几列上完全索引。

在我的桌子上,我有大约 800k idConceptStart 满足条件

idConceptLink = 5 idConceptTarget = 14500 AND
idConceptLink = 3 idConceptLink = 14504 AND
idConceptLink = 12 idConceptLink = 11

当我执行这个查询

SELECT * FROM  xcpRush2_SandraTriplets l    
   JOIN  xcpRush2_SandraTriplets link1 ON link1.idConceptStart = l.idConceptStart  
   JOIN  xcpRush2_SandraTriplets link2 ON link2.idConceptStart = link1.idConceptStart

    WHERE 
       l.idConceptLink = 5  AND 
       l.idConceptTarget = 14500 AND 
       l.flag != 1 AND 

       link2.flag != 1 AND 
       link2.idConceptLink = 3 AND 
       link2.idConceptTarget = 14504 AND 

       link1.flag != 1 AND 
       link1.idConceptTarget = 12 AND 
       l.idConceptLink = 11  

    ORDER BY l.idConceptStart DESC  LIMIT 10 

这是SQL解释 SQL 解释

查询大约需要 30 秒 (!) 来呈现

但是如果我删除这个(只有这个)

 link2.idConceptLink = 3 AND link2.idConceptTarget =14504

然后查询需要 20 毫秒来呈现

    SELECT * FROM  xcpRush2_SandraTriplets l    
   JOIN  xcpRush2_SandraTriplets link1 ON link1.idConceptStart = l.idConceptStart  
   JOIN  xcpRush2_SandraTriplets link2 ON link2.idConceptStart = l.idConceptStart 
   WHERE 
      l.idConceptLink = 5 AND 
      l.idConceptTarget = 14500 AND 
      l.flag != 1  AND 

      link2.flag != 1 AND       

      link1.flag != 1 AND 
      link1.idConceptTarget = 12 AND 
      link1.idConceptLink = 11  

    ORDER BY l.idConceptStart DESC  LIMIT 10 

SQL 解释

我很困惑,因为该表已被索引idConceptLinkidConceptTarget并且单独进行的每个查询都非常快,可以渲染 < 20 ms

查询中的每个idConceptLink,idConceptTarget对都返回大量行(不仅是 link2.idConceptLink = 3 AND link2.idConceptTarget =14504)

你能帮我找出瓶颈吗?

编辑

在评论中有更多发现之后,问题似乎出在 ORDER BY 上。取决于我是否加入 l.idConceptStart 或 link1.idConceptStart 或 link2.idConceptStart 查询很慢。在我的实际情况下 ORDER BY link2.idConceptStart很慢。

索引结构如下

CREATE TABLE `xcpRush2_SandraTriplets` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `idConceptStart` int(11) NOT NULL,
  `idConceptLink` int(11) NOT NULL,
  `idConceptTarget` int(11) NOT NULL,
  `flag` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `idx_name` (`idConceptStart`,`idConceptLink`,`idConceptTarget`),
  KEY `idConceptStart` (`idConceptStart`),
  KEY `idConceptTarget` (`idConceptTarget`),
  KEY `idConceptLink` (`idConceptLink`,`idConceptTarget`)
) ENGINE=InnoDB AUTO_INCREMENT=5747878 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

索引是

xcprush2_sandratriplets 0   PRIMARY 1   id  A   5207892 NULL    NULL        BTREE       
xcprush2_sandratriplets 0   idx_name    1   idConceptStart  A   1243366 NULL    NULL        BTREE       
xcprush2_sandratriplets 0   idx_name    2   idConceptLink   A   5207936 NULL    NULL        BTREE       
xcprush2_sandratriplets 0   idx_name    3   idConceptTarget A   5207936 NULL    NULL        BTREE       
xcprush2_sandratriplets 1   idConceptStart  1   idConceptStart  A   1122352 NULL    NULL        BTREE       
xcprush2_sandratriplets 1   idConceptTarget 1   idConceptTarget A   123870  NULL    NULL        BTREE       
xcprush2_sandratriplets 1   idConceptLink   1   idConceptLink   A   5   NULL    NULL        BTREE       
xcprush2_sandratriplets 1   idConceptLink   2   idConceptTarget A   154480  NULL    NULL        BTREE

索引

当我这样做时查询很

 SELECT  l.idConceptStart, l.idConceptLink, l.`idConceptTarget` FROM  xcpRush2_SandraTriplets l  JOIN  xcpRush2_SandraTriplets link1 ON link1.idConceptStart = l.idConceptStart  JOIN  xcpRush2_SandraTriplets link2 ON link2.idConceptStart = l.idConceptStart 
    WHERE l.idConceptLink = 5  
    AND l.idConceptTarget = 14500
    AND l.flag != 1 
     AND link1.flag != 1 AND 
            link1.idConceptTarget =14504 AND link1.idConceptLink = 3 AND link2.flag != 1 AND 
            link2.idConceptTarget =12 AND link2.idConceptLink = 11  ORDER BY  link2.idConceptStart DESC  LIMIT 1000 OFFSET 0

这是解释结构

1   SIMPLE  link1   NULL    ref idx_name,idConceptStart,idConceptTarget,idConceptLink   idConceptTarget 4   const   1611256 18.00   Using where; Using temporary; Using filesort
1   SIMPLE  l   NULL    eq_ref  idx_name,idConceptStart,idConceptTarget,idConceptLink   idx_name    12  sandra.link1.idConceptStart,const,const 1   90.00   Using where
1   SIMPLE  link2   NULL    eq_ref  idx_name,idConceptStart,idConceptTarget,idConceptLink   idx_name    12  sandra.link1.idConceptStart,const,const 1   90.00   Using where

慢解释

当我这样做时查询很快

SELECT  l.idConceptStart, l.idConceptLink, l.`idConceptTarget` FROM  xcpRush2_SandraTriplets l  JOIN  xcpRush2_SandraTriplets link1 ON link1.idConceptStart = l.idConceptStart  JOIN  xcpRush2_SandraTriplets link2 ON link2.idConceptStart = l.idConceptStart 
    WHERE l.idConceptLink = 5  
    AND l.idConceptTarget = 14500
    AND l.flag != 1 
     AND link1.flag != 1 AND 
            link1.idConceptTarget =14504 AND link1.idConceptLink = 3 AND link2.flag != 1 AND 
            link2.idConceptTarget =12 AND link2.idConceptLink = 11  ORDER BY  l.idConceptStart DESC  LIMIT 1000 OFFSET 0

这是解释结构

   1    SIMPLE  l   NULL    index   idx_name,idConceptStart,idConceptTarget,idConceptLink   idConceptStart  4   NULL    13036   3.08    Using where
1   SIMPLE  link1   NULL    eq_ref  idx_name,idConceptStart,idConceptTarget,idConceptLink   idx_name    12  sandra.l.idConceptStart,const,const 1   90.00   Using where
1   SIMPLE  link2   NULL    eq_ref  idx_name,idConceptStart,idConceptTarget,idConceptLink   idx_name    12  sandra.l.idConceptStart,const,const 1   90.00   Using where

快速解释

编辑 2

要排序的最佳表似乎是随机的。现在我在几个小时后运行了相同的查询(发生了一些插入),但是使用相同的查询解决键顺序的结构发生了变化。快查询变成慢查询,慢查询变成快查询。如果我 ORDER BY l.idConceptStart 以下说明

解释编辑 2

键的表解析顺序似乎是随机的。我完全迷路了。最后,我唯一需要的是先获取最后一个数据库条目

标签: mysqlsqlperformance

解决方案


“完全索引”——不。你有一些索引,包括一些冗余的。

这可能是您查询的最佳索引:

INDEX(link, target, start)

我们来谈谈flag。它有多少不同的值?如果只有 2(例如,0 和 1),则更改为flag = 0而不是flag != 1。优化器在处理=测试方面比!=. 并更改为INDEX(link, target, flag, start).

flag=1 的行的百分比是多少?这可能会导致更多的想法。

你有UNIQUE钥匙,外加一个代理人id?您是否id从任何其他表格中引用?如果没有,请摆脱它,并将其提升UNIQUEPRIMARY KEY. 但那时,我希望重新排列该 PK 中的列以符合我的建议。

建立索引的一些规则:

  • 将使用=first (linktarget, 以任意顺序测试的列)
  • 如果您希望索引也包含列,最好让索引完全处理WHERE(正在停止)。如果还有.!=ORDER BYLIMIT
  • UNIQUE(a,b,c)排除了需要INDEX(a,b,c)
  • INDEX(a,b)排除了对INDEX(a).
  • 更多:http: //mysql.rjweb.org/doc.php/index_cookbook_mysql

推荐阅读