首页 > 解决方案 > MariaDB 中选择额外列时 SQL 响应缓慢

问题描述

注意以下两个查询,它们之间的区别被突出显示

查询一:

SELECT  tasks.TaskID, tasks.CardID, tasks.CritPath, tasks.ReworkCount,
        cs.WorkflowID,
         cs.StageCode,    -- This is added
        tasks.CurrentEscalationLevel,
        tasks.Title, tasks.Description, tasks.EscalationDelay,
        tasks.StartDate, tasks.EndDate, tasks.OriginalStartDate,
        tasks.OriginalEndDate, tasks.Priority, tasks.Duration,
        tasks.Status
    FROM  Tasks tasks
    INNER JOIN  CardsSettings cs  ON cs.CardID = tasks.TaskID
    INNER JOIN  
    (
        SELECT  t1.WorkflowID
            FROM  
            (
                SELECT  WorkflowID
                    from  Workflow
                    Where  IsWFActive = "YES"
                      and  LastUpdatedDateTime BETWEEN "2018-11-21 23:59:59" AND "2019-11-21 23:59:59"
                      AND  WorkflowTypeID = 9
            ) t1
            INNER JOIN  
            (
                SELECT  formSubCardSettings.WorkflowID, cfsm.Value as
                    Value
                     FROM  CardsSettings AS formSubCardSettings
                    INNER JOIN  custom_form_submissions cfs  ON cfs.FormSubmissionID = formSubCardSettings.CardID
                      AND  cfs.IsHistory = 'NO'
                    INNER JOIN  custom_form cf  ON cf.FormID = cfs.FormID
                    INNER JOIN  custom_form_metadata cfm  ON cfm.FormID = cf.FormID
                    INNER JOIN  custom_form_submissions_metadata cfsm  ON cfsm.FormSubmissionID = cfs.FormSubmissionID
                      AND  cfsm.FormMetaID = cfm.FormMetaID
                    INNER JOIN  Workflow cfwf  ON cfwf.WorkflowID = formSubCardSettings.WorkflowID
                    WHERE  cf.FormTitle = "Project Initiation"
                      AND  cfm.FieldLabel = "wid"
                      AND  cfwf.WorkflowTypeID = 9
                      AND  (cfsm.Value IN("413"))) t2  ON t1.WorkflowID = t2.WorkflowID
    ) a  ON a.WorkflowID = cs.WorkflowID;

查询 B:

选择 tasks.TaskID、tasks.CardID、tasks.CritPath、tasks.ReworkCount、cs.WorkflowID、tasks.CurrentEscalationLevel、tasks.Title、tasks.Description、tasks.EscalationDelay、tasks.StartDate、tasks.EndDate、tasks.OriginalStartDate、tasks .OriginalEndDate,tasks.Priority,tasks.Duration,tasks.Status FROM Tasks tasks INNER JOIN CardsSettings cs ON cs.CardID = tasks.TaskID INNER JOIN ( SELECT t1.WorkflowID FROM (SELECT WorkflowID from Workflow Where IsWFActive = "YES" and LastUpdatedDateTime在“2018-11-21 23:59:59”和“2019-11-21 23:59:59”之间且 WorkflowTypeID = 9) t1 INNER JOIN (SELECT formSubCardSettings.WorkflowID, cfsm.Value as Value FROM CardsSettings AS formSubCardSettings INNER加入 custom_form_submissions cfs ON cfs.FormSubmissionID = formSubCardSettings.CardID AND cfs。IsHistory = 'NO' INNER JOIN custom_form cf ON cf.FormID = cfs.FormID INNER JOIN custom_form_metadata cfm ON cfm.FormID = cf.FormID INNER JOIN custom_form_submissions_metadata cfsm ON cfsm.FormSubmissionID = cfs.FormSubmissionID AND cfsm.FormMetaID = cfm.FormMetaID INNER加入工作流 cfwf ON cfwf.WorkflowID = formSubCardSettings.WorkflowID WHERE cf.FormTitle = "Project Initiation" AND cfm.FieldLabel = "wid" AND cfwf.WorkflowTypeID = 9 AND (cfsm.Value IN("413"))) t2 ON t1 .WorkflowID = t2.WorkflowID) a ON a.WorkflowID = cs.WorkflowID;FormMetaID = cfm.FormMetaID INNER JOIN 工作流 cfwf ON cfwf.WorkflowID = formSubCardSettings.WorkflowID WHERE cf.FormTitle = "项目启动" AND cfm.FieldLabel = "wid" AND cfwf.WorkflowTypeID = 9 AND (cfsm.Value IN("413" ))) t2 ON t1.WorkflowID = t2.WorkflowID) a ON a.WorkflowID = cs.WorkflowID;FormMetaID = cfm.FormMetaID INNER JOIN 工作流 cfwf ON cfwf.WorkflowID = formSubCardSettings.WorkflowID WHERE cf.FormTitle = "项目启动" AND cfm.FieldLabel = "wid" AND cfwf.WorkflowTypeID = 9 AND (cfsm.Value IN("413" ))) t2 ON t1.WorkflowID = t2.WorkflowID) a ON a.WorkflowID = cs.WorkflowID;

StageCode列是 varchar(1024)。

我在两个独立的数据库引擎上有完全相同的数据库和架构,一个是 mysql (5.6),另一个是 mariadb (10.4)

在 mariadb 上运行查询会导致非常不同的响应时间:

查询 A:大约 5 秒

查询 B:约 0.2 秒

只需从 cs 添加一个额外的字段,响应时间就会增加很多倍。这不是一次性行为,因为它每次都会发生。另请注意,两个查询都选择了 cs.WorkflowID。

在 mysql 上运行上述查询会产生快速响应,两者都在不到 0.1 秒内返回。

两个数据库引擎上有问题的查询 A 的解释如下:

mysql:

+----+-------------+---------------------+--------+----------------------------+------------------+---------+------------------------------------------------------+-------+-------------+
| id | select_type | table               | type   | possible_keys              | key              | key_len | ref                                                  | rows  | Extra       |
+----+-------------+---------------------+--------+----------------------------+------------------+---------+------------------------------------------------------+-------+-------------+
|  1 | PRIMARY     | tasks               | ALL    | PRIMARY                    | NULL             | NULL    | NULL                                                 |  3091 | NULL        |
|  1 | PRIMARY     | cs                  | eq_ref | PRIMARY,FK_CS_WFID_WF_WFID | PRIMARY          | 4       | zestlTitan_3000037868.tasks.TaskID                   |     1 | Using where |
|  1 | PRIMARY     | <derived2>          | ref    | <auto_key0>                | <auto_key0>      | 4       | zestlTitan_3000037868.cs.WorkflowID                  |    10 | Using index |
|  2 | DERIVED     | <derived3>          | ALL    | NULL                       | NULL             | NULL    | NULL                                                 |    81 | NULL        |
|  2 | DERIVED     | <derived4>          | ref    | <auto_key0>                | <auto_key0>      | 5       | t1.WorkflowID                                        |   715 | NULL        |
|  4 | DERIVED     | cfs                 | ALL    | PRIMARY,FormID             | NULL             | NULL    | NULL                                                 | 17888 | Using where |
|  4 | DERIVED     | cf                  | eq_ref | PRIMARY                    | PRIMARY          | 4       | zestlTitan_3000037868.cfs.FormID                     |     1 | Using where |
|  4 | DERIVED     | formSubCardSettings | eq_ref | PRIMARY,FK_CS_WFID_WF_WFID | PRIMARY          | 4       | zestlTitan_3000037868.cfs.FormSubmissionID           |     1 | Using where |
|  4 | DERIVED     | cfwf                | eq_ref | PRIMARY,FK_WTID_WFT_WTID   | PRIMARY          | 4       | zestlTitan_3000037868.formSubCardSettings.WorkflowID |     1 | Using where |
|  4 | DERIVED     | cfsm                | ref    | FormSubmissionID           | FormSubmissionID | 4       | zestlTitan_3000037868.cfs.FormSubmissionID           |     4 | Using where |
|  4 | DERIVED     | cfm                 | eq_ref | PRIMARY,FormID             | PRIMARY          | 4       | zestlTitan_3000037868.cfsm.FormMetaID                |     1 | Using where |
|  3 | DERIVED     | Workflow            | ref    | FK_WTID_WFT_WTID           | FK_WTID_WFT_WTID | 4       | const                                                |    81 | Using where |
+----+-------------+---------------------+--------+----------------------------+------------------+---------+------------------------------------------------------+-------+-------------+
12 rows in set (0.00 sec)

在 mariadb 上:

+------+-------------+---------------------+--------+----------------------------+--------------------+---------+--------------------------------------------------+------+-------------+
| id   | select_type | table               | type   | possible_keys              | key                | key_len | ref                                              | rows | Extra       |
+------+-------------+---------------------+--------+----------------------------+--------------------+---------+--------------------------------------------------+------+-------------+
|    1 | SIMPLE      | tasks               | ALL    | PRIMARY                    | NULL               | NULL    | NULL                                             | 3072 |             |
|    1 | SIMPLE      | cs                  | eq_ref | PRIMARY,FK_CS_WFID_WF_WFID | PRIMARY            | 4       | zestlTitan_3000037868.tasks.TaskID               | 1    | Using where |
|    1 | SIMPLE      | Workflow            | eq_ref | PRIMARY,FK_WTID_WFT_WTID   | PRIMARY            | 4       | zestlTitan_3000037868.cs.WorkflowID              | 1    | Using where |
|    1 | SIMPLE      | cfwf                | eq_ref | PRIMARY,FK_WTID_WFT_WTID   | PRIMARY            | 4       | zestlTitan_3000037868.cs.WorkflowID              | 1    | Using where |
|    1 | SIMPLE      | formSubCardSettings | ref    | PRIMARY,FK_CS_WFID_WF_WFID | FK_CS_WFID_WF_WFID | 5       | zestlTitan_3000037868.cs.WorkflowID              | 219  | Using index |
|    1 | SIMPLE      | cfs                 | eq_ref | PRIMARY,FormID             | PRIMARY            | 4       | zestlTitan_3000037868.formSubCardSettings.CardID | 1    | Using where |
|    1 | SIMPLE      | cf                  | eq_ref | PRIMARY                    | PRIMARY            | 4       | zestlTitan_3000037868.cfs.FormID                 | 1    | Using where |
|    1 | SIMPLE      | cfsm                | ref    | FormSubmissionID           | FormSubmissionID   | 4       | zestlTitan_3000037868.formSubCardSettings.CardID | 4    | Using where |
|    1 | SIMPLE      | cfm                 | eq_ref | PRIMARY,FormID             | PRIMARY            | 4       | zestlTitan_3000037868.cfsm.FormMetaID            | 1    | Using where |
+------+-------------+---------------------+--------+----------------------------+--------------------+---------+--------------------------------------------------+------+-------------+
9 rows in set (0.002 sec)

有人可以确定出了什么问题吗?两台机器都有相似的资源(RAM/磁盘)

编辑1:

如果我使用另一个 CardsSettings(cs) 字段而不是 StageCode(Varchar 1024),则会出现相同的奇怪行为,该字段是 tinyint 或 int(10)。所以我怀疑这与长列有关。但是,CardsSettings 表中有几个 Text 列,还有几个 varchar(1024) 和一个 varchar(4096)。

编辑2:

单独查询 t2 的子查询仅需 0.05 秒:

SELECT formSubCardSettings.WorkflowID FROM CardsSettings AS formSubCardSettings INNER JOIN custom_form_submissions cfs ON cfs.FormSubmissionID = formSubCardSettings.CardID AND cfs.IsHistory = 'NO' INNER JOIN custom_form cf ON cf.FormID = cfs.FormID INNER JOIN custom_form_metadata cfm ON cfm.FormID = cf.FormID INNER JOIN custom_form_submissions_metadata cfsm ON cfsm.FormSubmissionID = cfs.FormSubmissionID AND cfsm.FormMetaID = cfm.FormMetaID INNER JOIN Workflow cfwf ON cfwf.WorkflowID = formSubCardSettings.WorkflowID WHERE cf.FormTitle = "Project Initiation" AND cfm.FieldLabel = "wid" AND cfwf.WorkflowTypeID = 9 AND (cfsm.Value IN("413"));
+------------+
| WorkflowID |
+------------+
|        413 |
+------------+
1 row in set (0.056 sec)

如果我只是在原始查询“A”中替换整个子查询,那么查询 A 变为:

MariaDB [zestlTitan_3000037868]> 选择 tasks.TaskID,tasks.CardID,tasks.CritPath,tasks.ReworkCount,cs.WorkflowID,cs.StageCode,tasks.CurrentEscalationLevel,tasks.Title,tasks.Description,tasks.EscalationDelay,tasks.StartDate, tasks.EndDate,tasks.OriginalStartDate,tasks.OriginalEndDate,tasks.Priority,tasks.Duration,tasks.Status FROM Tasks 任务 INNER JOIN CardsSettings cs ON cs.CardID = tasks.TaskID INNER JOIN ( SELECT t1.WorkflowID FROM (SELECT WorkflowID from IsWFActive = "YES" and LastUpdatedDateTime BETWEEN "2018-11-21 23:59:59" and "2019-11-21 23:59:59" AND WorkflowTypeID = 9) t1 INNER JOIN ( SELECT "413" as WorkflowID ) t2 ON t1.WorkflowID = t2.WorkflowID) a ON a.WorkflowID = cs.WorkflowID;

这现在在 mariadb 上运行得非常快(~0.1 秒)。

编辑 3:

编辑 2 给了我一些关于为什么会发生这种情况的线索,显然子查询 t2 没有被先验计算(我可能错了)。所以我修改了查询以使用 WHERE t1.WorkflowID IN (...)t2 而不是 INNER JOIN,如下所示:

选择tasks.TaskID,tasks.CardID,tasks.CritPath,tasks.ReworkCount,cs.WorkflowID,cs.StageCode,tasks.CurrentEscalationLevel,tasks.Title,tasks.Description,tasks.EscalationDelay,tasks.StartDate,tasks.EndDate,tasks .OriginalStartDate,tasks.OriginalEndDate,tasks.Priority,tasks.Duration,tasks.Status FROM Tasks 任务 INNER JOIN CardsSettings cs ON cs.CardID = tasks.TaskID INNER JOIN ( SELECT t1.WorkflowID FROM (SELECT WorkflowID from Workflow Where IsWFActive = " YES" 和 LastUpdatedDateTime BETWEEN "2018-11-21 23:59:59" 和 "2019-11-21 23:59:59" AND WorkflowTypeID = 9) t1 WHERE t1.WorkflowID IN (从 CardsSettings 中选择 formSubCardSettings.WorkflowID 作为 formSubCardSettings INNER JOIN custom_form_submissions cfs ON cfs.FormSubmissionID = formSubCardSettings.CardID AND cfs.IsHistory = 'NO' INNER JOIN custom_form cf ON cf.FormID = cfs.FormID INNER JOIN custom_form_metadata cfm ON cfm.FormID = cf .FormID INNER JOIN custom_form_submissions_metadata cfsm ON cfsm.FormSubmissionID = cfs.FormSubmissionID AND cfsm.FormMetaID = cfm.FormMetaID INNER JOIN Workflow cfwf ON cfwf.WorkflowID = formSubCardSettings.WorkflowID WHERE cf.FormTitle = "Project Initiation" AND cfm.FieldLabel = "wid " AND cfwf.WorkflowTypeID = 9 AND (cfsm.Value IN("413")) ) a ON a.WorkflowID = cs.WorkflowID;

现在,mariadb 查询也可以快速运行,大约 0.2 秒。

虽然通过这次尝试和尝试,我已经设法使查询快速响应,但有人可以尝试从整件事中弄清楚吗?

标签: mysqlmariadbresponse-time

解决方案


当一行“太大”时,一些列的值被写入“非记录”存储。重新读取行 ( SELECTing) 需要从其他地方获取。如果数据没有缓存在 RAM 中,那么这是另一个磁盘命中。磁盘命中需要时间。

听起来表格有很多列?很多TEXT还是很大VARCHARs?这可以解释第一个查询的迟缓。

MySQL 和 MariaDB 的优化器存在分歧的地方之一是处理子查询。

额外的列还有两个可能的问题......

  • 它作为查询的第二个(?)表被提取,并且需要在查询的其余部分被拖拉。
  • 如果需要一个中间临时表(不能从 中看出EXPLAIN),它可能会被强制为 MyISAM,而不是 MEMORY。这是因为大VARCHAR(1024),即使实际数据很小。MyISAM 比 MEMORY 慢。(如果其他一些列是“大”,那么 MyISAM 已经是必需的;所以这个答案不适用。)

这个结构特别棘手,并且处理方式不同:

SELECT ...
    FROM ( SELECT ... )
    JOIN ( SELECT ... )

MySQL 经常具体化派生表并动态决定它需要什么索引并构建该索引。见<auto-key>EXPLAIN

MariaDB 以不同的方式处理子查询。

可能有一些索引会有所帮助。

        Where  IsWFActive = "YES"
          and  LastUpdatedDateTime BETWEEN "2018-11-21 23:59:59" AND "2019-11-21 23:59:59"
          AND  WorkflowTypeID = 9

需要

INDEX(IsWFActive, WorkflowTypeID,   -- in either order
      LastUpdatedDateTime)          -- after the others

和:

ON a.WorkflowID = cs.WorkflowID
ON cs.CardID = tasks.TaskID

需要这个,以便它可以从第二部分开始,然后回到第一个表:

CardsSettings:  INDEX(WorkflowID)
Tasks:          INDEX(TaskID)   -- unless that is the PRIMARY KEY

最后一个建议可能会消除我之前提到的“大柱子的拖拉”。

更多的

这应该使查询在所有(?)情况下运行得更快:

INDEX(WorkflowID,    -- first
      CardID, StageCode)   -- to make it covering (with or without stage

请注意,它会鼓励优化器首先放弃派生表(这通常是最佳的),然后移至csvia WorkflowID。最后,包括所需的其余列(“覆盖”)。

如需进一步讨论,请提供SHOW CREATE TABLE.


推荐阅读