How does H2 choose the right/wrong index in a join?

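Problem description

I ran into a problem with a named query in Java, but the actual problem lies in H2.

I thought ANALYZE was my solution and would fix the problem. It did on my local development machine. On the customer's machine, it actually made things worse.

Scenario: I have an H2 database with data version 105. After importing more data, it is at version 106.

The table looks like this: [image of the table definition not available]

Query (fetches the rows with the given GUIDs and locale that have the highest version):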

SELECT tdo.TECDOC_GUID as guid, tdo.TECDOC_LOCALE as locale , tdo.TECDOC_VERSION as version, tdo.DATA as data
FROM TECDOC_OBJECTS tdo
LEFT OUTER JOIN TECDOC_OBJECTS tdo1
ON (
    tdo.TECDOC_GUID = tdo1.TECDOC_GUID AND 
    tdo.TECDOC_LOCALE = tdo1.TECDOC_LOCALE AND 
    tdo.TECDOC_VERSION < tdo1.TECDOC_VERSION)
WHERE tdo1.id IS NULL 
AND tdo.TECDOC_GUID in ('GUID-F2F77CE5-D8F5-4286-9A30-8FD500F735F6', 'GUID-41FD28DC-63C0-44D0-B8AE-0FCF7C78CEB0')
AND tdo.TECDOC_LOCALE = 'de';

Execution plan before I ran the ANALYZE command (scanCount is very low):

SELECT
    TDO.TECDOC_GUID AS GUID,
    TDO.TECDOC_LOCALE AS LOCALE,
    TDO.TECDOC_VERSION AS VERSION,
    TDO.DATA AS DATA
FROM PUBLIC.TECDOC_OBJECTS TDO
    /* PUBLIC.IDX_TECDOC_GUID: TECDOC_GUID IN('GUID-F2F77CE5-D8F5-4286-9A30-8FD500F735F6', 'GUID-41FD28DC-63C0-44D0-B8AE-0FCF7C78CEB0') */
    /* WHERE (TDO.TECDOC_GUID IN('GUID-F2F77CE5-D8F5-4286-9A30-8FD500F735F6', 'GUID-41FD28DC-63C0-44D0-B8AE-0FCF7C78CEB0'))
        AND (TDO.TECDOC_LOCALE = 'de')
    */
    /* scanCount: 19 */
LEFT OUTER JOIN PUBLIC.TECDOC_OBJECTS TDO1
    /* PUBLIC.IDX_GUID_LOCALE_VERSION: TECDOC_GUID = TDO.TECDOC_GUID
        AND TECDOC_LOCALE = TDO.TECDOC_LOCALE
        AND TECDOC_VERSION > TDO.TECDOC_VERSION
     */
    ON (TDO.TECDOC_VERSION < TDO1.TECDOC_VERSION)
    AND ((TDO.TECDOC_GUID = TDO1.TECDOC_GUID)
    AND (TDO.TECDOC_LOCALE = TDO1.TECDOC_LOCALE))
    /* scanCount: 4 */
WHERE (TDO.TECDOC_LOCALE = 'de')
    AND ((TDO.TECDOC_GUID IN('GUID-F2F77CE5-D8F5-4286-9A30-8FD500F735F6', 'GUID-41FD28DC-63C0-44D0-B8AE-0FCF7C78CEB0'))
    AND (TDO1.ID IS NULL))
/*
total: 37
TECDOC_OBJECTS.IDX_GUID_LOCALE_VERSION read: 6 (16%)
TECDOC_OBJECTS.IDX_TECDOC_GUID read: 8 (21%)
TECDOC_OBJECTS.TECDOC_OBJECTS_DATA read: 23 (62%)
*/

Execution plan after I ran the ANALYZE command (scanCount is really high):

SELECT
    TDO.TECDOC_GUID AS GUID,
    TDO.TECDOC_LOCALE AS LOCALE,
    TDO.TECDOC_VERSION AS VERSION,
    TDO.DATA AS DATA
FROM PUBLIC.TECDOC_OBJECTS TDO
    /* PUBLIC.IDX_GUID_LOCALE_VERSION: TECDOC_LOCALE = 'de'
        AND TECDOC_GUID IN('GUID-F2F77CE5-D8F5-4286-9A30-8FD500F735F6', 'GUID-41FD28DC-63C0-44D0-B8AE-0FCF7C78CEB0')
     */
    /* WHERE (TDO.TECDOC_GUID IN('GUID-F2F77CE5-D8F5-4286-9A30-8FD500F735F6', 'GUID-41FD28DC-63C0-44D0-B8AE-0FCF7C78CEB0'))
        AND (TDO.TECDOC_LOCALE = 'de')
    */
    /* scanCount: 287385 */
LEFT OUTER JOIN PUBLIC.TECDOC_OBJECTS TDO1
    /* PUBLIC.IDX_GUID_LOCALE_VERSION: TECDOC_GUID = TDO.TECDOC_GUID
        AND TECDOC_LOCALE = TDO.TECDOC_LOCALE
        AND TECDOC_VERSION > TDO.TECDOC_VERSION
     */
    ON (TDO.TECDOC_VERSION < TDO1.TECDOC_VERSION)
    AND ((TDO.TECDOC_GUID = TDO1.TECDOC_GUID)
    AND (TDO.TECDOC_LOCALE = TDO1.TECDOC_LOCALE))
    /* scanCount: 4 */
WHERE (TDO.TECDOC_LOCALE = 'de')
    AND ((TDO.TECDOC_GUID IN('GUID-F2F77CE5-D8F5-4286-9A30-8FD500F735F6', 'GUID-41FD28DC-63C0-44D0-B8AE-0FCF7C78CEB0'))
    AND (TDO1.ID IS NULL))
/*
total: 11891
TECDOC_OBJECTS.IDX_GUID_LOCALE_VERSION read: 11884 (99%)
TECDOC_OBJECTS.TECDOC_OBJECTS_DATA read: 7 (0%)
*/

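On my developer laptop, however, the query is still fast after ANALYZE. Somehow H2 uses the wrong index (according to the documentation, it can use only one index per join).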
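One thing that could be checked here, as a guess rather than a confirmed diagnosis: by default, H2's ANALYZE reads only a sample of up to 10000 rows per table, so the selectivity statistics it stores can come out differently on two databases with different contents. A minimal sketch that recomputes the statistics from all rows, so both machines can be compared on the same footing:

```sql
-- Recompute the selectivity statistics for all tables.
-- SAMPLE_SIZE 0 means "read all rows" instead of the default sample.
ANALYZE SAMPLE_SIZE 0;

-- Afterwards, run the query again under EXPLAIN ANALYZE and check which
-- index is named in the comment after "FROM PUBLIC.TECDOC_OBJECTS TDO".
```

Whether this changes the chosen index on the customer's machine is an open question; it only rules out sampling as the source of the difference.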

Does anyone have any suggestions?

Tags: performance, join, h2

Solution

Your query is not complicated. I think its key aspect is the WHERE condition.

WHERE tdo1.id IS NULL 
  AND tdo.TECDOC_GUID in ('GUID-F2F77CE5-D8F5-4286-9A30-8FD500F735F6',
    'GUID-41FD28DC-63C0-44D0-B8AE-0FCF7C78CEB0')
  AND tdo.TECDOC_LOCALE = 'de';

For some reason, H2 is using the indexes in the wrong way. I would try rephrasing this condition to see how H2's SQL optimizer reacts.
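To see how the optimizer reacts to a given phrasing, the statement can be prefixed with H2's EXPLAIN ANALYZE, which executes the query and prints the plan, including the index chosen for each table and its scanCount, in the same format as the plans in the question. A minimal sketch using the original query, with the table and column names taken from the question:

```sql
-- Run the query and print the plan with the chosen index and scanCount.
EXPLAIN ANALYZE
SELECT tdo.TECDOC_GUID AS guid, tdo.TECDOC_LOCALE AS locale,
       tdo.TECDOC_VERSION AS version, tdo.DATA AS data
FROM TECDOC_OBJECTS tdo
LEFT OUTER JOIN TECDOC_OBJECTS tdo1
  ON (tdo.TECDOC_GUID = tdo1.TECDOC_GUID
      AND tdo.TECDOC_LOCALE = tdo1.TECDOC_LOCALE
      AND tdo.TECDOC_VERSION < tdo1.TECDOC_VERSION)
WHERE tdo1.id IS NULL
  AND tdo.TECDOC_GUID IN ('GUID-F2F77CE5-D8F5-4286-9A30-8FD500F735F6',
                          'GUID-41FD28DC-63C0-44D0-B8AE-0FCF7C78CEB0')
  AND tdo.TECDOC_LOCALE = 'de';
```

Running each rewritten variant the same way makes it possible to compare the scanCount of the outer table (19 in the fast plan, 287385 in the slow one) instead of guessing from timings.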

For example, you could try option #1:

SELECT
    ... -- columns, FROM, and OUTER JOIN here
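  -- The outer parentheses are required: AND binds tighter than OR, so without
  -- them "tdo1.id IS NULL" would apply only to the second GUID.
  WHERE (   (tdo.TECDOC_GUID = 'GUID-F2F77CE5-D8F5-4286-9A30-8FD500F735F6'
             AND tdo.TECDOC_LOCALE = 'de')
         OR (tdo.TECDOC_GUID = 'GUID-41FD28DC-63C0-44D0-B8AE-0FCF7C78CEB0'
             AND tdo.TECDOC_LOCALE = 'de'))
    AND tdo1.id IS NULL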

Or you could split the query in two to make sure it uses the index, as shown in option #2:

SELECT
    ... -- columns, FROM, and OUTER JOIN here
  WHERE tdo.TECDOC_GUID = 'GUID-F2F77CE5-D8F5-4286-9A30-8FD500F735F6'
    AND tdo.TECDOC_LOCALE = 'de'
    AND tdo1.id IS NULL 
UNION ALL
SELECT
    ... -- columns, FROM, and OUTER JOIN here
  WHERE tdo.TECDOC_GUID = 'GUID-41FD28DC-63C0-44D0-B8AE-0FCF7C78CEB0'
    AND tdo.TECDOC_LOCALE = 'de'
    AND tdo1.id IS NULL 

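This way, the search uses only equality comparisons, which are easier for the SQL optimizer to handle. Note the use of UNION ALL: it is faster than UNION because it does not have to remove duplicate rows.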
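For reference, option #2 written out in full might look like the sketch below. The column list and the join are copied from the query in the question; nothing else is assumed.

```sql
-- First GUID: equality conditions only.
SELECT tdo.TECDOC_GUID AS guid, tdo.TECDOC_LOCALE AS locale,
       tdo.TECDOC_VERSION AS version, tdo.DATA AS data
FROM TECDOC_OBJECTS tdo
LEFT OUTER JOIN TECDOC_OBJECTS tdo1
  ON (tdo.TECDOC_GUID = tdo1.TECDOC_GUID
      AND tdo.TECDOC_LOCALE = tdo1.TECDOC_LOCALE
      AND tdo.TECDOC_VERSION < tdo1.TECDOC_VERSION)
WHERE tdo.TECDOC_GUID = 'GUID-F2F77CE5-D8F5-4286-9A30-8FD500F735F6'
  AND tdo.TECDOC_LOCALE = 'de'
  AND tdo1.id IS NULL
UNION ALL
-- Second GUID: same query with the other value.
SELECT tdo.TECDOC_GUID AS guid, tdo.TECDOC_LOCALE AS locale,
       tdo.TECDOC_VERSION AS version, tdo.DATA AS data
FROM TECDOC_OBJECTS tdo
LEFT OUTER JOIN TECDOC_OBJECTS tdo1
  ON (tdo.TECDOC_GUID = tdo1.TECDOC_GUID
      AND tdo.TECDOC_LOCALE = tdo1.TECDOC_LOCALE
      AND tdo.TECDOC_VERSION < tdo1.TECDOC_VERSION)
WHERE tdo.TECDOC_GUID = 'GUID-41FD28DC-63C0-44D0-B8AE-0FCF7C78CEB0'
  AND tdo.TECDOC_LOCALE = 'de'
  AND tdo1.id IS NULL;
```

Because the two branches filter on different GUID values, they cannot return the same row, so UNION ALL yields the same result as the original IN query. Prefixing the whole statement with EXPLAIN ANALYZE shows whether each branch now picks the intended index.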

