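performance - ClickHouse: index on a column with values 0 and 1
Problem description
I am trying to improve the performance of a query with a WHERE clause on a UInt8 column that contains only 0 or 1 as possible values. I tried to break the problem down to make sure that no other factors (partitioning, PK, ...) are causing the issue. I created a simple table, index_text, with just one column and a set of indexes like this: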
CREATE TABLE default.index_text (
    `columnX` UInt8,
    INDEX indexX1 columnX TYPE minmax GRANULARITY 1,
    INDEX indexX2 columnX TYPE set(0) GRANULARITY 1,
    INDEX indexX3 columnX TYPE set(1) GRANULARITY 1
) ENGINE = MergeTree()
ORDER BY tuple()
SETTINGS index_granularity = 8192
After that, I filled the table with about 25 million random values (0 or 1). I expected the indexes to drop granules for this query, but they do not:
SELECT COUNT(*) FROM index_text WHERE columnX = 0
SELECT COUNT(*)
FROM index_text
WHERE columnX = 0
[JWDebian] 2020.10.19 07:48:26.511085 [ 584 ] {af7615f0-f32b-47c5-87a2-e8acc8e27f5e} <Debug> executeQuery: (from [::1]:40088) SELECT COUNT(*) FROM index_text WHERE columnX = 0
[JWDebian] 2020.10.19 07:48:26.511384 [ 584 ] {af7615f0-f32b-47c5-87a2-e8acc8e27f5e} <Trace> ContextAccess (default): Access granted: SELECT(columnX) ON default.index_text
[JWDebian] 2020.10.19 07:48:26.511440 [ 584 ] {af7615f0-f32b-47c5-87a2-e8acc8e27f5e} <Debug> default.index_text (SelectExecutor): Key condition: unknown
[JWDebian] 2020.10.19 07:48:26.512611 [ 584 ] {af7615f0-f32b-47c5-87a2-e8acc8e27f5e} <Debug> default.index_text (SelectExecutor): Index `indexX1` has dropped 0 / 3050 granules.
[JWDebian] 2020.10.19 07:48:26.522601 [ 584 ] {af7615f0-f32b-47c5-87a2-e8acc8e27f5e} <Debug> default.index_text (SelectExecutor): Index `indexX2` has dropped 0 / 3050 granules.
[JWDebian] 2020.10.19 07:48:26.523699 [ 584 ] {af7615f0-f32b-47c5-87a2-e8acc8e27f5e} <Debug> default.index_text (SelectExecutor): Index `indexX3` has dropped 0 / 3050 granules.
[JWDebian] 2020.10.19 07:48:26.523722 [ 584 ] {af7615f0-f32b-47c5-87a2-e8acc8e27f5e} <Debug> default.index_text (SelectExecutor): Selected 1 parts by date, 1 parts by key, 3050 marks by primary key, 3050 marks to read from 1 ranges
[JWDebian] 2020.10.19 07:48:26.523764 [ 584 ] {af7615f0-f32b-47c5-87a2-e8acc8e27f5e} <Trace> default.index_text (SelectExecutor): Reading approx. 24985600 rows with 2 streams
[JWDebian] 2020.10.19 07:48:26.523823 [ 584 ] {af7615f0-f32b-47c5-87a2-e8acc8e27f5e} <Trace> InterpreterSelectQuery: FetchColumns -> Complete
[JWDebian] 2020.10.19 07:48:26.525061 [ 620 ] {af7615f0-f32b-47c5-87a2-e8acc8e27f5e} <Trace> AggregatingTransform: Aggregating
[JWDebian] 2020.10.19 07:48:26.525087 [ 620 ] {af7615f0-f32b-47c5-87a2-e8acc8e27f5e} <Trace> Aggregator: Aggregation method: without_key
[JWDebian] 2020.10.19 07:48:26.530850 [ 621 ] {af7615f0-f32b-47c5-87a2-e8acc8e27f5e} <Trace> AggregatingTransform: Aggregating
[JWDebian] 2020.10.19 07:48:26.530893 [ 621 ] {af7615f0-f32b-47c5-87a2-e8acc8e27f5e} <Trace> Aggregator: Aggregation method: without_key
[JWDebian] 2020.10.19 07:48:26.598438 [ 620 ] {af7615f0-f32b-47c5-87a2-e8acc8e27f5e} <Trace> AggregatingTransform: Aggregated. 6509826 to 1 rows (from 6.21 MiB) in 0.074525217 sec. (87350648.03635526 rows/sec., 83.30 MiB/sec.)
[JWDebian] 2020.10.19 07:48:26.598976 [ 621 ] {af7615f0-f32b-47c5-87a2-e8acc8e27f5e} <Trace> AggregatingTransform: Aggregated. 6109074 to 1 rows (from 5.83 MiB) in 0.075064427 sec. (81384408.62274216 rows/sec., 77.61 MiB/sec.)
[JWDebian] 2020.10.19 07:48:26.598994 [ 621 ] {af7615f0-f32b-47c5-87a2-e8acc8e27f5e} <Trace> Aggregator: Merging aggregated data
┌──COUNT()─┐
│ 12618900 │
└──────────┘
[JWDebian] 2020.10.19 07:48:26.599322 [ 584 ] {af7615f0-f32b-47c5-87a2-e8acc8e27f5e} <Information> executeQuery: Read 24979658 rows, 23.82 MiB in 0.088181578 sec., 283275243 rows/sec., 270.15 MiB/sec.
[JWDebian] 2020.10.19 07:48:26.599356 [ 584 ] {af7615f0-f32b-47c5-87a2-e8acc8e27f5e} <Debug> MemoryTracker: Peak memory usage (for query): 0.00 B.
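What am I doing wrong here? Am I misunderstanding the concept of an INDEX? Is the INDEX type or its parameters wrong? I am using ClickHouse server version 20.9.2 revision 54439, so I guess the allow_experimental_data_skipping_indices setting no longer matters. Out of desperation, I set it to 1 and ran OPTIMIZE TABLE index_text FINAL after filling the table, but the result is the same.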
Solution
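One likely explanation, offered as a sketch rather than a confirmed answer: data skipping indexes are evaluated per granule, so for `columnX = 0` a granule can only be dropped if none of its 8192 rows is 0. With random values, practically every granule contains both 0 and 1, so minmax stores [0, 1] and set stores {0, 1} for each of them, and nothing can be skipped. The Python simulation below is illustrative only: the function names are hypothetical, and it models the row layout, not ClickHouse itself.

```python
import random


def droppable_granules_random(num_granules, granule_size, seed=0):
    """Count granules containing no 0 when every row is a random 0 or 1."""
    rng = random.Random(seed)
    all_ones = (1 << granule_size) - 1  # bit pattern of a granule holding only 1s
    droppable = 0
    for _ in range(num_granules):
        # One random bit per row: the bit value stands for the column value.
        rows = rng.getrandbits(granule_size)
        if rows == all_ones:
            droppable += 1
    return droppable


def droppable_granules_sorted(num_granules, granule_size, zeros):
    """Count granules containing no 0 when rows are sorted and start with `zeros` 0s."""
    # Granule i covers rows [i * granule_size, (i + 1) * granule_size);
    # it holds no 0 only if it starts at or after the last 0.
    return sum(1 for i in range(num_granules) if i * granule_size >= zeros)


if __name__ == "__main__":
    granules, size = 3050, 8192  # figures taken from the log above
    print("random layout:", droppable_granules_random(granules, size))
    print("sorted layout:", droppable_granules_sorted(granules, size, zeros=granules * size // 2))
```

The second function models what would happen if equal values were stored next to each other, for example by sorting the table on columnX instead of using ORDER BY tuple(). Whether that is practical depends on the other columns and queries of the real table.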