2017-04-25

What is ClickHouse

ClickHouse is an open-source, column-oriented database management system (DBMS) for online analytical processing (OLAP).

In a "regular" row-oriented DBMS, data is stored in this order:

id |   name   | age
1  | Zhangsan | 18
2  | GlonHo   | 20
3  | Lisi     | 22
...| ...      | ...

In other words, all the values belonging to one row are stored next to each other. Row-oriented DBMSs include MySQL, Postgres and MS SQL Server.

In a column-oriented DBMS, the same data is stored like this:


id: 1 2 3 ...
name: Zhangsan GlonHo Lisi ...
age: 18 20 22 ... 

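Column-oriented DBMSs are better suited to OLAP scenarios (at least a 100x gain in processing speed for most queries) for the following reasons:

I/O:

  • An analytical query needs only a small number of a table's columns. A column-oriented DBMS can read just the data it needs. For example, if a query needs 5 columns out of 100, you can expect roughly a 20-fold reduction in I/O.
  • Data stored by column is easier to compress, which reduces I/O further.
  • With less I/O, more of the relevant data fits in the system cache.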
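
To make the I/O argument concrete, here is a minimal Python sketch that counts how many bytes a query must read under each layout. The function names and the fixed 8-byte value size are assumptions made for illustration; real engines also have compression, block granularity and indexes, none of which are modeled here.

```python
def bytes_read_row_store(n_rows, n_cols, value_size):
    """A row store reads whole rows, so every column is fetched."""
    return n_rows * n_cols * value_size


def bytes_read_column_store(n_rows, n_needed_cols, value_size):
    """A column store reads only the columns the query references."""
    return n_rows * n_needed_cols * value_size


# Assumed workload: 1,000,000 rows, 100 columns, the query needs 5 of them.
rows, total_cols, needed_cols, size = 1_000_000, 100, 5, 8
row_bytes = bytes_read_row_store(rows, total_cols, size)
col_bytes = bytes_read_column_store(rows, needed_cols, size)
reduction = row_bytes / col_bytes
print(f"row store: {row_bytes} bytes, column store: {col_bytes} bytes, {reduction:.0f}x less I/O")
```

With these numbers the ratio is simply 100 / 5, which is where the 20-fold figure comes from.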

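CPU:

Because executing a query means processing a large number of rows, it helps to dispatch every operation over an entire vector rather than over individual rows, or to implement the query engine so that dispatch costs almost nothing. Otherwise, with any half-decent disk subsystem, the query interpreter inevitably stalls the CPU. It therefore makes sense both to store data by column and to process it by column.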

There are two ways to do this:


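  1. A vector engine. All operations are written for vectors instead of separate values. This means operations do not need to be called very often, and dispatch costs are negligible.
  2. Code generation. The code generated for the query has all the indirect calls in it.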
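
The difference between per-value and per-vector dispatch can be sketched in plain Python. The tiny interpreter below is hypothetical (`DISPATCH`, `run_per_row` and `run_per_vector` are names invented for this example). It only counts how often an operation is looked up; it does not model SIMD or ClickHouse internals.

```python
import operator

# Hypothetical operation table: the "dispatch" step is a lookup by name.
DISPATCH = {"add": operator.add, "mul": operator.mul}


def run_per_row(op_name, left, right):
    """Look the operation up once per value, as a row-at-a-time interpreter would."""
    lookups = 0
    out = []
    for a, b in zip(left, right):
        op = DISPATCH[op_name]  # dispatch cost paid for every row
        lookups += 1
        out.append(op(a, b))
    return out, lookups


def run_per_vector(op_name, left, right):
    """Look the operation up once, then apply it to the whole column chunk."""
    op = DISPATCH[op_name]  # dispatch cost paid once per vector
    return [op(a, b) for a, b in zip(left, right)], 1


age = list(range(18, 1018))   # a 1000-value column chunk
bonus = [2] * len(age)
row_result, row_lookups = run_per_row("add", age, bonus)
vec_result, vec_lookups = run_per_vector("add", age, bonus)
print(row_lookups, vec_lookups)
```

Both functions return the same values; only the number of dispatch steps differs, and the gap grows with the length of the vector.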

Distinctive features of ClickHouse



Not every column-oriented DBMS compresses data; InfiniDB CE and MonetDB, for example, do not. Yet data compression really does improve performance.
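
One reason column layouts compress well is that similar values end up next to each other. The sketch below illustrates this with Python's zlib on synthetic data. Both the data and the choice of zlib are assumptions for the demonstration and say nothing about the codecs ClickHouse itself uses.

```python
import random
import struct
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 10_000
ids = [rng.getrandbits(32) for _ in range(n)]      # high-entropy column
countries = [rng.randrange(5) for _ in range(n)]   # low-cardinality column
ages = [rng.randrange(18, 61) for _ in range(n)]   # narrow-range column

# Row layout: the three values of each row are stored next to each other.
row_bytes = b"".join(
    struct.pack("<IBB", i, c, a) for i, c, a in zip(ids, countries, ages)
)

# Column layout: each column is stored (and compressed) on its own.
column_bytes = [
    struct.pack(f"<{n}I", *ids),
    bytes(countries),
    bytes(ages),
]

row_compressed = len(zlib.compress(row_bytes))
col_compressed = sum(len(zlib.compress(col)) for col in column_bytes)
print(f"raw: {len(row_bytes)} B, row layout: {row_compressed} B, "
      f"column layout: {col_compressed} B")
```

The random ids barely compress in either layout. The saving comes from the two low-entropy columns, which the compressor can model much better once they are no longer interleaved with the ids.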


Some column-oriented DBMSs can only work in RAM (random-access memory), such as SAP HANA and Google PowerDrill. For very large data volumes, however, RAM is too expensive.



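SQL-like support

ClickHouse supports a non-standard SQL dialect.
NULL and correlated subqueries are not supported. JOIN is supported, as are subqueries in the FROM, IN and JOIN clauses and scalar subqueries.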

Vector engine

Data is not only stored by columns but also processed by vectors (parts of columns), which gives high CPU efficiency.






This makes ClickHouse usable as the back end of a web system. Low latency means queries can be processed without delay.





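ClickHouse uses multi-master replication. After data is written to any available replica, it is distributed to the remaining replicas. The system maintains identical data on the different replicas. After a failure, data is restored automatically, or, in complex cases, at the push of a "button".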
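
As a rough mental model of "write to any replica, then distribute", here is a toy in-memory sketch. The `Replica` class and `connect` helper are hypothetical and deliberately naive (synchronous, no failures, no coordination service); this is not how ClickHouse implements replication.

```python
class Replica:
    """Hypothetical in-memory replica holding an ordered list of inserted blocks."""

    def __init__(self, name):
        self.name = name
        self.blocks = []
        self.peers = []

    def write(self, block):
        """Accept a write locally, then hand the block to every peer."""
        self.blocks.append(block)
        for peer in self.peers:
            peer.receive(block)

    def receive(self, block):
        """Store a block that came from another replica, skipping duplicates."""
        if block not in self.blocks:
            self.blocks.append(block)


def connect(replicas):
    """Make every replica aware of all the others."""
    for r in replicas:
        r.peers = [p for p in replicas if p is not r]


a, b, c = Replica("a"), Replica("b"), Replica("c")
connect([a, b, c])
a.write("block-1")  # any replica can take the write
c.write("block-2")
print(a.blocks, b.blocks, c.blocks)
```

After the two writes, all three replicas hold the same two blocks, which is the property the paragraph above describes.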



ClickHouse is not a cross-platform system. It requires Linux (Ubuntu 12.04 or newer) on the x86_64 architecture with the SSE 4.2 instruction set.

To check whether SSE 4.2 is supported:

grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"
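
The same check can be scripted, for example as a pre-install step. The Python sketch below is an illustration only: the helper names are made up, and it is slightly stricter than the grep above because it looks for `sse4_2` as a whole word on a `flags` line.

```python
def has_sse42(cpuinfo_text):
    """Return True if any 'flags' line in the given cpuinfo text lists sse4_2."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "sse4_2" in value.split():
            return True
    return False


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the kernel's CPU description; this file only exists on Linux."""
    with open(path, encoding="ascii", errors="replace") as f:
        return f.read()


# A shortened, made-up sample so the example also runs off Linux.
sample = "processor\t: 0\nflags\t\t: fpu sse sse2 ssse3 sse4_1 sse4_2 popcnt\n"
print("SSE 4.2 supported" if has_sse42(sample) else "SSE 4.2 not supported")
```

On a real machine, `has_sse42(read_cpuinfo())` would replace the sample string.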

Ubuntu is recommended. The terminal must use UTF-8 encoding (the default on Ubuntu).



Add the repository to /etc/apt/sources.list, or to a separate /etc/apt/sources.list.d/clickhouse.list file.

On Ubuntu Trusty (14.04):

deb http://repo.yandex.ru/clickhouse/trusty stable main

For other Ubuntu versions, replace trusty with xenial or precise.
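
If several machines need to be provisioned, the substitution can be scripted. This is a small sketch with a hypothetical helper, `repo_line`; it only knows the three codenames mentioned above and the repository URL shown in this post.

```python
SUPPORTED_CODENAMES = ("precise", "trusty", "xenial")  # the names the text mentions


def repo_line(codename):
    """Build the sources.list entry for one of the Ubuntu codenames above."""
    if codename not in SUPPORTED_CODENAMES:
        raise ValueError(f"no repository mentioned for {codename!r}")
    return f"deb http://repo.yandex.ru/clickhouse/{codename} stable main"


print(repo_line("xenial"))
```

Rejecting unknown codenames is a design choice here: a silently wrong repository line would only surface later as a failed apt-get update.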


sudo apt-key adv --keyserver keyserver.ubuntu.com --recv E0C56BD4    # optional
sudo apt-get update
sudo apt-get install clickhouse-client clickhouse-server-common





ClickHouse has access restriction settings. They are located in the users.xml file (next to config.xml).



On Linux, follow the instructions in build.md to build from source; on Mac OS X, follow build_osx.md.


Client src/dbms/src/Client/
Server src/dbms/src/Server/

For the server, create a data directory, for example:


Configure this path in the server config, then chown the directory to the desired user.



Docker image: https://hub.docker.com/r/yandex/clickhouse-server/

Gentoo: https://github.com/kmeaw/clickhouse-overlay



sudo service clickhouse-server start






clickhouse-server --config-file=/etc/clickhouse-server/config.xml

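In this case the log is printed to the console, which is convenient during development. If the configuration file is in the current directory (that is, the same directory as clickhouse-server), the --config-file parameter is not needed; ./config.xml is used by default.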



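clickhouse-client parameters

Parameter        | Description
--host, -h       | Name of the target server; defaults to localhost
--port           | Target port; defaults to 9000
--user, -u       | User to connect as; defaults to default
--password       | Password of the connecting user; defaults to an empty string
--query, -q      | Query to run in non-interactive mode
--database, -d   | Current database; defaults to the value from the configuration file (the default database unless changed)
--multiline, -m  | If specified, allow multi-line queries
--multiquery, -n | If specified, allow processing multiple queries separated by semicolons. Only works in non-interactive mode
--format, -f     | Use the specified default format to output the result
--vertical, -E   | If specified, output results in Vertical format by default; same as --format=Vertical. Each value is printed on its own line, which helps when displaying wide tables
--time, -t       | If specified, print the query execution time to stderr in non-interactive mode
--stacktrace     | If specified, also print the stack trace when an exception occurs
--config-file    | Name of a configuration file that holds additional settings or changed defaults for the settings listed above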

By default, the configuration file is looked up in the following locations, in this order:


  • ./clickhouse-client.xml
  • ~/.clickhouse-client/config.xml
  • /etc/clickhouse-client/config.xml
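
A "first file that exists wins" lookup over such a list can be sketched as follows. This is a minimal illustration that assumes the three locations are tried in the order listed; `find_client_config` is a hypothetical helper, not part of ClickHouse, and the demonstration uses throwaway files rather than the real paths.

```python
import os
import tempfile


def find_client_config(candidates):
    """Return the first candidate path that exists as a file, or None."""
    for path in candidates:
        expanded = os.path.expanduser(path)  # resolves a leading "~"
        if os.path.isfile(expanded):
            return expanded
    return None


DEFAULT_CANDIDATES = [
    "./clickhouse-client.xml",
    "~/.clickhouse-client/config.xml",
    "/etc/clickhouse-client/config.xml",
]

# Demonstration with temporary files instead of the real locations.
with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, "first.xml")
    second = os.path.join(tmp, "second.xml")
    open(second, "w").close()  # only the second candidate exists
    found = find_client_config([first, second])
    print(os.path.basename(found))
```

Calling `find_client_config(DEFAULT_CANDIDATES)` on a real machine would show which file, if any, such a lookup picks up.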



clickhouse-client --host=example.com

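You can also pass any setting that will be used for processing queries, for example clickhouse-client --max_threads=1, which limits the number of query processing threads to 1.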
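
When the client is driven from a script, it can help to assemble the argument list in one place. The sketch below uses only flags from the table above plus `--name=value` settings such as `--max_threads`. The `client_command` helper and its parameter names are assumptions for illustration, and nothing is executed here.

```python
import shlex


def client_command(query, host="localhost", port=9000, user="default",
                   output_format=None, settings=None):
    """Build a clickhouse-client argument list for one non-interactive query."""
    argv = [
        "clickhouse-client",
        f"--host={host}",
        f"--port={port}",
        f"--user={user}",
        f"--query={query}",
    ]
    if output_format is not None:
        argv.append(f"--format={output_format}")
    # Query settings such as max_threads are passed as --name=value.
    for name, value in (settings or {}).items():
        argv.append(f"--{name}={value}")
    return argv


argv = client_command("SELECT 1", host="example.com", settings={"max_threads": 1})
# Print a shell-safe rendering of the command for inspection.
print(" ".join(shlex.quote(a) for a in argv))
```

On a machine with the client installed, the list could be handed to `subprocess.run(argv)`. Passing a list rather than a single string avoids shell quoting problems in the query text.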


root@GlonHo:~# clickhouse-client
ClickHouse client version 1.1.54198.
Connecting to localhost:9000.
Connected to ClickHouse server version 1.1.54198.

:) select 1


│ 1 │

1 rows in set. Elapsed: 0.023 sec. 


Congratulations, it works!


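If you are a Yandex employee, you can use the Yandex.Metrica test data to explore the system's features and performance; instructions for using the test data can be found here. Otherwise, you can use one of the publicly available datasets, see here.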


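If you are a Yandex employee, you can use the internal ClickHouse mailing list; subscribe to it for announcements, news about new developments and questions from other users.

Otherwise, you can ask questions on Stack Overflow, join the discussion on Google Groups, or email the developers at clickhouse-feedback@yandex-team.com.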


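At present ClickHouse officially supports only Ubuntu, and there is no documentation for RedHat. Building and installing it on CentOS 6.9 was so troublesome that I gave up. I later found a package-based installation method for RedHat on the official Google Groups (it is also available directly on GitHub), but a matching version of one dependency was still missing. A quick search suggested the kernel might have to be recompiled, so I gave that up too. In the end I ran the experiment on Ubuntu 14.04 LTS.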


[root@GlonHo ~]# grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"
SSE 4.2 supported

[root@GlonHo ~]# yum-config-manager --add-repo http://repo.red-soft.biz/repos/clickhouse/repo/clickhouse-el6.repo
bash: yum-config-manager: command not found

[root@GlonHo ~]# yum -y install yum-utils 

[root@GlonHo ~]# yum-config-manager --add-repo http://repo.red-soft.biz/repos/clickhouse/repo/clickhouse-el6.repo
Loaded plugins: fastestmirror
adding repo from: http://repo.red-soft.biz/repos/clickhouse/repo/clickhouse-el6.repo
grabbing file http://repo.red-soft.biz/repos/clickhouse/repo/clickhouse-el6.repo to /etc/yum.repos.d/clickhouse-el6.repo
clickhouse-el6.repo                                                                                                                                  |  165 B     00:00     
repo saved to /etc/yum.repos.d/clickhouse-el6.repo

[root@GlonHo ~]# yum install clickhouse-server clickhouse-client clickhouse-server-common clickhouse-compressor -y
Loaded plugins: fastestmirror
Setting up Install Process
Loading mirror speeds from cached hostfile
 * base: mirrors.zju.edu.cn
 * epel: mirrors.tuna.tsinghua.edu.cn
 * extras: mirrors.zju.edu.cn
 * updates: mirrors.163.com
base                                                                                                                         | 3.7 kB     00:00     
clickhouse                                                                                                                   | 2.9 kB     00:02     
extras                                                                                                                       | 3.4 kB     00:00     
updates                                                                                                                      | 3.4 kB     00:00     
Package clickhouse-server-common-1.1.54198-3.el6.x86_64 already installed and latest version
Package clickhouse-compressor-1.1.54198-3.el6.x86_64 already installed and latest version
Resolving Dependencies
--> Running transaction check
---> Package clickhouse-client.x86_64 0:1.1.54198-3.el6 will be installed
---> Package clickhouse-server.x86_64 0:1.1.54198-3.el6 will be installed
--> Processing Dependency: libbfd- for package: clickhouse-server-1.1.54198-3.el6.x86_64
--> Finished Dependency Resolution
Error: Package: clickhouse-server-1.1.54198-3.el6.x86_64 (clickhouse)
           Requires: libbfd-
 You could try using --skip-broken to work around the problem
 You could try running: rpm -Va --nofiles --nodigest


root@GlonHo:~# grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"
SSE 4.2 supported

root@GlonHo:/etc/apt/sources.list.d# vim clickhouse.list
deb http://repo.yandex.ru/clickhouse/trusty stable main

root@GlonHo:~# apt-key adv --keyserver keyserver.ubuntu.com --recv E0C56BD4
Executing: gpg --ignore-time-conflict --no-options --no-default-keyring --homedir /tmp/tmp.IoJhY8ePkd --no-auto-check-trustdb --trust-model always --keyring /etc/apt/trusted.gpg --primary-keyring /etc/apt/trusted.gpg --keyserver keyserver.ubuntu.com --recv E0C56BD4
gpg: requesting key E0C56BD4 from hkp server keyserver.ubuntu.com
gpg: key E0C56BD4: public key "ClickHouse Repository Key <milovidov@yandex-team.ru>" imported
gpg: Total number processed: 1
gpg:               imported: 1  (RSA: 1)
root@GlonHo:~# apt-get update
Hit http://security.ubuntu.com trusty-security InRelease                     
Ign http://archive.ubuntu.com trusty InRelease                                 
Reading package lists... Done

root@GlonHo:~# apt-get install clickhouse-client clickhouse-server-common
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following packages were automatically installed and are no longer required:
  acl at-spi2-core colord dconf-gsettings-backend dconf-service fontconfig
  fontconfig-config fonts-dejavu-core hicolor-icon-theme libasound2
  libasound2-data libatk-bridge2.0-0 libatk1.0-0 libatk1.0-data libatspi2.0-0
  libavahi-client3 libavahi-common-data libavahi-common3 libcairo-gobject2
  libcairo2 libcanberra-gtk3-0 libcanberra-gtk3-module libcanberra0 libcolord1
  libcolorhug1 libcups2 libdatrie1 libdconf1 libdrm-intel1 libdrm-nouveau2
  libdrm-radeon1 libexif12 libfontconfig1 libfontenc1 libgd3
  libgdk-pixbuf2.0-0 libgdk-pixbuf2.0-common libgl1-mesa-dri libgl1-mesa-glx
  libglapi-mesa libgphoto2-6 libgphoto2-l10n libgphoto2-port10 libgraphite2-3
  libgtk-3-0 libgtk-3-bin libgtk-3-common libgudev-1.0-0 libgusb2
  libharfbuzz0b libice6 libieee1284-3 libjasper1 libjbig0 libjpeg-turbo8
  libjpeg8 liblcms2-2 libllvm3.4 libltdl7 libnotify-bin libnotify4 libogg0
  libpango-1.0-0 libpangocairo-1.0-0 libpangoft2-1.0-0 libpciaccess0
  libpixman-1-0 libsane libsane-common libsm6 libtdb1 libthai-data libthai0
  libtiff5 libtxc-dxtn-s2tc0 libv4l-0 libv4lconvert0 libvorbis0a
  libvorbisfile3 libvpx1 libwayland-client0 libwayland-cursor0 libx11-xcb1
  libxaw7 libxcb-dri2-0 libxcb-dri3-0 libxcb-glx0 libxcb-present0
  libxcb-render0 libxcb-shm0 libxcb-sync1 libxcomposite1 libxcursor1
  libxdamage1 libxfixes3 libxfont1 libxi6 libxinerama1 libxkbcommon0
  libxkbfile1 libxmu6 libxpm4 libxrandr2 libxrender1 libxshmfence1 libxt6
  libxtst6 libxxf86vm1 notification-daemon sound-theme-freedesktop x11-common
  x11-xkb-utils xfonts-base xfonts-encodings xfonts-utils xserver-common
Use 'apt-get autoremove' to remove them.
The following extra packages will be installed:
The following NEW packages will be installed:
  clickhouse-client clickhouse-server-base clickhouse-server-common
0 upgraded, 3 newly installed, 0 to remove and 0 not upgraded.
Need to get 198 MB of archives.
After this operation, 632 MB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 http://repo.yandex.ru/clickhouse/trusty/ stable/main clickhouse-server-base amd64 1.1.54198 [198 MB]
Get:2 http://repo.yandex.ru/clickhouse/trusty/ stable/main clickhouse-client amd64 1.1.54198 [2,674 B]
Get:3 http://repo.yandex.ru/clickhouse/trusty/ stable/main clickhouse-server-common amd64 1.1.54198 [7,578 B]
Fetched 198 MB in 18min 7s (182 kB/s)                                          
Selecting previously unselected package clickhouse-server-base.
(Reading database ... 62852 files and directories currently installed.)
Preparing to unpack .../clickhouse-server-base_1.1.54198_amd64.deb ...
Unpacking clickhouse-server-base (1.1.54198) ...
Selecting previously unselected package clickhouse-client.
Preparing to unpack .../clickhouse-client_1.1.54198_amd64.deb ...
Unpacking clickhouse-client (1.1.54198) ...
Selecting previously unselected package clickhouse-server-common.
Preparing to unpack .../clickhouse-server-common_1.1.54198_amd64.deb ...
Unpacking clickhouse-server-common (1.1.54198) ...
Processing triggers for ureadahead (0.100.0-16) ...
Setting up clickhouse-server-base (1.1.54198) ...
Processing triggers for ureadahead (0.100.0-16) ...
Setting up clickhouse-client (1.1.54198) ...
Setting up clickhouse-server-common (1.1.54198) ...
root@GlonHo:~# top
Tasks:  90 total,   2 running,  88 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.2 sy,  0.0 ni, 99.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:   2049856 total,  1147260 used,   902596 free,    16004 buffers
KiB Swap:        0 total,        0 used,        0 free.   960556 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                 
 1262 root      20   0  112712  34552   1112 S   0.0  1.7   0:00.00 ruby                                                                                                                    
 1196 root      20   0  182364  34428   2728 S   0.0  1.7   0:00.77 puppet                                                                                                                  
 2368 clickho+  20   0  241336  12524   3504 S   0.0  0.6   0:00.04 clickhouse-serv
 1973 root      20   0  107720   4216   3220 S   0.0  0.2   0:00.02 sshd      

root@GlonHo:~# ls /var/log/clickhouse-server/
clickhouse-server.err.log  clickhouse-server.log  stderr  stdout

root@GlonHo:~# cat /var/log/clickhouse-server/clickhouse-server.err.log 
2017.04.24 09:30:03.783101 [ 1 ] <Warning> ConfigProcessor: Include not found: networks
2017.04.24 09:30:03.783125 [ 1 ] <Warning> ConfigProcessor: Include not found: networks
2017.04.24 09:30:05.803298 [ 2 ] <Warning> ConfigProcessor: Include not found: clickhouse_remote_servers
2017.04.24 09:30:05.803353 [ 2 ] <Warning> ConfigProcessor: Include not found: clickhouse_compression

root@GlonHo:~# cat /var/log/clickhouse-server/clickhouse-server.log 
2017.04.24 09:30:03.708578 [ 1 ] <Information> : Starting daemon with revision 54198
2017.04.24 09:30:03.781176 [ 1 ] <Information> Application: starting up
2017.04.24 09:30:03.781650 [ 1 ] <Debug> Application: rlimit on number of file descriptors is 262144
2017.04.24 09:30:03.781664 [ 1 ] <Debug> Application: Initializing DateLUT.
2017.04.24 09:30:03.781670 [ 1 ] <Trace> Application: Initialized DateLUT with time zone `UTC'.
2017.04.24 09:30:03.782226 [ 1 ] <Debug> Application: Configuration parameter 'interserver_http_host' doesn't exist or exists and empty. Will use 'ubuntu' as replica host.
2017.04.24 09:30:03.782338 [ 1 ] <Debug> ConfigReloader: Loading config `/etc/clickhouse-server/users.xml'
2017.04.24 09:30:03.783093 [ 1 ] <Warning> ConfigProcessor: Include not found: networks
2017.04.24 09:30:03.783121 [ 1 ] <Warning> ConfigProcessor: Include not found: networks
2017.04.24 09:30:03.783472 [ 1 ] <Information> Application: Loading metadata.
2017.04.24 09:30:03.783610 [ 1 ] <Information> DatabaseOrdinary (default): Total 0 tables.
2017.04.24 09:30:03.783734 [ 1 ] <Debug> Application: Loaded metadata.
2017.04.24 09:30:03.783848 [ 1 ] <Information> DatabaseOrdinary (system): Total 0 tables.
2017.04.24 09:30:03.784376 [ 1 ] <Information> Application: Listening http://[::1]:8123
2017.04.24 09:30:03.784420 [ 1 ] <Information> Application: Listening tcp: [::1]:9000
2017.04.24 09:30:03.784448 [ 1 ] <Information> Application: Listening interserver: [::1]:9009
2017.04.24 09:30:03.784473 [ 1 ] <Information> Application: Listening
2017.04.24 09:30:03.784491 [ 1 ] <Information> Application: Listening tcp:
2017.04.24 09:30:03.784507 [ 1 ] <Information> Application: Listening interserver:
2017.04.24 09:30:03.784621 [ 1 ] <Information> Application: Ready for connections.
2017.04.24 09:30:05.801608 [ 2 ] <Debug> ConfigReloader: Loading config `/etc/clickhouse-server/config.xml'
2017.04.24 09:30:05.803274 [ 2 ] <Warning> ConfigProcessor: Include not found: clickhouse_remote_servers
2017.04.24 09:30:05.803348 [ 2 ] <Warning> ConfigProcessor: Include not found: clickhouse_compression

root@GlonHo:~# cat /var/log/clickhouse-server/stderr 
Should logs to /var/log/clickhouse-server/clickhouse-server.log
Should error logs to /var/log/clickhouse-server/clickhouse-server.err.log

root@GlonHo:~# clickhouse-client
ClickHouse client version 1.1.54198.
Connecting to localhost:9000.
Connected to ClickHouse server version 1.1.54198.

:) select 1


│ 1 │

1 rows in set. Elapsed: 0.023 sec. 

:) select now()

SELECT now()

│ 2017-04-24 09:37:31 │

1 rows in set. Elapsed: 0.005 sec. 

:) Bye. (press CTRL + D to exit the client)

root@GlonHo:~# service clickhouse-server stop
Stop clickhouse-server service: DONE


root@GlonHo:~# tail -f /var/log/clickhouse-server/clickhouse-server.log
2017.04.24 09:36:59.286669 [ 3 ] <Trace> TCPConnectionFactory: TCP Request. Address:
2017.04.24 09:36:59.288258 [ 3 ] <Debug> TCPHandler: Connected ClickHouse client version 1.1.54198, user: default.
2017.04.24 09:37:15.669268 [ 3 ] <Debug> executeQuery: (from select 1
2017.04.24 09:37:15.678877 [ 3 ] <Trace> InterpreterSelectQuery: FetchColumns -> Complete
2017.04.24 09:37:15.679000 [ 3 ] <Debug> executeQuery: Query pipeline:

2017.04.24 09:37:15.679459 [ 3 ] <Information> executeQuery: Read 1 rows, 1.00 B in 0.010 sec., 98 rows/sec., 98.89 B/sec.
2017.04.24 09:37:15.679521 [ 3 ] <Debug> MemoryTracker: Peak memory usage (for query): 1.00 MiB.
2017.04.24 09:37:15.679541 [ 3 ] <Debug> MemoryTracker: Peak memory usage (for user): 1.00 MiB.
2017.04.24 09:37:15.679548 [ 3 ] <Debug> MemoryTracker: Peak memory usage (total): 1.00 MiB.
2017.04.24 09:37:15.679559 [ 3 ] <Information> TCPHandler: Processed in 0.011 sec.
2017.04.24 09:37:31.497405 [ 3 ] <Debug> executeQuery: (from select now()
2017.04.24 09:37:31.497653 [ 3 ] <Trace> InterpreterSelectQuery: FetchColumns -> Complete
2017.04.24 09:37:31.497976 [ 3 ] <Debug> executeQuery: Query pipeline:

2017.04.24 09:37:31.500776 [ 3 ] <Information> executeQuery: Read 1 rows, 1.00 B in 0.003 sec., 313 rows/sec., 313.78 B/sec.
2017.04.24 09:37:31.500856 [ 3 ] <Debug> MemoryTracker: Peak memory usage (for query): 1.00 MiB.
2017.04.24 09:37:31.500872 [ 3 ] <Debug> MemoryTracker: Peak memory usage (for user): 1.00 MiB.
2017.04.24 09:37:31.500880 [ 3 ] <Debug> MemoryTracker: Peak memory usage (total): 1.00 MiB.
2017.04.24 09:37:31.500893 [ 3 ] <Information> TCPHandler: Processed in 0.004 sec.

2017.04.24 10:04:11.313863 [ 3 ] <Information> TCPHandler: Done processing connection.
2017.04.24 10:04:36.359834 [ 4 ] <Information> Application: Received termination signal (Terminated)
2017.04.24 10:04:36.359978 [ 1 ] <Debug> Application: Received termination signal.
2017.04.24 10:04:36.360021 [ 1 ] <Debug> Application: Waiting for current connections to close.
2017.04.24 10:04:37.000043 [ 1 ] <Debug> Application: Closed all connections.
2017.04.24 10:04:37.004456 [ 1 ] <Information> Application: Shutting down storages.
2017.04.24 10:04:37.005499 [ 1 ] <Debug> Application: Shutted down storages.
2017.04.24 10:04:37.010125 [ 1 ] <Debug> Application: Destroyed global context.
2017.04.24 10:04:37.010483 [ 1 ] <Information> Application: shutting down
2017.04.24 10:04:37.011206 [ 1 ] <Debug> Application: Uninitializing subsystem: Logging Subsystem
2017.04.24 10:04:37.011572 [ 4 ] <Information> BaseDaemon: Stop SignalListener thread



