Written By: Xinyao Tian
[Original article. Please credit the author and source when reposting.]
This document describes how to configure and deploy a ClickHouse cluster with 3 shards and 2 replicas on three physical hosts.
Because this plan involves running multiple ClickHouse instances on a single physical node, the directory permissions, process resources, and runtime environments of the instances must all be separated, which makes the setup fairly involved;
this standalone document therefore records the detailed steps.
In theory, a 3-shard / 2-replica ClickHouse cluster built on three physical nodes can fully parallelize work across the compute resources of all 3 nodes while tolerating the outage of at most 1 host;
correspondingly, because each shard is configured with 2 replicas for fault tolerance, the plan consumes twice the storage.
According to the official ClickHouse documentation, the concepts of Replica, Shard, and ClickHouse Keeper are defined as follows.
Replica: A copy of data. ClickHouse always has at least one copy of your data, and so the minimum number of replicas is one. This is an important detail, you may not be used to counting the original copy of your data as a replica, but that is the term used in ClickHouse code and documentation. Adding a second replica of your data provides fault tolerance.
Shard: A subset of data. ClickHouse always has at least one shard for your data, so if you do not split the data across multiple servers, your data will be stored in one shard. Sharding data across multiple servers can be used to divide the load if you exceed the capacity of a single server. The destination server is determined by the sharding key, and is defined when you create the distributed table. The sharding key can be random or as an output of a hash function. The deployment examples involving sharding will use rand() as the sharding key, and will provide further information on when and how to choose a different sharding key.
ClickHouse Keeper: ClickHouse Keeper provides the coordination system for data replication and distributed DDL queries execution. ClickHouse Keeper is compatible with Apache ZooKeeper.
Note: to keep the focus clear, this document records only the key configuration items. Other routine ClickHouse settings (e.g. the ZooKeeper-related configuration) are not recorded here, but they still need to be configured carefully.
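For completeness, the ZooKeeper section that every instance's configuration file needs would look roughly like the sketch below; the hosts zk-1 / zk-2 / zk-3 are placeholders for this deployment's actual ZooKeeper ensemble, which these notes do not record:

<zookeeper>
    <node>
        <host>zk-1</host>
        <port>2181</port>
    </node>
    <node>
        <host>zk-2</host>
        <port>2181</port>
    </node>
    <node>
        <host>zk-3</host>
        <port>2181</port>
    </node>
</zookeeper>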
Because we need to run multiple ClickHouse instances on the same physical node, each instance needs its own configuration file.
Here we copy ClickHouse's default configuration file, config.xml, and create an additional configuration file, config-9100.xml.
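The copy itself can be made along these lines (a sketch assuming the default /etc/clickhouse-server layout):

cp /etc/clickhouse-server/config.xml /etc/clickhouse-server/config-9100.xml
chown clickhouse:clickhouse /etc/clickhouse-server/config-9100.xml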
[root@p0-lpsm-rf1 clickhouse-server]# ls -l | grep config
-rw-r--r-- 1 clickhouse clickhouse 58471 Jun 27 15:51 config-9100.xml
drwxr-xr-x 2 clickhouse clickhouse 24 Jun 19 17:36 config.d
-rw-r--r-- 1 clickhouse clickhouse 59784 Jun 27 15:41 config.xml
The ClickHouse "3-shard, 2-replica" deployment mode enables fault tolerance while keeping queries efficient: with the configuration below, the service keeps running even if at most 1 of the 3 hosts goes down.
The shard and replica configuration is shown below. Note that the port numbers of the replicas on the same host must not conflict, otherwise ClickHouse will report errors.
This is a cluster-level setting, so it is identical in the configuration files of all processes. Configure it in both config.xml and config-9100.xml:
<remote_servers>
    <!-- 3 shards, each with 2 replicas spread across hosts.
         Tag names reconstructed following the standard ClickHouse remote_servers schema;
         the leading 1/2 values per replica are taken to be replica priorities. -->
    <production_cluster_3s2r>
        <shard>
            <weight>1</weight>
            <internal_replication>false</internal_replication>
            <replica>
                <priority>1</priority>
                <host>p0-lpsm-rf1</host>
                <port>9000</port>
            </replica>
            <replica>
                <priority>2</priority>
                <host>p0-lpsm-rf2</host>
                <port>9100</port>
            </replica>
        </shard>
        <shard>
            <weight>1</weight>
            <internal_replication>false</internal_replication>
            <replica>
                <priority>1</priority>
                <host>p0-lpsm-rf2</host>
                <port>9000</port>
            </replica>
            <replica>
                <priority>2</priority>
                <host>p0-lpsm-rf3</host>
                <port>9100</port>
            </replica>
        </shard>
        <shard>
            <weight>1</weight>
            <internal_replication>false</internal_replication>
            <replica>
                <priority>1</priority>
                <host>p0-lpsm-rf3</host>
                <port>9000</port>
            </replica>
            <replica>
                <priority>2</priority>
                <host>p0-lpsm-rf1</host>
                <port>9100</port>
            </replica>
        </shard>
    </production_cluster_3s2r>
</remote_servers>
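Once the instances are up later on, this layout can be cross-checked from any instance against the system.clusters system table:

SELECT cluster, shard_num, replica_num, host_name, port
FROM system.clusters
WHERE cluster = 'production_cluster_3s2r';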
Configure the first ClickHouse instance (config.xml) to use the following ports (9000 / 8123 / 9009):
<http_port>8123</http_port>
<tcp_port>9000</tcp_port>
<interserver_http_port>9009</interserver_http_port>
Configure the second ClickHouse instance (config-9100.xml) to use the following ports (9100 / 8124 / 9010):
<http_port>8124</http_port>
<tcp_port>9100</tcp_port>
<interserver_http_port>9010</interserver_http_port>
Create the underlying ClickHouse data directory, so that the storage of the two instances is isolated:
[root@p0-lpsm-rf1 clickhouse]# mkdir /data/clickhouse-9100
[root@p0-lpsm-rf1 clickhouse]# chown -R clickhouse:clickhouse /data/clickhouse-9100
Create the location for the ClickHouse log files:
[root@p0-lpsm-rf1 clickhouse-server]# mkdir /data/clickhouse-9100/logs
[root@p0-lpsm-rf1 clickhouse-server]# chown -R clickhouse:clickhouse /data/clickhouse-9100/logs/
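The notes above only show the commands for the 9100 instance; if the first instance's directories under /data/clickhouse-9000 do not exist yet, they would be created the same way (a hypothetical mirror of the commands above):

mkdir -p /data/clickhouse-9000/logs
chown -R clickhouse:clickhouse /data/clickhouse-9000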
Modify the data directory locations in the first instance's configuration file (config.xml):
<path>/data/clickhouse-9000</path>
<tmp_path>/data/clickhouse-9000/tmp/</tmp_path>
<user_files_path>/data/clickhouse-9000/user_files/</user_files_path>
Modify the log locations in config.xml:
<logger>
    <level>information</level>
    <log>/data/clickhouse-9000/logs/clickhouse-server.log</log>
    <errorlog>/data/clickhouse-9000/logs/clickhouse-server.err.log</errorlog>
    <size>100M</size>
    <count>10</count>
</logger>
Likewise, modify the data directory locations in the second instance's configuration file (config-9100.xml):
<path>/data/clickhouse-9100</path>
<tmp_path>/data/clickhouse-9100/tmp/</tmp_path>
<user_files_path>/data/clickhouse-9100/user_files/</user_files_path>
Modify the log locations in config-9100.xml:
<logger>
    <level>information</level>
    <log>/data/clickhouse-9100/logs/clickhouse-server.log</log>
    <errorlog>/data/clickhouse-9100/logs/clickhouse-server.err.log</errorlog>
    <size>100M</size>
    <count>10</count>
</logger>
The macros setting is different for every one of the 6 instances across the 3 nodes, so it must be configured carefully; a mistake here will mix data between replicas and cause serious failures.
(The per-host labels below follow the remote_servers layout above; tag names follow the standard ClickHouse macros schema.)

On p0-lpsm-rf1, in config.xml (the 9000 instance):

<macros>
    <shard>01</shard>
    <replica>01</replica>
    <cluster>production_cluster_3s2r</cluster>
</macros>

On p0-lpsm-rf1, in config-9100.xml (the 9100 instance):

<macros>
    <shard>03</shard>
    <replica>02</replica>
    <cluster>production_cluster_3s2r</cluster>
</macros>

On p0-lpsm-rf2, in config.xml (the 9000 instance):

<macros>
    <shard>02</shard>
    <replica>01</replica>
    <cluster>production_cluster_3s2r</cluster>
</macros>

On p0-lpsm-rf2, in config-9100.xml (the 9100 instance):

<macros>
    <shard>01</shard>
    <replica>02</replica>
    <cluster>production_cluster_3s2r</cluster>
</macros>

On p0-lpsm-rf3, in config.xml (the 9000 instance):

<macros>
    <shard>03</shard>
    <replica>01</replica>
    <cluster>production_cluster_3s2r</cluster>
</macros>

On p0-lpsm-rf3, in config-9100.xml (the 9100 instance):

<macros>
    <shard>02</shard>
    <replica>02</replica>
    <cluster>production_cluster_3s2r</cluster>
</macros>
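These macros are substituted into the ZooKeeper paths of replicated tables. For illustration only (this statement is not a step in this setup), a table that references them explicitly would be declared like this:

CREATE TABLE db_test.example ON CLUSTER production_cluster_3s2r
(
    `id` UInt64
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/db_test/example', '{replica}')
ORDER BY id

The bare ReplicatedMergeTree used later in this document relies instead on the server's default path template, which also expands {shard} and {replica}.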
At this point, the configuration files for all 6 ClickHouse instances are essentially complete.
Because we will start 2 ClickHouse instances on each host, the startup commands also need to be kept separate.
Enter the systemd unit directory and create a new service unit by copying the existing one:

cp /etc/systemd/system/clickhouse-server.service /etc/systemd/system/clickhouse-server-9100.service
[root@p0-lpsm-rf2 system]# pwd
/etc/systemd/system
[root@p0-lpsm-rf2 system]# ls -l | grep clickhouse-server
-rw-r--r-- 1 root root 965 Jun 27 15:06 clickhouse-server-9100.service
-rw-r--r-- 1 root root 950 Jun 19 15:15 clickhouse-server.service
Edit the contents of the new unit file /etc/systemd/system/clickhouse-server-9100.service. Note that the ExecStart and EnvironmentFile entries have been changed to point at the 9100-specific configuration.
[root@p0-lpsm-rf2 system]# cat /etc/systemd/system/clickhouse-server-9100.service
[Unit]
Description=ClickHouse Server (analytic DBMS for big data)
Requires=network-online.target
# NOTE: that After/Wants=time-sync.target is not enough, you need to ensure
# that the time was adjusted already, if you use systemd-timesyncd you are
# safe, but if you use ntp or some other daemon, you should configure it
# additionaly.
After=time-sync.target network-online.target
Wants=time-sync.target
[Service]
Type=simple
User=clickhouse
Group=clickhouse
Restart=always
RestartSec=30
RuntimeDirectory=clickhouse-server
ExecStart=/usr/bin/clickhouse-server --config=/etc/clickhouse-server/config-9100.xml --pid-file=/run/clickhouse-server/clickhouse-server-9100.pid
# Minus means that this file is optional.
EnvironmentFile=-/etc/default/clickhouse-9100
LimitCORE=infinity
LimitNOFILE=500000
CapabilityBoundingSet=CAP_NET_ADMIN CAP_IPC_LOCK CAP_SYS_NICE
[Install]
# ClickHouse should not start from the rescue shell (rescue.target).
WantedBy=multi-user.target
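After creating the new unit file, reload systemd so that it picks the unit up, and optionally enable it to start on boot (standard systemd commands):

sudo systemctl daemon-reload
sudo systemctl enable clickhouse-server-9100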
Start the two ClickHouse instances as separate services:
# Start the ClickHouse 9000 instance (stop / start / restart as needed)
sudo systemctl stop clickhouse-server
sudo systemctl start clickhouse-server
sudo systemctl restart clickhouse-server
# Start the ClickHouse 9100 instance (stop / start / restart as needed)
sudo systemctl stop clickhouse-server-9100
sudo systemctl start clickhouse-server-9100
sudo systemctl restart clickhouse-server-9100
Check that the processes of both instances are running:
[root@p0-lpsm-rf1 clickhouse-server]# ps -ef | grep clickhouse
clickho+ 58120 1 0 15:43 ? 00:00:00 clickhouse-watchdog --config=/etc/clickhouse-server/config.xml --pid-file=/run/clickhouse-server/clickhouse-server.pid
clickho+ 58123 58120 2 15:43 ? 00:01:32 /usr/bin/clickhouse-server --config=/etc/clickhouse-server/config.xml --pid-file=/run/clickhouse-server/clickhouse-server.pid
clickho+ 64780 1 0 15:52 ? 00:00:00 clickhouse-watchdog --config=/etc/clickhouse-server/config-9100.xml --pid-file=/run/clickhouse-server/clickhouse-server-9100.pid
clickho+ 64783 64780 1 15:52 ? 00:00:50 /usr/bin/clickhouse-server --config=/etc/clickhouse-server/config-9100.xml --pid-file=/run/clickhouse-server/clickhouse-server-9100.pid
root 144479 7356 0 16:45 pts/0 00:00:00 grep --color=auto clickhouse
Check that the ports are listening as expected:
[root@p0-lpsm-rf1 clickhouse-server]# netstat -ntlp | grep clickhouse
tcp 0 0 0.0.0.0:9009 0.0.0.0:* LISTEN 58123/clickhouse-se
tcp 0 0 127.0.0.1:9010 0.0.0.0:* LISTEN 64783/clickhouse-se
tcp 0 0 0.0.0.0:8123 0.0.0.0:* LISTEN 58123/clickhouse-se
tcp 0 0 127.0.0.1:8124 0.0.0.0:* LISTEN 64783/clickhouse-se
tcp 0 0 0.0.0.0:9000 0.0.0.0:* LISTEN 58123/clickhouse-se
tcp 0 0 127.0.0.1:9100 0.0.0.0:* LISTEN 64783/clickhouse-se
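As an optional smoke test, each instance can also be queried directly with clickhouse-client:

clickhouse-client --port=9000 --query "SELECT 1"
clickhouse-client --port=9100 --query "SELECT 1"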
P.S. If problems occur during startup, check the log files /data/clickhouse-9000/logs/clickhouse-server.err.log and /data/clickhouse-9100/logs/clickhouse-server.err.log.
Finally, verify the cluster's functionality, following the official tutorial "Verify ClickHouse cluster functionality":
CREATE DATABASE db_test ON CLUSTER production_cluster_3s2r
DROP TABLE IF EXISTS db_test.test_table01 ON CLUSTER production_cluster_3s2r
CREATE TABLE db_test.test_table01 ON CLUSTER production_cluster_3s2r
(
`id` UInt64,
`column1` String
)
ENGINE = ReplicatedMergeTree
ORDER BY id
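The tutorial below writes to the replicated table directly, which exercises a single shard and its replica. To spread data over all 3 shards, one would additionally create a Distributed table on top; this is not part of the original steps, but a sketch (using rand() as the sharding key, as the documentation excerpt above suggests) would look like:

CREATE TABLE db_test.test_table01_dist ON CLUSTER production_cluster_3s2r
AS db_test.test_table01
ENGINE = Distributed(production_cluster_3s2r, db_test, test_table01, rand())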
On p0-lpsm-rf1:

INSERT INTO db_test.test_table01 (id, column1) VALUES (1, 'abc');

On p0-lpsm-rf2:

SELECT * FROM db_test.test_table01

On p0-lpsm-rf1:

INSERT INTO db_test.test_table01 (id, column1) VALUES (2, 'def');
Stop one ClickHouse server node by running an operating system command similar to the command used to start the node. If you used systemctl start to start the node, then use systemctl stop to stop it.
Insert more data on the running node:
INSERT INTO db_test.test_table01 (id, column1) VALUES (3, 'ghi');
SELECT * FROM db_test.test_table01
As the sessions below show, after inserting data on p0-lpsm-rf2:9000, the inserted rows can be read directly by logging in to p0-lpsm-rf3:9100, because that instance is configured as the replica of the same shard; this provides a good fault-tolerance mechanism.
Log in to p0-lpsm-rf2:9000 and insert data:
[root@p0-lpsm-rf2 clickhouse-server]# clickhouse-client --port=9000
ClickHouse client version 22.3.2.1.
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 22.3.2 revision 54455.
p0-lpsm-rf2 :) INSERT INTO db_test.test_table01 (id, column1) VALUES (1, 'abc');
INSERT INTO db_test.test_table01 (id, column1) FORMAT Values
Query id: c731d404-ff01-43eb-912e-e2671e8dbafb
Ok.
1 rows in set. Elapsed: 0.022 sec.
p0-lpsm-rf2 :) SELECT * FROM db_test.test_table01
SELECT *
FROM db_test.test_table01
Query id: b74b8f78-5b97-4eb9-956d-27f6fc556f3d
┌─id─┬─column1─┐
│ 1 │ abc │
└────┴─────────┘
Log in to p0-lpsm-rf3:9100 and query the data. Because these two instances are replicas of the same shard, the data has been replicated automatically, and the query returns the row directly:
[root@p0-lpsm-rf3 clickhouse-server]# clickhouse-client --port=9100
ClickHouse client version 22.3.2.1.
Connecting to localhost:9100 as user default.
Connected to ClickHouse server version 22.3.2 revision 54455.
p0-lpsm-rf3 :) SELECT * FROM db_test.test_table01
SELECT *
FROM db_test.test_table01
Query id: 1fbf3994-b4fe-44d7-99f6-26aa2ead5e5a
┌─id─┬─column1─┐
│ 1 │ abc │
└────┴─────────┘
1 rows in set. Elapsed: 0.001 sec.
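Replication state can also be inspected on any instance through the system.replicas system table, for example:

SELECT database, table, is_leader, total_replicas, active_replicas
FROM system.replicas
WHERE table = 'test_table01';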
This document has described how to configure and deploy a 3-shard, 2-replica ClickHouse cluster on three physical hosts.
Building a ClickHouse cluster this way fully parallelizes work across the compute power of the 3 machines while replicating every shard across hosts, providing the technical foundation for ClickHouse high availability and fault recovery.