本文详细描述了如何搭建 Clickhouse 单机版及 Clickhouse 集群。
本小节简述了如何搭建 Clickhouse 的单机版。该部分内容也会成为后续搭建 Clickhouse 集群的先决条件,请务必认真对待。
将下载好的 clickhouse 安装包利用命令分发至待安装主机上。安装包可以到 Clickhouse 官网 下载至本地。
[hadoop@p0-tklcdh-clickhouse01 software]$ pwd
/home/hadoop/software
[hadoop@p0-tklcdh-clickhouse01 software]$ ls -l
total 1112624
-rw-r--r-- 1 hadoop hadoop 39866 Sep 28 13:57 clickhouse-client-22.3.2.2.tgz
-rw-r--r-- 1 hadoop hadoop 179545907 Sep 28 13:57 clickhouse-common-static-22.3.2.2.tgz
-rw-r--r-- 1 hadoop hadoop 774024509 Sep 28 13:57 clickhouse-common-static-dbg-22.3.2.2.tgz
-rw-r--r-- 1 hadoop hadoop 61901 Sep 28 13:57 clickhouse-server-22.3.2.2.tgz
-rw-r--r-- 1 hadoop hadoop 185646832 Sep 28 11:40 jdk-8u181-linux-x64.tar.gz
[hadoop@p0-tklcdh-clickhouse01 software]$ scp clickhouse-* [email protected]:`pwd`
[hadoop@p0-tklcdh-clickhouse01 software]$ scp clickhouse-* [email protected]:`pwd
在 /homehadoop/software
中创建一个新脚本 sudo vim /homehadoop/software/install.sh
并将下面的内容拷贝至脚本。
也可以参考官方网站的 Installation Guide 进行安装。
#!/bin/bash
# Remember to switch Linux user to root
sudo mkdir /home/hadoop/clickhouse
sudo cp /home/hadoop/software/clickhouse-* /home/hadoop/clickhouse
cd /home/hadoop/clickhouse
tar -xzvf "clickhouse-common-static-22.3.2.2.tgz"
tar -xzvf "clickhouse-common-static-dbg-22.3.2.2.tgz"
tar -xzvf "clickhouse-server-22.3.2.2.tgz"
tar -xzvf "clickhouse-client-22.3.2.2.tgz"
rm -f /home/hadoop/clickhouse/*.tgz
sudo "clickhouse-common-static-22.3.2.2/install/doinst.sh"
sudo "clickhouse-common-static-dbg-22.3.2.2/install/doinst.sh"
sudo "clickhouse-server-22.3.2.2/install/doinst.sh"
sudo "clickhouse-client-22.3.2.2/install/doinst.sh"
# Remember to generate a openssl certification
sudo openssl req -subj "/CN=localhost" -new -newkey rsa:2048 -days 3650 -nodes -x509 -keyout /etc/clickhouse-servee-server/server.crt
# Set config to ensure only IPv4 is usable
sudo echo "0.0.0.0 " > /etc/clickhouse-server/config.d/listen.xml
拷贝完毕后,更改脚本权限 chmod 777 sudo vim /homehadoop/software/install.sh
并执行脚本 /homehadoop/software/install.sh
。根据提示输入相应的指令并完成安装。
使用如下命令启动 clickhouse server:
sudo clickhouse start
在 server 成功启动后,使用 clickhouse-client 配合在安装时指定的 password 进行登录
clickhouse-client --password
如观测到如下输出则可表明 Clickhouse 在单机上运行成功
[root@p0-tklcdh-clickhouse03 software]# clickhouse-client --password
ClickHouse client version 22.3.2.1.
Password for user (default):
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 22.3.2 revision 54455.
p0-tklcdh-clickhouse01 :) show databases;
SHOW DATABASES
Query id: d9a6c3f9-1aa8-42fd-92e4-e29de6d32dd4
┌─name───────────────┐
│ INFORMATION_SCHEMA │
│ default │
│ information_schema │
│ system │
└────────────────────┘
4 rows in set. Elapsed: 0.002 sec.
p0-tklcdh-clickhouse03 :) exit
Bye.
本小节在上一个小节的基础上配合 Zookeeper 开始搭建一个 Clickhouse 集群。
/etc/hosts
中各个 Clickhouse 节点和 Zookeeper 节点的 hostname-IP 映射全部配置完毕,并每台机器可以相互 ping 通,相应端口不要被占用/etc/clickhouse-server/config.xml
完成配置后,将配置文件全部替换至各个 Clickhouse 节点的相应位置Clickhouse 官方文档给出了配置分布式集群的方法。详见 Clickhouse Clusters
为了能够使 Clickhouse 以集群的方式顺利运行,需要在 /etc/clickhouse-server/config.xml
配置文件的相应位置替换如下配置项:
remote_servers
的相关配置。特别注意,配置 remote_servers
时需要确保每台节点上的 /etc/hosts
中都有相应的域名-IP映射关系,否则节点之间无法互通。 <remote_servers>
<example_cluster>
<shard>
<weight>1weight>
<internal_replication>falseinternal_replication>
<replica>
<host>p0-tklcdh-clickhouse01host>
<port>9000port>
replica>
shard>
<shard>
<weight>1weight>
<internal_replication>falseinternal_replication>
<replica>
<host>p0-tklcdh-clickhouse02host>
<port>9000port>
replica>
shard>
<shard>
<weight>1weight>
<internal_replication>falseinternal_replication>
<replica>
<host>p0-tklcdh-clickhouse03host>
<port>9000port>
replica>
shard>
example_cluster>
remote_servers>
zookeeper
的相关配置。特别注意,配置 zookeeper
时需要确保每台节点上的 /etc/hosts
中都有相应的域名-IP映射关系,否则该节点无法找到 Zookeeper 服务,也就无法获取从其他节点传来的任务指令。详见 Data Replication 和 Clickhouse 官方 Zookeeper 配置指南 <zookeeper>
<node>
<host>p0-tklfbirp-cdh-zk01host>
<port>2181port>
node>
<node>
<host>p0-tklfbirp-cdh-zk02host>
<port>2181port>
node>
<node>
<host>p0-tklfbirp-cdh-zk03host>
<port>2181port>
node>
<node>
<host>p0-tklfbirp-cdh-zk04host>
<port>2181port>
node>
<node>
<host>p0-tklfbirp-cdh-zk05host>
<port>2181port>
node>
zookeeper>
macros
的相关配置。具体作用请见官方文档说明 macros 和 Replicated*MergeTree parameters <macros>
<shard>01shard>
<replica>p0-tklcdh-clickhouse01replica>
macros>
<path>/data02/clickhouse/path>
<tmp_path>/data02/clickhouse/tmp/tmp_path>
<user_files_path>/data02/clickhouse/user_files/user_files_path>
使用 scp 命令或将添加完集群配置的配置文件 /etc/clickhouse-server/config.xml
手动拷贝至所有 Clickhouse 节点的相应位置。
使用如下命令在各个 Clickhouse 节点上重新启动修改完配置文件的 clickhouse server:
sudo clickhouse restart
在 Clickhouse Cmd 中使用相关命令查看元数据的相应信息。
p0-tklcdh-clickhouse01 :) select * from system.clusters
SELECT *
FROM system.clusters
Query id: 8fac3aac-c543-4fae-97b2-58b278089675
┌─cluster─────────┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name──────────────┬─host_address──┬─port─┬─is_local─┬─user────┬─default_database─┬─errors_count─┬─slowdowns_count─┬─estimated_recovery_time─┐
│ example_cluster │ 1 │ 1 │ 1 │ p0-tklcdh-clickhouse01 │ 10.129.81.113 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ example_cluster │ 2 │ 1 │ 1 │ p0-tklcdh-clickhouse02 │ 10.129.81.114 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ example_cluster │ 3 │ 1 │ 1 │ p0-tklcdh-clickhouse03 │ 10.129.81.115 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
└─────────────────┴───────────┴──────────────┴─────────────┴────────────────────────┴───────────────┴──────┴──────────┴─────────┴──────────────────┴──────────────┴─────────────────┴─────────────────────────┘
3 rows in set. Elapsed: 0.002 sec.
使用如下命令连接至 clickhouse 的主节点,并运行一个分布式建表语句检查 Clickhouse 的分布式搭建是否成功。详见 Creating Replicated Tables
建表语句如下:
CREATE TABLE IF NOT EXISTS table_test01
ON CLUSTER example_cluster
(
UserID UInt32,
ver UInt16
)
ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/01/table_test01', 'p0-tklcdh-clickhouse01-01', ver)
PRIMARY KEY UserID;
执行结果如下:
[root@p0-tklcdh-clickhouse01 clickhouse-client-22.3.2.2]# clickhouse-client --password
ClickHouse client version 22.3.2.1.
Password for user (default):
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 22.3.2 revision 54455.
p0-tklcdh-clickhouse01 :) CREATE TABLE IF NOT EXISTS table_test01
ON CLUSTER example_cluster
(
UserID UInt32,
^Iver UInt16
)
ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/01/table_test01', 'p0-tklcdh-clickhouse01-01', ver)
PRIMARY KEY UserID;
CREATE TABLE IF NOT EXISTS table_test01 ON CLUSTER example_cluster
(
`UserID` UInt32,
`ver` UInt16
)
ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/01/table_test01', 'p0-tklcdh-clickhouse01-01', ver)
PRIMARY KEY UserID
Query id: 90055fe0-7443-4c65-8aa1-4cb09becc393
┌─host───────────────────┬─port─┬─status─┬─error─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┬─num_hosts_remaining─┬─num_hosts_active─┐
│ p0-tklcdh-clickhouse02 │ 9000 │ 253 │ Code: 253. DB::Exception: Replica /clickhouse/tables/01/table_test01/replicas/p0-tklcdh-clickhouse01-01 already exists. (REPLICA_IS_ALREADY_EXIST) (version 22.3.2.1) │ 2 │ 0 │
│ p0-tklcdh-clickhouse03 │ 9000 │ 0 │ │ 1 │ 0 │
│ p0-tklcdh-clickhouse01 │ 9000 │ 253 │ Code: 253. DB::Exception: Replica /clickhouse/tables/01/table_test01/replicas/p0-tklcdh-clickhouse01-01 already exists. (REPLICA_IS_ALREADY_EXIST) (version 22.3.2.1) │ 0 │ 0 │
└────────────────────────┴──────┴────────┴───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┴─────────────────────┴──────────────────┘
→ Progress: 0.00 rows, 0.00 B (0.00 rows/s., 0.00 B/s.) 0%
3 rows in set. Elapsed: 0.120 sec.
Received exception from server (version 22.3.2):
Code: 253. DB::Exception: Received from localhost:9000. DB::Exception: There was an error on [p0-tklcdh-clickhouse02:9000]: Code: 253. DB::Exception: Replica /clickhouse/tables/01/table_test01/replicas/p0-tklcdh-clickhouse01-01 already exists. (REPLICA_IS_ALREADY_EXIST) (version 22.3.2.1). (REPLICA_IS_ALREADY_EXIST)
如果显示以上信息则可以初步表明分布式搭建已经成功,所有 Clickhouse 节点组成了一个 Clickhouse 集群并由 Zookeeper 进行任务协同。(由于本例中之前已经做过了一次该操作,故这里有一个 Replica 已经存在的 Exception,可忽略。)
通过官方给定的测试用例,可以系统性的测试 Clickhouse 集群搭建的是否成功,以及查询性能等诸多指标。详见 Clickhouse Testing
[root@p0-tklcdh-clickhouse02 clickhouse-server]# ls /var/log/clickhouse-server/
clickhouse-server.err.log clickhouse-server.log stderr.log stdout.log
(略) 数据合规,信息安全,无法透露请谅解