For full deployment details, see the official documentation.
We first deploy two ClickHouse servers and three Keeper (or ZooKeeper) nodes.
ClickHouse originally relied on ZooKeeper for coordination storage and later switched to its own Keeper; the official docs explain the reasons.
This means there are two ways to set up a cluster: with Keeper as the coordination store, or with ZooKeeper.
For installing ZooKeeper, see my earlier post.
Edit the ClickHouse server configuration file:
<listen_host>0.0.0.0</listen_host>
<path>/var/lib/clickhouse/</path>
<remote_servers>
    <cluster_2S_1R>
        <shard>
            <internal_replication>true</internal_replication>
            <replica>
                <host>chnode1</host>
                <port>9000</port>
            </replica>
        </shard>
        <shard>
            <internal_replication>true</internal_replication>
            <replica>
                <host>chnode2</host>
                <port>9000</port>
            </replica>
        </shard>
    </cluster_2S_1R>
</remote_servers>
<!--
Note: the layout above puts each replica in its own shard. The two replicas can
also go into a single shard; that single-shard, two-replica variant is shown
below. If the replicas sit in different shards, the macros configuration must
differ per node (see the macros sketch further down).

<cluster_2S_1R>
    <shard>
        <internal_replication>true</internal_replication>
        <replica>
            <host>chnode1</host>
            <port>9000</port>
        </replica>
        <replica>
            <host>chnode2</host>
            <port>9000</port>
        </replica>
    </shard>
</cluster_2S_1R>
-->
<zookeeper>
    <node>
        <host>example1</host>
        <port>2181</port>
    </node>
    <node>
        <host>example2</host>
        <port>2181</port>
    </node>
    <node>
        <host>example3</host>
        <port>2181</port>
    </node>
</zookeeper>
<!-- macros on chnode1 -->
<macros>
    <shard>01</shard>
    <replica>chnode1</replica>
</macros>
<!-- macros on chnode2 -->
<macros>
    <shard>01</shard>
    <replica>chnode2</replica>
</macros>
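If you use the two-shard layout from the first remote_servers example instead, each node needs a distinct shard macro. A minimal sketch (the shard value 02 for chnode2 is the assumption here):
<!-- chnode1 -->
<macros>
    <shard>01</shard>
    <replica>chnode1</replica>
</macros>
<!-- chnode2 -->
<macros>
    <shard>02</shard>
    <replica>chnode2</replica>
</macros>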
systemctl start clickhouse-server.service
systemctl enable clickhouse-server.service
# log in
clickhouse-client
# check the cluster information
select * from system.clusters
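A narrower query makes the topology easier to read; these columns all exist in system.clusters:
SELECT cluster, shard_num, replica_num, host_name, port, is_local
FROM system.clusters
WHERE cluster = 'cluster_2S_1R';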
CREATE TABLE t1 ON CLUSTER cluster_2S_1R
(
`ts` DateTime,
`uid` String,
`biz` String
)
ENGINE = ReplicatedMergeTree('/clickhouse/test1/tables/{shard}/t1', '{replica}')
PARTITION BY toYYYYMMDD(ts)
ORDER BY ts
SETTINGS index_granularity = 8192
# the following error appeared
Received exception from server (version 23.6.2):
Code: 159. DB::Exception: Received from localhost:9000. DB::Exception: Watching task /clickhouse/task_queue/ddl/query-0000000004 is executing longer than distributed_ddl_task_timeout (=180) seconds. There are 2 unfinished hosts (0 of them are currently active), they are going to execute the query in background. (TIMEOUT_EXCEEDED)
# This error appears when some of the ClickHouse servers are unreachable. In my case the host IPs under remote_servers in the config file were wrong, so the other hosts could not be found; after fixing them and restarting, the DDL went through.
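When this timeout occurs, you can also see which hosts have not picked up the DDL by looking at the distributed DDL queue; a quick look (system.distributed_ddl_queue exists in recent ClickHouse releases):
SELECT entry, host, status
FROM system.distributed_ddl_queue
WHERE cluster = 'cluster_2S_1R'
ORDER BY entry DESC
LIMIT 10;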
At this point DDL takes effect, but inserted data does not replicate to the other node.
The ClickHouse log on node 2 contains the following error:
2023.08.10 15:49:54.836507 [ 8514 ] {} <Error> test1.t1 (*****-48d4-44ed-9bad-2a03410321a9): auto DB::StorageReplicatedMergeTree::processQueueEntry(ReplicatedMergeTreeQueue::SelectedEntryPtr)::(anonymous class)::operator()(LogEntryPtr &) const: Code: 198. DB::Exception: Not found address of host: bj-ck3. (DNS_ERROR), Stack trace (when copying this message, always include the lines below):
So the hostname cannot be resolved: ZooKeeper stores hostnames rather than IPs, so every node needs the mappings in /etc/hosts:
192.168.1.1 bj-ck1
192.168.1.2 bj-ck2
192.168.1.3 bj-ck3
PS: in /etc/hosts, when several names are listed on one line, the first one is the canonical name and the rest behave like aliases.
For example, if 192.168.1.1 is configured as: 192.168.1.1 bj-1 bj-2
then the host identifies itself as bj-1. If another machine reaches 192.168.1.1 by hostname but only has 192.168.1.1 bj-2 in its own hosts file, it cannot resolve bj-1, so the lookup fails.
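Once /etc/hosts is consistent on every node, replication can be verified end to end; a quick check reusing the t1 table created above:
-- on chnode1
INSERT INTO t1 VALUES (now(), 'u1', 'demo');
-- on chnode2: the row should arrive almost immediately
SELECT * FROM t1;
-- replica health can be inspected on either node
SELECT database, table, replica_name, total_replicas, active_replicas
FROM system.replicas
WHERE table = 't1';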
ClickHouse Keeper provides the coordination system for data replication and distributed DDL query execution, and is compatible with Apache ZooKeeper. The configuration below enables ClickHouse Keeper on port 9181.
Note:
If a Keeper node is ever replaced or rebuilt, do not reuse an existing server_id. For example, if the Keeper node with server_id 2 is rebuilt, give it server_id 4 or higher.
The shard and replica macros reduce the complexity of distributed DDL: the configured values are substituted into your DDL queries automatically, which simplifies the DDL.
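Each node's substitutions can be checked through the system.macros table; for example, on chnode1:
SELECT macro, substitution FROM system.macros;
-- expected here: shard -> 01, replica -> chnode1, so the ReplicatedMergeTree path
-- '/clickhouse/test1/tables/{shard}/t1' expands to '/clickhouse/test1/tables/01/t1'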
# install clickhouse-keeper
sudo apt-get install -y clickhouse-keeper
# enable and start clickhouse-keeper
sudo systemctl enable clickhouse-keeper
sudo systemctl start clickhouse-keeper
sudo systemctl status clickhouse-keeper
<keeper_server>
    <tcp_port>9181</tcp_port>
    <server_id>1</server_id>
    <log_storage_path>/var/lib/clickhouse/coordination/logs</log_storage_path>
    <snapshot_storage_path>/var/lib/clickhouse/coordination/snapshots</snapshot_storage_path>
    <coordination_settings>
        <operation_timeout_ms>10000</operation_timeout_ms>
        <min_session_timeout_ms>10000</min_session_timeout_ms>
        <session_timeout_ms>100000</session_timeout_ms>
        <raft_logs_level>information</raft_logs_level>
    </coordination_settings>
    <hostname_checks_enabled>true</hostname_checks_enabled>
    <raft_configuration>
        <server>
            <id>1</id>
            <hostname>192.168.1.1</hostname>
            <port>9234</port>
        </server>
        <server>
            <id>2</id>
            <hostname>192.168.1.2</hostname>
            <port>9234</port>
        </server>
        <server>
            <id>3</id>
            <hostname>192.168.1.3</hostname>
            <port>9234</port>
        </server>
    </raft_configuration>
</keeper_server>
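Once all three Keeper nodes are running, each can be probed with the ZooKeeper-style four-letter commands (ruok and mntr are in Keeper's default four_letter_word_white_list); for example:
echo ruok | nc 192.168.1.1 9181   # expect: imok
echo mntr | nc 192.168.1.1 9181   # prints the node's role (leader/follower) and stats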
<clickhouse>
    <zookeeper>
        <node index="1">
            <host>chnode1</host>
            <port>9181</port>
        </node>
        <node index="2">
            <host>chnode2</host>
            <port>9181</port>
        </node>
        <node index="3">
            <host>chnode3</host>
            <port>9181</port>
        </node>
    </zookeeper>
</clickhouse>
The ClickHouse server configuration is almost identical to the ZooKeeper-based one: comment out the old ZooKeeper entries and point the zookeeper section at the Keeper nodes on port 9181, as shown above.
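Whether the server can actually reach Keeper is easy to confirm from clickhouse-client, since the system.zookeeper table reads through the coordination service (it errors out if coordination is down):
SELECT name FROM system.zookeeper WHERE path = '/';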
PS: there was one more hiccup here. While running on Keeper, DML data once again stopped replicating; clickhouse-server.err.log showed the following error:
2023.08.16 11:19:00.782071 [ 8566 ] {} <Error> ConfigReloader: Error updating configuration from '/etc/clickhouse-server/config.xml' config.: Code: 999. Coordination::Exception: Connection loss, path: All connection tries failed while connecting to ZooKeeper
telnet to the Keeper port indeed failed, so I edited the Keeper config file keeper_config.xml and added:
<listen_host>0.0.0.0</listen_host>
Restart Keeper:
systemctl restart clickhouse-keeper
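After the restart, the Keeper port should be reachable from every ClickHouse node (this is exactly what failed before); substitute each Keeper's IP:
echo ruok | nc 192.168.1.1 9181   # expect: imok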