Quantum Insights 的部署将基于一个高可用的分布式 ClickHouse 集群,以实现对大规模数据的高效处理和查询。
在每台服务器上执行以下步骤进行 ClickHouse 的安装。
> yum install -y yum-utils
> yum-config-manager --add-repo https://packages.clickhouse.com/rpm/clickhouse.repo
> yum install -y clickhouse-server clickhouse-client
在每台服务器的配置文件 /etc/clickhouse-server/config.d/metrika.xml
中添加集群配置。假设有三台服务器,IP 分别为 192.168.1.1
, 192.168.1.2
, 192.168.1.3
。
<yandex>
<remote_servers>
<ycloud_clickhouse_cluster>
<shard>
<internal_replication>trueinternal_replication>
<replica>
<host>192.168.1.1host>
<port>9920port>
<user>defaultuser>
<password>ycloud_123password>
replica>
shard>
<shard>
<internal_replication>trueinternal_replication>
<replica>
<host>192.168.1.2host>
<port>9920port>
<user>defaultuser>
<password>ycloud_123password>
replica>
shard>
<shard>
<internal_replication>trueinternal_replication>
<replica>
<host>192.168.1.3host>
<port>9920port>
<user>defaultuser>
<password>ycloud_123password>
replica>
shard>
ycloud_clickhouse_cluster>
remote_servers>
<zookeeper>
<node index="1">
<host>192.168.1.10host>
<port>2284port>
node>
<node index="2">
<host>192.168.1.11host>
<port>2284port>
node>
<node index="3">
<host>192.168.1.12host>
<port>2284port>
node>
zookeeper>
<macros>
<shard>01shard> ### 根据节点填写
<replica>192.168.1.1replica> ### 根据节点填写
macros>
<networks>
<ip>::/0ip>
networks>
<clickhouse_compression>
<case>
<min_part_size>10000000000min_part_size>
<max_partitions_per_insert_block>0max_partitions_per_insert_block>
<min_part_size_ratio>0.01min_part_size_ratio>
<method>lz4method>
case>
clickhouse_compression>
yandex>
⚡️: zk 自行创建 ,macros根据节点填写,port部分我是调整过的根据实际情况填写
重启 ClickHouse 服务器以应用新的配置。
> systemctl restart clickhouse-server
在每台服务器上创建本地表 gatewaySvrAls
。
CREATE TABLE default.gatewaySvrAls ON CLUSTER ycloud_clickhouse_cluster
(
logtime DateTime,
method String,
orderId String,
transactionid String,
serialnumber String,
code Int32,
reserveNo String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/default/gatewaySvrAls', '{replica}')
PARTITION BY toYYYYMMDD(logtime)
PRIMARY KEY (logtime,
orderId,
serialnumber)
ORDER BY (logtime,
orderId,
serialnumber)
SETTINGS index_granularity = 8192;
在每台服务器上创建分布式表 gatewaySvrAls_all
,以对本地表进行分布式查询。
CREATE TABLE default.gatewaySvrAls_all ON CLUSTER ycloud_clickhouse_cluster AS default.gatewaySvrAls
ENGINE = Distributed(ycloud_clickhouse_cluster, default, gatewaySvrAls, rand())
在集群任意节点上执行
DROP TABLE default.gatewaySvrAls ON CLUSTER ycloud_clickhouse_cluster;
DROP TABLE default.gatewaySvrAls_all ON CLUSTER ycloud_clickhouse_cluster;
INSERT INTO gatewaySvrAls VALUES ('2024-05-15 14:19:24', 'ycloud_1', '123456', 'YCLOUD123456', '', '100', '');
INSERT INTO gatewaySvrAls VALUES ('2024-05-15 14:19:25', 'ycloud_2', '123456', 'YCLOUD123456', '', '100', '');
在分布式表上执行查询,以验证集群配置和分布式表的工作情况。
SELECT * FROM default.gatewaySvrAls;
为了确保 Quantum Insights 的高可用性和稳定性,可以配置监控和报警系统,如 Prometheus 和 Grafana,并定期备份数据。
定期备份 ClickHouse 数据以防数据丢失。
# 在每台服务器上定期执行备份
clickhouse-backup create backup_name
clickhouse-backup upload backup_name
数据迁移需要确保数据从旧系统到新系统的完整性和一致性,本次使用的官方客户端进行数据迁移
clickhouse-client -h 192.168.1.1 --port 9920 -u default --password ycloud_123 --query="SELECT * FROM default.gatewaySvrAls" --format=CSV > output.csv
clickhouse-client --host 192.168.1.1--port 9920 --user default --password ycloud_123 --query="INSERT INTO default.gatewaySvrAls FORMAT CSV" < output.csv