Building a ClickHouse + chproxy Cluster

Cluster Planning

Cluster architecture

Here, "Distribute" refers to the machines on which the distributed tables are created. In this article the Distribute role is deployed on dedicated machines, but it can also be assigned to every replica machine, i.e. the same distributed table is created on all replica machines, for example with CREATE TABLE tbl ON CLUSTER 'cluster_name'.

Role distribution

This installation uses a 5-node ZooKeeper ensemble; that size is not required for ClickHouse (though ZooKeeper itself is needed for ReplicatedMergeTree replication).


Installation Steps

Prepare the base environment

1. Install clustershell

$ yum install -y clustershell 
$ vi /etc/clustershell/groups

all: clickhouse-node-[01-14]
replica1: clickhouse-node-[07,10,13]
replica2: clickhouse-node-[08,11,14]
distributed: clickhouse-node-[06,09,12]
chproxy: clickhouse-node-[06,09,12]

$ clush -a 'uptime'

2. Passwordless SSH login

$ chmod 755 ~/.ssh 
$ ssh-keygen -t rsa 
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys 
$ chmod 600 ~/.ssh/authorized_keys 
$ vi /etc/ssh/sshd_config
PubkeyAuthentication yes  # enable public-key authentication; ssh-copy-id requires it
$ service sshd restart
$ ssh-copy-id -i ~/.ssh/id_rsa.pub root@xxxx
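The key has to reach every node. A minimal loop over the whole cluster (a sketch; it assumes the clickhouse-node-NN naming used here and that root SSH is permitted) could look like:

```shell
# Generate the node names clickhouse-node-01 .. clickhouse-node-14 and
# push the local public key to each one. The ssh-copy-id line is commented
# out so the loop can be dry-run safely first.
for i in $(seq -w 1 14); do
    node="clickhouse-node-$i"
    echo "would copy key to root@$node"
    # ssh-copy-id -i ~/.ssh/id_rsa.pub "root@$node"
done
```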

Install ClickHouse

1. Install the RPM packages

Install curl on all machines:

$ clush -g all -b 'yum install -y curl'
On the replica, distributed, and chproxy machines, download and run the ClickHouse repository setup script provided by packagecloud.io:

$ clush -g replica1,replica2,distributed,chproxy -b 'curl -s https://packagecloud.io/install/repositories/altinity/clickhouse/script.rpm.sh | sudo bash'

Install clickhouse-server and clickhouse-client on the replica and distributed machines:

# check available packages
$ clush -g replica1,replica2,distributed -b 'sudo yum list "clickhouse*"'

# install
$ clush -g replica1,replica2,distributed -b 'sudo yum install -y clickhouse-server clickhouse-client clickhouse-compressor'

2. Adjust the ulimit configuration

$ vi /etc/security/limits.d/clickhouse.conf

# core file size: the maximum dump file the process may write when it crashes
clickhouse       soft    core    1073741824
clickhouse       hard    core    1073741824
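The limit value is 1 GiB expressed in bytes. A quick sanity check of the arithmetic, plus how to verify the effective limit once clickhouse has a login shell (sketch):

```shell
# 1 GiB in bytes, matching the value in limits.d/clickhouse.conf
echo $((1024 * 1024 * 1024))    # prints 1073741824

# After re-login, verify the limit as the clickhouse user; note that
# bash's `ulimit -c` reports 512-byte blocks, so 1 GiB shows as 2097152.
# sudo -u clickhouse bash -lc 'ulimit -c'
```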

3. Modify the init script

$ vi /etc/init.d/clickhouse-server

CLICKHOUSE_LOGDIR=/data/clickhouse/log

4. Update the cluster configuration

Update /etc/clickhouse-server/config.xml according to the following; only the settings that need to be changed or added are shown (the incl names correspond to the sections resolved from metrika.xml):

<yandex>
    <logger>
        <level>trace</level>
        <log>/data/clickhouse/logs/server.log</log>
        <errorlog>/data/clickhouse/logs/error.log</errorlog>
        <size>1000M</size>
        <count>10</count>
    </logger>

    <http_port>8123</http_port>
    <tcp_port>9000</tcp_port>
    <interserver_http_port>9009</interserver_http_port>
    <listen_host>0.0.0.0</listen_host>

    <path>/data/clickhouse/</path>
    <tmp_path>/data/clickhouse/tmp/</tmp_path>
    <users_config>users.xml</users_config>

    <default_profile>default</default_profile>
    <default_database>default</default_database>

    <remote_servers incl="clickhouse_remote_servers" />
    <zookeeper incl="zookeeper-servers" optional="true" />
    <macros incl="macros" optional="true" />
    <compression incl="clickhouse_compression" optional="true" />

    <include_from>/etc/clickhouse-server/metrika.xml</include_from>
</yandex>

Create /etc/clickhouse-server/metrika.xml (host_name and shard_number are placeholders that are substituted per machine in a later step):

<yandex>
<clickhouse_remote_servers>
    <cluster-1>
        <shard>
            <internal_replication>true</internal_replication>
            <replica>
                <host>clickhouse-node-07</host>
                <port>9000</port>
                <user>default</user>
                <password>6lYaUiFi</password>
            </replica>
            <replica>
                <host>clickhouse-node-08</host>
                <port>9000</port>
                <user>default</user>
                <password>6lYaUiFi</password>
            </replica>
        </shard>
        <shard>
            <internal_replication>true</internal_replication>
            <replica>
                <host>clickhouse-node-10</host>
                <port>9000</port>
                <user>default</user>
                <password>6lYaUiFi</password>
            </replica>
            <replica>
                <host>clickhouse-node-11</host>
                <port>9000</port>
                <user>default</user>
                <password>6lYaUiFi</password>
            </replica>
        </shard>
        <shard>
            <internal_replication>true</internal_replication>
            <replica>
                <host>clickhouse-node-13</host>
                <port>9000</port>
                <user>default</user>
                <password>6lYaUiFi</password>
            </replica>
            <replica>
                <host>clickhouse-node-14</host>
                <port>9000</port>
                <user>default</user>
                <password>6lYaUiFi</password>
            </replica>
        </shard>
    </cluster-1>
</clickhouse_remote_servers>

<zookeeper-servers>
    <node index="1">
        <host>clickhouse-node-01</host>
        <port>2181</port>
    </node>
    <node index="2">
        <host>clickhouse-node-02</host>
        <port>2181</port>
    </node>
    <node index="3">
        <host>clickhouse-node-03</host>
        <port>2181</port>
    </node>
    <node index="4">
        <host>clickhouse-node-04</host>
        <port>2181</port>
    </node>
    <node index="5">
        <host>clickhouse-node-05</host>
        <port>2181</port>
    </node>
</zookeeper-servers>

<macros>
    <cluster>cluster-1</cluster>
    <replica>host_name</replica>
    <shard>shard_number</shard>
</macros>

<clickhouse_compression>
    <case>
        <min_part_size>10000000000</min_part_size>
        <min_part_size_ratio>0.01</min_part_size_ratio>
        <method>lz4</method>
    </case>
</clickhouse_compression>
</yandex>

Replace /etc/clickhouse-server/users.xml with the following content:

<yandex>
    <profiles>
        <!-- default profile, used for writes -->
        <default>
            <max_memory_usage>10000000000</max_memory_usage>
            <use_uncompressed_cache>0</use_uncompressed_cache>
            <load_balancing>random</load_balancing>
        </default>
        <!-- read-only profile -->
        <readonly>
            <max_memory_usage>10000000000</max_memory_usage>
            <use_uncompressed_cache>0</use_uncompressed_cache>
            <load_balancing>random</load_balancing>
            <readonly>1</readonly>
        </readonly>
    </profiles>

    <quotas>
        <default>
            <interval>
                <duration>3600</duration>
                <queries>0</queries>
                <errors>0</errors>
                <result_rows>0</result_rows>
                <read_rows>0</read_rows>
                <execution_time>0</execution_time>
            </interval>
        </default>
    </quotas>

    <users>
        <!-- default user, full access -->
        <default>
            <password_sha256_hex>967f3bf355dddfabfca1c9f5cab39352b2ec1cd0b05f9e1e6b8f629705fe7d6e</password_sha256_hex>
            <networks>
                <ip>::/0</ip>
            </networks>
            <profile>default</profile>
            <quota>default</quota>
        </default>
        <!-- read-only user -->
        <readonly>
            <password_sha256_hex>967f3bf355dddfabfca1c9f5cab39352b2ec1cd0b05f9e1e6b8f629705fe7d6e</password_sha256_hex>
            <networks>
                <ip>::/0</ip>
            </networks>
            <profile>readonly</profile>
            <quota>default</quota>
        </readonly>
    </users>
</yandex>

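The password_sha256_hex values in users.xml are simply the hex-encoded SHA-256 digest of the plain-text password. A new hash can be produced like this (sketch; shown with a placeholder password):

```shell
# Hex-encoded SHA-256 of the password; paste the result into the
# password_sha256_hex field in users.xml.
echo -n 'my-secret-password' | sha256sum | awk '{print $1}'
```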
5. Sync the configuration

Give the clickhouse user a login shell:

$ clush -g replica1,replica2,distributed -b 'usermod -s /bin/bash clickhouse'

Set up /data/clickhouse/ as the data directory:

$ clush -g replica1,replica2,distributed -b 'mkdir /data/clickhouse/logs -p'
$ clush -g replica1,replica2,distributed -b 'chown clickhouse.clickhouse /data/clickhouse/ -R'

Copy the configuration files to all ClickHouse machines:

$ clush -g replica1,replica2,distributed -b --copy /etc/security/limits.d/clickhouse.conf --dest /etc/security/limits.d/
$ clush -g replica1,replica2,distributed -b --copy /etc/init.d/clickhouse-server --dest /etc/init.d
$ clush -g replica1,replica2,distributed -b --copy /etc/clickhouse-server/config.xml --dest /etc/clickhouse-server/
$ clush -g replica1,replica2,distributed -b --copy /etc/clickhouse-server/users.xml --dest /etc/clickhouse-server/
$ clush -g replica1,replica2,distributed -b --copy /etc/clickhouse-server/metrika.xml --dest /etc/clickhouse-server/

Substitute the per-machine placeholders:

# replace hostname (metrika.xml lives in /etc/clickhouse-server/, where it was copied above)
$ clush -g replica1,replica2,distributed -b 'sed -i "s/host_name/"$HOSTNAME"/" /etc/clickhouse-server/metrika.xml'

# replace shard_number
$ clush -w clickhouse-node-[06-08] -b 'sed -i "s/shard_number/1/" /etc/clickhouse-server/metrika.xml'
$ clush -w clickhouse-node-[09-11] -b 'sed -i "s/shard_number/2/" /etc/clickhouse-server/metrika.xml'
$ clush -w clickhouse-node-[12-14] -b 'sed -i "s/shard_number/3/" /etc/clickhouse-server/metrika.xml'
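The three sed invocations above encode a fixed hostname-to-shard mapping (nodes 06-08 → 1, 09-11 → 2, 12-14 → 3). The same mapping can also be computed, which avoids maintaining three node ranges by hand; this is a sketch assuming the clickhouse-node-NN naming:

```shell
# Map a host name to its shard number: nodes 06-08 are shard 1,
# 09-11 shard 2, 12-14 shard 3.
shard_for() {
    local n=$((10#${1#clickhouse-node-}))   # strip prefix, force base 10
    echo $(( (n - 6) / 3 + 1 ))
}

shard_for clickhouse-node-07   # -> 1
shard_for clickhouse-node-11   # -> 2
shard_for clickhouse-node-12   # -> 3
```

With such a helper, the substitution could run in a single pass on each node, e.g. sed -i "s/shard_number/$(shard_for "$HOSTNAME")/" over the copied metrika.xml.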

6. Restart the service

# restart server
$ clush -g replica1,replica2,distributed -b 'service clickhouse-server restart'

# login with password
$ clickhouse-client -h 127.0.0.1 -d default -m -u default --password 6lYaUiFi 

Log in and create the local table on each replica. The {shard} and {replica} values are filled in from the macros section of metrika.xml, so the same statement works on every node:

$ clickhouse-client

CREATE TABLE monchickey.image_label (
    label_id UInt32,
    label_name String,
    insert_time Date
) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/image_label', '{replica}', insert_time, (label_id, insert_time), 8192)
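The local table is only reachable per shard; on the Distribute nodes, a distributed table routes queries across all shards. A sketch of such a table (it assumes the remote-servers cluster in metrika.xml is named cluster-1, and uses rand() as the sharding key; both names are assumptions here):

```sql
CREATE TABLE monchickey.image_label_all (
    label_id UInt32,
    label_name String,
    insert_time Date
) ENGINE = Distributed('cluster-1', 'monchickey', 'image_label', rand())
```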

Install chproxy

1. Download chproxy

https://github.com/Vertamedia/chproxy/releases

$ mkdir -p /data/chproxy
$ cd /data/chproxy
$ wget https://github.com/Vertamedia/chproxy/releases/download/1.13.0/chproxy-linux-amd64-1.13.0.tar.gz
$ tar -xzvf chproxy-*.gz

The archive extracts to a single binary, chproxy-linux-amd64.

2. Configuration file

Create /data/chproxy/config.yml. Note that setting hack_me_please: true would disable chproxy's safety checks; it is best left unset.

server:
  http:
      listen_addr: ":9090"
      allowed_networks: ["172.0.0.0/8"]

users:
  - name: "distributed-write"
    to_cluster: "distributed-write"
    to_user: "default"

  - name: "replica-write"
    to_cluster: "replica-write"
    to_user: "default"

  - name: "distributed-read"
    to_cluster: "distributed-read"
    to_user: "readonly"
    max_concurrent_queries: 6
    max_execution_time: 1m

clusters:
  - name: "replica-write"
    replicas:
      - name: "replica1"
        nodes: ["clickhouse-node-07:8123", "clickhouse-node-10:8123", "clickhouse-node-13:8123"]
      - name: "replica2"
        nodes: ["clickhouse-node-08:8123", "clickhouse-node-11:8123", "clickhouse-node-14:8123"]
    users:
      - name: "default"
        password: "6lYaUiFi"

  - name: "distributed-write"
    nodes: [
      "clickhouse-node-06:8123",
      "clickhouse-node-09:8123",
      "clickhouse-node-12:8123"
    ]
    users:
      - name: "default"
        password: "6lYaUiFi"

  - name: "distributed-read"
    nodes: [
      "clickhouse-node-06:8123",
      "clickhouse-node-09:8123",
      "clickhouse-node-12:8123"
    ]
    users:
    - name: "readonly"
      password: "6lYaUiFi"

caches:
  - name: "shortterm"
    dir: "/data/chproxy/cache/shortterm"
    max_size: 150Mb
    expire: 130s
Create /data/chproxy/restart.sh:

$ vim /data/chproxy/restart.sh

#!/bin/bash
cd "$(dirname "$0")"
# kill any running chproxy instance, then start a fresh one as the chproxy user
pkill -9 -f chproxy-linux-amd64
sudo -u chproxy nohup ./chproxy-linux-amd64 -config=./config/config.yml >> ./logs/chproxy.out 2>&1 &

3. Deploy to the chproxy machines

Create the user:

$ clush -g distributed -b 'useradd chproxy'

Create the directories:

$ clush -g distributed -b 'mkdir -p /data/chproxy/logs'
$ clush -g distributed -b 'mkdir -p /data/chproxy/config'
$ clush -g distributed -b 'mkdir -p /data/chproxy/cache/shortterm'
$ clush -g distributed -b 'mkdir -p /data/chproxy/cache/longterm'

Distribute the files:

$ clush -g distributed -b --copy /data/chproxy/chproxy-linux-amd64 --dest /data/chproxy/
$ clush -g distributed -b --copy /data/chproxy/config.yml --dest /data/chproxy/config/
$ clush -g distributed -b --copy /data/chproxy/restart.sh --dest /data/chproxy/

Fix the ownership:

$ clush -g distributed -b 'chown -R chproxy.chproxy /data/chproxy'

4. Start chproxy

$ clush -g distributed -b 'bash /data/chproxy/restart.sh'
$ clush -g distributed -b 'ps -ef | grep chproxy' # check

Check the HTTP interfaces:

# clickhouse (direct)
$ echo 'SELECT 1' | curl 'http://localhost:8123/?user=default&password=6lYaUiFi' --data-binary @-
1

$ echo 'SELECT 1' | curl 'http://default:6lYaUiFi@localhost:8123/' --data-binary @-
1

$ echo 'SELECT 1' | curl 'http://readonly:6lYaUiFi@localhost:8123/' --data-binary @-
1

# chproxy
$ echo 'SELECT 1' | curl 'http://clickhouse-node-06:9090/?user=distributed-read&password=' --data-binary @-
1

$ echo 'SELECT 1' | curl 'http://clickhouse-node-06:9090/?user=distributed-write&password=' --data-binary @-
1
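The same check can be looped over every chproxy instance (a sketch; it assumes the three chproxy hosts from the role layout above):

```shell
# Run SELECT 1 through each chproxy and report per-node status.
for node in clickhouse-node-06 clickhouse-node-09 clickhouse-node-12; do
    got=$(echo 'SELECT 1' | curl -s "http://$node:9090/?user=distributed-read&password=" --data-binary @-)
    if [ "$got" = "1" ]; then
        echo "$node OK"
    else
        echo "$node FAILED (got: $got)"
    fi
done
```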

Monitoring

ClickHouse Grafana dashboard template:

https://github.com/Vertamedia/clickhouse-grafana

Storage used by each database and table:

SELECT
    database,
    table,
    formatReadableSize(sum(bytes)) AS size
FROM system.parts
WHERE active
GROUP BY database, table
ORDER BY sum(bytes) DESC
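Beyond disk usage, replication health is worth watching on this cluster. This query (a sketch against the standard system.replicas table) surfaces replicas that are read-only or have a replication backlog:

```sql
SELECT
    database,
    table,
    is_readonly,
    absolute_delay,
    queue_size
FROM system.replicas
ORDER BY absolute_delay DESC
```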

Problems Encountered

Problem #1

Operation:

Starting clickhouse-server directly fails with a ulimit-related error:

$ clickhouse-server --config-file=/etc/clickhouse-server/config.xml

Error:

Include not found: clickhouse_remote_servers
Include not found: clickhouse_compression
Poco::Exception. Code: 1000, e.code() = 0, e.displayText() = Exception: Cannot set max size of core file to 1073741824, e.what() = Exception

Solution:

$ vi /etc/security/limits.d/clickhouse.conf

# core file size: the maximum dump file the process may write when it crashes
clickhouse       soft    core    1073741824
clickhouse       hard    core    1073741824

Problem #2

Operation:

$ clickhouse-server --config-file=/etc/clickhouse-server/config.xml

Error:
Include not found: clickhouse_remote_servers
Include not found: clickhouse_compression
Logging trace to /var/log/clickhouse-server/clickhouse-server.log
Logging errors to /var/log/clickhouse-server/clickhouse-server.err.log
Logging trace to console
2019.02.13 15:15:36.539294 [ 1 ] {}  : Starting ClickHouse 19.1.6 with revision 54413
2019.02.13 15:15:36.543324 [ 1 ] {}  Application: starting up
2019.02.13 15:15:36.547676 [ 1 ] {}  Application: DB::Exception: Effective user of the process (root) does not match the owner of the data (clickhouse). Run under 'sudo -u clickhouse'.
2019.02.13 15:15:36.547714 [ 1 ] {}  Application: shutting down
2019.02.13 15:15:36.547729 [ 1 ] {}  Application: Uninitializing subsystem: Logging Subsystem
2019.02.13 15:15:36.547809 [ 2 ] {}  BaseDaemon: Stop SignalListener thread

Solution:

$ sudo -u clickhouse clickhouse-server --config-file=/etc/clickhouse-server/config.xml

Problem #3

Operation:

$ sudo -u clickhouse clickhouse-server --config-file=/etc/clickhouse-server/config.xml

Error:
Include not found: clickhouse_remote_servers
Include not found: clickhouse_compression
Couldn't save preprocessed config to /var/lib/clickhouse//preprocessed_configs/config.xml: Access to file denied: /var/lib/clickhouse//preprocessed_configs/config.xml
Logging trace to /var/log/clickhouse-server/clickhouse-server.log
Poco::Exception. Code: 1000, e.code() = 13, e.displayText() = Access to file denied: /var/log/clickhouse-server/clickhouse-server.log, e.what() = Access to file denied

Solution:

$ chown -R clickhouse /var/log/clickhouse-server/

Problem #4

Operation:

$ sudo -u clickhouse clickhouse-server --config-file=/etc/clickhouse-server/config.xml

Error:
Logging trace to /var/log/clickhouse-server/clickhouse-server.log
Logging errors to /var/log/clickhouse-server/clickhouse-server.err.log
Logging trace to console
2019.02.13 16:39:48.812708 [ 1 ] {}  : Starting ClickHouse 19.1.6 with revision 54413
2019.02.13 16:39:48.815644 [ 1 ] {}  Application: starting up
2019.02.13 16:39:48.819798 [ 1 ] {}  Application: rlimit on number of file descriptors is 262144
2019.02.13 16:39:48.819827 [ 1 ] {}  Application: Initializing DateLUT.
2019.02.13 16:39:48.819850 [ 1 ] {}  Application: Initialized DateLUT with time zone `Asia/Shanghai'.
2019.02.13 16:39:48.820256 [ 1 ] {}  Application: Configuration parameter 'interserver_http_host' doesn't exist or exists and empty. Will use 'clickhouse-node-13' as replica host.
2019.02.13 16:39:48.822770 [ 1 ] {}  ConfigReloader: Loading config `/data/clickhouse/users.xml'
Include not found: networks
Include not found: networks
2019.02.13 16:39:48.823764 [ 1 ] {}  Application: Loading metadata.
2019.02.13 16:39:48.829479 [ 1 ] {}  Application: Loaded metadata.
2019.02.13 16:39:48.829592 [ 1 ] {}  BackgroundProcessingPool: Create BackgroundProcessingPool with 16 threads
2019.02.13 16:39:48.829762 [ 3 ] {}  DDLWorker: Started DDLWorker thread
2019.02.13 16:39:48.834746 [ 3 ] {}  ZooKeeper: initialized, hosts: clickhouse-node-03:2181,clickhouse-node-02:2181,clickhouse-node-05:2181,clickhouse-node-01:2181,clickhouse-node-04:2181
2019.02.13 16:39:48.834875 [ 1 ] {}  Application: It looks like the process has no CAP_NET_ADMIN capability, 'taskstats' performance statistics will be disabled. It could happen due to incorrect ClickHouse package installation. You could resolve the problem manually with 'sudo setcap cap_net_admin=+ep /usr/bin/clickhouse'. Note that it will not work on 'nosuid' mounted filesystems. It also doesn't work if you run clickhouse-server inside network namespace as it happens in some containers.
2019.02.13 16:39:48.835531 [ 1 ] {}  Application: Listen [::1]: 99: Net Exception: Cannot assign requested address: [::1]:8123. If it is an IPv6 or IPv4 address and your host has disabled IPv6 or IPv4, then consider to specify not disabled IPv4 or IPv6 address to listen in <listen_host> element of configuration file. Example for disabled IPv6: <listen_host>0.0.0.0</listen_host>. Example for disabled IPv4: <listen_host>::</listen_host>
2019.02.13 16:39:48.835636 [ 1 ] {}  Application: Listening http://127.0.0.1:8123
2019.02.13 16:39:48.835684 [ 1 ] {}  Application: Listening tcp: 127.0.0.1:9000
2019.02.13 16:39:48.835734 [ 1 ] {}  Application: Listening interserver http: 127.0.0.1:9009
2019.02.13 16:39:48.836105 [ 1 ] {}  Application: Available RAM = 31.26 GiB; physical cores = 8; threads = 16.
2019.02.13 16:39:48.836120 [ 1 ] {}  Application: Ready for connections.
2019.02.13 16:39:48.838717 [ 3 ] {}  DDLWorker: Processing tasks
2019.02.13 16:39:48.838977 [ 3 ] {}  DDLWorker: Waiting a watch
2019.02.13 16:39:50.838820 [ 23 ] {}  ConfigReloader: Loading config `/data/clickhouse/config.xml'
Solution:

$ vim /etc/clickhouse-server/config.xml

<listen_host>0.0.0.0</listen_host>

Problem #5

Operation:

$ clush -g replica1,replica2 -b 'service clickhouse-server stop'

Problem:

The process listening on port 8123 cannot be stopped.

Solution:

$ lsof -i :8123 | grep clickhouse | awk '{print $2}' | xargs kill -9
# or
$ service clickhouse-server forcestop

References

  • ClickHouse集群搭建从0到1
  • https://zhuanlan.zhihu.com/p/34669883
  • https://clustershell.readthedocs.io/en/latest/index.html
  • https://github.com/Altinity/clickhouse-rpm-install
  • https://github.com/Vertamedia/chproxy
  • https://www.altinity.com/blog/2017/6/5/clickhouse-data-distribution
  • https://www.altinity.com/blog/2018/5/10/circular-replication-cluster-topology-in-clickhouse
  • http://jackpgao.github.io/
  • https://hzkeung.com/2018/06/21/clickhouse-cluster-install
