Ceph Cluster Deployment Manual

Building a Ceph Cluster

I. Environment preparation (all three servers are configured identically)

Operating system: CentOS 7.3

1. Disable firewalld and SELinux
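
As a rough sketch of this step, the shell functions below stop firewalld and switch SELinux off. The function names are made up for illustration and are not part of the original guide, and the sketch assumes the stock CentOS 7 layout (the firewalld systemd unit and /etc/selinux/config). Run disable_firewalld_and_selinux as root on each of the three servers.

```shell
# Rewrite the SELINUX= line of an SELinux config file so that SELinux stays
# disabled after the next reboot. The path is a parameter so the function
# can be tried on a copy first.
set_selinux_disabled() {
    config_file="$1"
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$config_file"
}

# Stop firewalld now, keep it off after reboot, and put SELinux into
# permissive mode for the running system.
disable_firewalld_and_selinux() {
    systemctl stop firewalld
    systemctl disable firewalld
    # setenforce returns an error if SELinux is already disabled; ignore it.
    setenforce 0 || true
    set_selinux_disabled /etc/selinux/config
}
```

The change to the config file takes full effect only after a reboot; setenforce 0 covers the running system until then.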

2. Add three 100 GB disks to each server

3. Configure the IP addresses

centos-01    192.168.0.118
centos-02    192.168.0.119
centos-03    192.168.0.120

4. Switch the yum repositories. The official yum mirrors can be very slow, so the Aliyun mirrors can be used instead.

[root@localhost ~ ]# yum clean all 
[root@localhost ~ ]# curl http://mirrors.aliyun.com/repo/Centos-7.repo >/etc/yum.repos.d/CentOS-Base.repo 
[root@localhost ~ ]# curl http://mirrors.aliyun.com/repo/epel-7.repo >/etc/yum.repos.d/epel.repo 
[root@localhost ~ ]# sed -i '/aliyuncs/d' /etc/yum.repos.d/CentOS-Base.repo 
[root@localhost ~ ]# sed -i '/aliyuncs/d' /etc/yum.repos.d/epel.repo 

[root@localhost ~ ]# sed -i 's/$releasever/7/g' /etc/yum.repos.d/CentOS-Base.repo

[root@localhost ~ ]# yum makecache

5. Edit the static name resolution file /etc/hosts
[root@localhost ~ ]# vim /etc/hosts 
127.0.0.1  localhost localhost.localdomain localhost4 localhost4.localdomain4 
::1        localhost localhost.localdomain localhost6 localhost6.localdomain6 
192.168.0.118  centos-01 
192.168.0.119  centos-02 
192.168.0.120  centos-03 
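
Editing the file by hand on three machines is error-prone, so the entries could also be appended with a small helper like the one below. This is only a sketch: add_host_entry is a hypothetical name, and the file path is passed in as a parameter so the function can be tried on a copy before touching /etc/hosts.

```shell
# Append "IP  NAME" to a hosts file unless a line already maps that IP
# to that name, so running it twice does not create duplicates.
add_host_entry() {
    hosts_file="$1"
    ip="$2"
    name="$3"
    # awk exits 0 when a line starts with the IP and lists the name after it.
    if ! awk -v ip="$ip" -v name="$name" '
            $1 == ip { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
            END { exit !found }' "$hosts_file"; then
        printf '%s  %s\n' "$ip" "$name" >> "$hosts_file"
    fi
}

# Example (as root, on every node):
#   add_host_entry /etc/hosts 192.168.0.118 centos-01
#   add_host_entry /etc/hosts 192.168.0.119 centos-02
#   add_host_entry /etc/hosts 192.168.0.120 centos-03
```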
II. Cluster setup

1. The cluster layout is as follows:

Host         IP               Roles
centos-01    192.168.0.118    deploy, mon*1, osd*3
centos-02    192.168.0.119    mon*1, osd*3
centos-03    192.168.0.120    mon*1, osd*3

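2. Clean up the environment

If a previous deployment failed, there is no need to uninstall the Ceph client or rebuild the virtual machines. Running the following commands on every node restores the environment to its state right after the Ceph client was installed. Cleaning up is strongly recommended before building on top of an old cluster; otherwise all sorts of errors can occur.
[root@centos-01 cluster]# ps aux|grep '[c]eph' |awk '{print $2}'|xargs kill -9 
[root@centos-01 cluster]# ps aux|grep ceph    # make sure every process has exited 
ps -ef|grep ceph 
# Make sure all Ceph processes are stopped at this point! If any remain, repeat the commands above. 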
umount /var/lib/ceph/osd/* 
rm -rf /var/lib/ceph/osd/* 
rm -rf /var/lib/ceph/mon/* 
rm -rf /var/lib/ceph/mds/* 
rm -rf /var/lib/ceph/bootstrap-mds/* 
rm -rf /var/lib/ceph/bootstrap-osd/* 
rm -rf /var/lib/ceph/bootstrap-rgw/* 
rm -rf /var/lib/ceph/tmp/* 
rm -rf /etc/ceph/* 
rm -rf /var/run/ceph/*

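If you run into trouble at any point and want to start over, run the following commands to purge the Ceph packages and wipe all data and configuration (replace node1 node2 with your own host names):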

ceph-deploy purge node1 node2

ceph-deploy purgedata node1 node2

ceph-deploy forgetkeys && rm ceph.*

 


4. Add the Ceph repository
vim /etc/yum.repos.d/ceph.repo 
Add the following content:
[ceph] 
name=ceph 
baseurl=http://mirrors.163.com/ceph/rpm-jewel/el7/x86_64/
gpgcheck=0 
[ceph-noarch] 
name=cephnoarch 
baseurl=http://mirrors.163.com/ceph/rpm-jewel/el7/noarch/
gpgcheck=0
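
The same file can be written non-interactively with a here-document, which is convenient when repeating the step on all three nodes. A minimal sketch, with write_ceph_repo as an illustrative name and the target path given as an argument:

```shell
# Write the Ceph Jewel repository definition shown above to the given path.
write_ceph_repo() {
    repo_file="$1"
    cat > "$repo_file" <<'EOF'
[ceph]
name=ceph
baseurl=http://mirrors.163.com/ceph/rpm-jewel/el7/x86_64/
gpgcheck=0
[ceph-noarch]
name=cephnoarch
baseurl=http://mirrors.163.com/ceph/rpm-jewel/el7/noarch/
gpgcheck=0
EOF
}

# Example (as root, on every node):
#   write_ceph_repo /etc/yum.repos.d/ceph.repo
```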

5. Install the Ceph client (on every node):

yum makecache 
yum install ceph ceph-radosgw -y 

6. Start the deployment

On the deploy node (centos-01), generate an SSH key pair and copy the public key to centos-02 and centos-03:

[root@centos-01 ~]# ssh-keygen

[root@centos-01 ~]# ssh-copy-id  root@centos-02

[root@centos-01 ~]# ssh-copy-id  root@centos-03

Install ceph-deploy on the deploy node (centos-01). From here on, "the deploy node" always refers to centos-01:
[root@centos-01 ~]# yum -y install ceph-deploy 
[root@centos-01 ~]# ceph-deploy --version 
1.5.39 
[root@centos-01 ~]# ceph -v 
ceph version 10.2.11 (e4b061b47f07f583c92a050d9e84b1813a35671e)

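7. On the deploy node, create a deployment directory and start the deployment:

[root@centos-01 ~]# cd 
[root@centos-01 ~]# mkdir cluster 
[root@centos-01 ~]# cd cluster/ 
[root@centos-01 cluster]# ceph-deploy new centos-01 centos-02 centos-03 

Note: the sample output in the rest of this guide was captured on a cluster whose nodes are named ceph-1, ceph-2, and ceph-3 on the 192.168.57.0/24 network, so the host names and addresses it shows differ from the centos-01 to centos-03 setup described here.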

[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf 

[ceph_deploy.cli][INFO  ] Invoked (1.5.34): /usr/bin/ceph-deploy new ceph-1 ceph-2 ceph-3 
[ceph_deploy.cli][INFO  ] ceph-deploy options: 
[ceph_deploy.cli][INFO  ]  username                      : None 
[ceph_deploy.cli][INFO  ]  func                          :  
[ceph_deploy.cli][INFO  ]  verbose                      : False 
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False 
[ceph_deploy.cli][INFO  ]  quiet                        : False 
[ceph_deploy.cli][INFO  ]  cd_conf                      :  
[ceph_deploy.cli][INFO  ]  cluster                      : ceph 
[ceph_deploy.cli][INFO  ]  ssh_copykey                  : True 
[ceph_deploy.cli][INFO  ]  mon                          : ['ceph-1', 'ceph-2', 'ceph-3'] 
.. 
.. 
[ceph_deploy.new][WARNIN] could not connect via SSH 
[ceph_deploy.new][INFO  ] will connect again with password prompt 
The authenticity of host 'ceph-2 (192.168.57.223)' can't be established. 
ECDSA key fingerprint is ef:e2:3e:38:fa:47:f4:61:b7:4d:d3:24:de:d4:7a:54. 
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ceph-2,192.168.57.223' (ECDSA) to the list of known hosts. 
root 
root@ceph-2's password:  
[ceph-2][DEBUG ] connected to host: ceph-2

..
..
[ceph_deploy.new][DEBUG ] Resolving host ceph-3
[ceph_deploy.new][DEBUG ] Monitor ceph-3 at 192.168.57.224
[ceph_deploy.new][DEBUG ] Monitor initial members are ['ceph-1', 'ceph-2', 'ceph-3']
[ceph_deploy.new][DEBUG ] Monitor addrs are ['192.168.57.222', '192.168.57.223', '192.168.57.224']
[ceph_deploy.new][DEBUG ] Creating a random mon key...
[ceph_deploy.new][DEBUG ] Writing monitor keyring to ceph.mon.keyring...
[ceph_deploy.new][DEBUG ] Writing initial config to ceph.conf...

At this point the directory contains:

[root@centos-01 cluster]# ls 
ceph.conf  ceph-deploy-ceph.log  ceph.mon.keyring

8. Add public_network to ceph.conf to match your own IP configuration, and slightly widen the clock drift allowed between monitors (the default is 0.05 s; here it is raised to 2 s):

[root@centos-01 cluster]# echo public_network=192.168.0.0/24 >> ceph.conf  
[root@centos-01 cluster]# echo mon_clock_drift_allowed = 2 >> ceph.conf  
[root@centos-01 cluster]# cat ceph.conf  
[global] 
fsid = 0248817a-b758-4d6b-a217-11248b098e10 
mon_initial_members = ceph-1, ceph-2, ceph-3 
mon_host = 192.168.57.222,192.168.57.223,192.168.57.224 
auth_cluster_required = cephx 
auth_service_required = cephx 
auth_client_required = cephx 
public_network=192.168.57.0/24
mon_clock_drift_allowed = 2
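
Running the echo commands a second time appends duplicate lines. A small helper that replaces an existing key or appends a new one avoids that. The sketch below is an illustration only: set_conf_option is a hypothetical name, and it assumes every option lives in the single [global] section that ceph-deploy new generates.

```shell
# Set KEY = VALUE in a ceph.conf-style file: replace the line if the key
# is already present, otherwise append it to the end of the file.
set_conf_option() {
    conf_file="$1"
    key="$2"
    value="$3"
    if grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$conf_file"; then
        # "|" is used as the sed delimiter because values such as
        # 192.168.0.0/24 contain a slash.
        sed -i "s|^[[:space:]]*${key}[[:space:]]*=.*|${key} = ${value}|" "$conf_file"
    else
        printf '%s = %s\n' "$key" "$value" >> "$conf_file"
    fi
}

# Example (in the deployment directory on the deploy node):
#   set_conf_option ceph.conf public_network 192.168.0.0/24
#   set_conf_option ceph.conf mon_clock_drift_allowed 2
```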

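9. Deploy the monitors:
[root@centos-01 cluster]# ceph-deploy mon create-initial 
.. 
.. (log output omitted)

If ceph-deploy mon create-initial fails with "[Errno 2] No such file or directory":

Fix: correct the host names (each node's host name should match the name passed to ceph-deploy new), clean up the environment as described in step 2, and try again.

Push the configuration file:
# ceph-deploy --overwrite-conf config push node1 node2 node3 

[root@centos-01 cluster]# ls 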
ceph.bootstrap-mds.keyring  ceph.bootstrap-rgw.keyring  ceph.conf            ceph.mon.keyring 
ceph.bootstrap-osd.keyring  ceph.client.admin.keyring  ceph-deploy-ceph.log

10. Check the cluster status:
[root@centos-01 cluster]# ceph -s 
    cluster 0248817a-b758-4d6b-a217-11248b098e10 
    health HEALTH_ERR 
            no osds 
            Monitor clock skew detected  
    monmap e1: 3 mons at {ceph-1=192.168.57.222:6789/0,ceph-2=192.168.57.223:6789/0,ceph-3=192.168.57.224:6789/0} 
            election epoch 6, quorum 0,1,2 ceph-1,ceph-2,ceph-3 
    osdmap e1: 0 osds: 0 up, 0 in
            flags sortbitwise 
      pgmap v2: 64 pgs, 1 pools, 0 bytes data, 0 objects 
            0 kB used, 0 kB / 0 kB avail 
                  64 creating

11. Deploy the OSDs:
ceph-deploy --overwrite-conf osd prepare centos-01:/dev/sdb centos-01:/dev/sdc centos-01:/dev/sdd centos-02:/dev/sdb centos-02:/dev/sdc centos-02:/dev/sdd centos-03:/dev/sdb centos-03:/dev/sdc centos-03:/dev/sdd  --zap-disk 
ceph-deploy --overwrite-conf osd activate centos-01:/dev/sdb1 centos-01:/dev/sdc1 centos-01:/dev/sdd1 centos-02:/dev/sdb1 centos-02:/dev/sdc1 centos-02:/dev/sdd1 centos-03:/dev/sdb1 centos-03:/dev/sdc1 centos-03:/dev/sdd1
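
The two long argument lists differ only in the partition suffix, so they can be generated instead of typed. A sketch, assuming the host names and the sdb, sdc, sdd device names used above (osd_targets is an illustrative helper, not a ceph-deploy feature):

```shell
# Print the nine host:device targets on one line.
# Pass "" for whole disks (osd prepare) or "1" for the first partition
# of each disk (osd activate).
osd_targets() {
    suffix="$1"
    targets=""
    for host in centos-01 centos-02 centos-03; do
        for disk in sdb sdc sdd; do
            targets="$targets ${host}:/dev/${disk}${suffix}"
        done
    done
    # Drop the leading space left over from the first append.
    echo "${targets# }"
}

# Example (in the deployment directory on the deploy node):
#   ceph-deploy --overwrite-conf osd prepare $(osd_targets "") --zap-disk
#   ceph-deploy --overwrite-conf osd activate $(osd_targets 1)
```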

The cluster status should now look like this:
[root@centos-01 cluster]# ceph -s 
    cluster 0248817a-b758-4d6b-a217-11248b098e10 
    health HEALTH_WARN 
            too few PGs per OSD (21 < min 30) 
    monmap e1: 3 mons at {ceph-1=192.168.57.222:6789/0,ceph-2=192.168.57.223:6789/0,ceph-3=192.168.57.224:6789/0} 
            election epoch 22, quorum 0,1,2 ceph-1,ceph-2,ceph-3 
    osdmap e45: 9 osds: 9 up, 9 in
            flags sortbitwise 
      pgmap v82: 64 pgs, 1 pools, 0 bytes data, 0 objects 
            273 MB used, 16335 GB / 16336 GB avail 
                  64 active+clean
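
The figure 21 in the warning is the number of PG replicas that land on each OSD. Assuming the rbd pool keeps three copies of each object (the Jewel default; the replica count is not stated in this guide), it can be reproduced with integer arithmetic. pgs_per_osd is an illustrative helper:

```shell
# PG replicas per OSD = pg_num * replica count / number of OSDs
# (integer division, which matches the rounded-down value in the warning).
pgs_per_osd() {
    pg_num="$1"
    replicas="$2"
    osds="$3"
    echo $(( pg_num * replicas / osds ))
}

pgs_per_osd 64 3 9     # prints 21, below the minimum of 30
pgs_per_osd 128 3 9    # prints 42
```

With 128 PGs the value rises to 42, which is above the threshold of 30 reported in the warning.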

12. To clear this warning, increase the number of PGs in the rbd pool:
[root@centos-01 cluster]# ceph osd pool set rbd pg_num 128 
set pool 0 pg_num to 128 
[root@centos-01 cluster]# ceph osd pool set rbd pgp_num 128 
set pool 0 pgp_num to 128 
[root@centos-01 cluster]# ceph -s 
    cluster 0248817a-b758-4d6b-a217-11248b098e10 
    health HEALTH_ERR 
            19 pgs are stuck inactive for more than 300 seconds 
            12 pgs peering 
            19 pgs stuck inactive 
    monmap e1: 3 mons at {ceph-1=192.168.57.222:6789/0,ceph-2=192.168.57.223:6789/0,ceph-3=192.168.57.224:6789/0} 
            election epoch 22, quorum 0,1,2 ceph-1,ceph-2,ceph-3 
    osdmap e49: 9 osds: 9 up, 9 in
            flags sortbitwise 
      pgmap v96: 128 pgs, 1 pools, 0 bytes data, 0 objects 
            308 MB used, 18377 GB / 18378 GB avail 
                103 active+clean 
                  12 peering 
                  9 creating 
                  4 activating

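The HEALTH_ERR state is temporary while the new PGs are created and peered. After a short wait, check again:
[root@centos-01 cluster]# ceph -s 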
    cluster 0248817a-b758-4d6b-a217-11248b098e10 
    health HEALTH_OK 
    monmap e1: 3 mons at {ceph-1=192.168.57.222:6789/0,ceph-2=192.168.57.223:6789/0,ceph-3=192.168.57.224:6789/0} 
            election epoch 22, quorum 0,1,2 ceph-1,ceph-2,ceph-3 
    osdmap e49: 9 osds: 9 up, 9 in
            flags sortbitwise 
      pgmap v99: 128 pgs, 1 pools, 0 bytes data, 0 objects 
            310 MB used, 18377 GB / 18378 GB avail 
                128 active+clean
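
Instead of rerunning ceph -s by hand until the new PGs settle, a polling loop can wait for the cluster to report HEALTH_OK. A sketch, where wait_for_health_ok and its two parameters are illustrative; it relies only on ceph health printing a line that starts with HEALTH_OK once the cluster is healthy:

```shell
# Poll "ceph health" up to MAX_TRIES times, sleeping DELAY seconds between
# attempts. Returns 0 as soon as the cluster reports HEALTH_OK, 1 otherwise.
wait_for_health_ok() {
    max_tries="$1"
    delay="$2"
    try=0
    while [ "$try" -lt "$max_tries" ]; do
        if ceph health | grep -q '^HEALTH_OK'; then
            return 0
        fi
        try=$((try + 1))
        sleep "$delay"
    done
    return 1
}

# Example: check every 5 seconds for up to 5 minutes.
#   wait_for_health_ok 60 5 && echo "cluster is healthy"
```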

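The cluster deployment is now complete.

13. Pushing the configuration

Do not edit /etc/ceph/ceph.conf directly on an individual node. Instead, edit the copy in the deployment directory on the deploy node (here centos-01:/root/cluster/ceph.conf). Once a cluster grows to dozens of nodes, editing each one by hand is impractical; pushing the file is faster and safer.
After editing, run the following command to push the conf file to every node:

[root@centos-01 cluster]# ceph-deploy --overwrite-conf config push centos-01 centos-02 centos-03

The monitor service on each node must then be restarted; see the next section.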

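14. Starting and stopping the mon and OSD services

# centos-01 is the host name of the node running the monitor; substitute the name of each monitor node. 
systemctl start ceph-mon@centos-01.service  
systemctl restart ceph-mon@centos-01.service 
systemctl stop ceph-mon@centos-01.service 


# 0 is the id of an OSD on this node; it can be looked up with `ceph osd tree` 
systemctl start ceph-osd@0.service 
systemctl stop ceph-osd@0.service 
systemctl restart ceph-osd@0.service 
[root@centos-01 cluster]# ceph osd tree 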
ID WEIGHT  TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY  
-1 17.94685 root default                                      
-2  5.98228    host ceph-1                                    
 0  1.99409        osd.0        up  1.00000          1.00000  
 1  1.99409        osd.1        up  1.00000          1.00000  
 8  1.99409        osd.2        up  1.00000          1.00000  
-3  5.98228    host ceph-2                                    
 2  1.99409        osd.3        up  1.00000          1.00000  
 3  1.99409        osd.4        up  1.00000          1.00000  
 4  1.99409        osd.5        up  1.00000          1.00000  
-4  5.98228    host ceph-3                                    
 5  1.99409        osd.6        up  1.00000          1.00000  
 6  1.99409        osd.7        up  1.00000          1.00000  
 7  1.99409        osd.8        up  1.00000          1.00000
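
The unit names in this section follow the systemd template pattern ceph-mon@<host name> and ceph-osd@<OSD id>. As an illustration, the hypothetical helpers below build those names and restart every monitor from the deploy node, assuming the passwordless SSH access for root that was set up in step 6:

```shell
# Build the systemd unit name for the monitor on a given host.
mon_unit() {
    echo "ceph-mon@$1.service"
}

# Build the systemd unit name for the OSD with a given id.
osd_unit() {
    echo "ceph-osd@$1.service"
}

# Restart the monitor on each host passed as an argument, one at a time.
restart_all_mons() {
    for host in "$@"; do
        ssh "root@${host}" systemctl restart "$(mon_unit "$host")"
    done
}

# Example (on the deploy node, after pushing a new ceph.conf):
#   restart_all_mons centos-01 centos-02 centos-03
```

Restarting the monitors one after another, rather than all at once, lets the remaining two keep quorum while the third comes back.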

 

 

Reposted from: https://www.cnblogs.com/jimmyyang/p/10791390.html
