yum clean all
rm -rf /etc/yum.repos.d/*.repo
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
wget -O /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo
sed -i '/aliyuncs/d' /etc/yum.repos.d/CentOS-Base.repo
sed -i '/aliyuncs/d' /etc/yum.repos.d/epel.repo
sed -i 's/$releasever/7.3.1611/g' /etc/yum.repos.d/CentOS-Base.repo
vim /etc/yum.repos.d/ceph.repo
[ceph]
name=ceph
baseurl=http://mirrors.aliyun.com/ceph/rpm-jewel/el7/x86_64/
gpgcheck=0
[ceph-noarch]
name=cephnoarch
baseurl=http://mirrors.aliyun.com/ceph/rpm-jewel/el7/noarch/
gpgcheck=0
[ceph-source]
name=cephsource
baseurl=http://mirrors.aliyun.com/ceph/rpm-jewel/el7/x86_64/
gpgcheck=0
[ceph-radosgw]
name=cephradosgw
baseurl=http://mirrors.aliyun.com/ceph/rpm-jewel/el7/x86_64/
gpgcheck=0
yum makecache
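If the mirrors above are reachable, a quick sanity check that the repos are usable (ceph-deploy should show up from the ceph-noarch repo):
yum repolist enabled
yum info ceph-deploy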
Due to resource constraints, all machines in this article are virtual machines: three in total, one acting as the monitor node (master) and two as storage nodes (slave1, slave2).
Ceph requires an odd number of monitor nodes, normally at least three (one is fine if you are just experimenting). A dedicated ceph-adm node is optional; you can run ceph-adm on the monitor, it just makes the architecture a little clearer to keep it separate. You can also run the mon on an OSD node, although this is not recommended in production.
All Ceph cluster nodes run CentOS-7.3.1611, and all filesystems use xfs as officially recommended by Ceph.
After installing CentOS we need to do some basic configuration on every node (including master), such as disabling SELinux, disabling the firewall, and synchronizing time.
Disable SELinux
# sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
# setenforce 0
Disable the firewall
# systemctl stop firewalld
# systemctl disable firewalld
Synchronize time
# yum -y install ntp
# ntpdate asia.pool.ntp.org
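ntpdate only sets the clock once; to keep time in sync across reboots you will likely also want to enable the ntpd service that ships with the ntp package, e.g.:
# systemctl enable ntpd
# systemctl start ntpd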
Edit /etc/hosts
# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.63.128 master
192.168.63.130 slave1
192.168.63.131 slave2
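ceph-deploy also expects passwordless SSH from master to every node (see the Troubleshooting note at the end). A minimal sketch, assuming root is used on all nodes:
# ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
# ssh-copy-id root@slave1
# ssh-copy-id root@slave2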
On each OSD server, partition one data disk and create an XFS filesystem on it. On the disk used for the journal, create one partition per data disk; the journal partitions do not need a filesystem, Ceph will handle them itself.
# parted -a optimal --script /dev/sdb -- mktable gpt
# parted -a optimal --script /dev/sdb -- mkpart primary xfs 0% 100%
# mkfs.xfs -f /dev/sdb1
meta-data=/dev/sdb1              isize=256    agcount=4, agsize=1310592 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=5242368, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
# parted -a optimal --script /dev/sdc -- mktable gpt
# parted -a optimal --script /dev/sdc -- mkpart primary xfs 0% 100%
# mkfs.xfs -f /dev/sdc1
meta-data=/dev/sdc1              isize=256    agcount=4, agsize=1310592 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=5242368, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
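Optionally double-check the resulting layout before handing the disks over to Ceph:
# lsblk -f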
In production, each OSD server has far more than two disks, so running the commands above by hand for every disk is too repetitive, and more servers will be added later. It is easier to put it all into a script, parted.sh, where /dev/sdc|d|e|f are four data disks and /dev/sdb is the journal disk:
#!/bin/bash

set -e

if [ ! -x "/sbin/parted" ]; then
    echo "This script requires /sbin/parted to run!" >&2
    exit 1
fi

DISKS="c d e f"
for i in ${DISKS}; do
    echo "Creating partitions on /dev/sd${i} ..."
    parted -a optimal --script /dev/sd${i} -- mktable gpt
    parted -a optimal --script /dev/sd${i} -- mkpart primary xfs 0% 100%
    sleep 1
    #echo "Formatting /dev/sd${i}1 ..."
    mkfs.xfs -f /dev/sd${i}1 &
done

JOURNALDISK="b"
for i in ${JOURNALDISK}; do
    parted -s /dev/sd${i} mklabel gpt
    parted -s /dev/sd${i} mkpart primary 0% 25%
    parted -s /dev/sd${i} mkpart primary 26% 50%
    parted -s /dev/sd${i} mkpart primary 51% 75%
    parted -s /dev/sd${i} mkpart primary 76% 100%
done
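To use it on an OSD node (assuming it was saved as parted.sh), make it executable and run it:
# chmod +x parted.sh
# ./parted.sh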
Instead of installing Ceph by hand on every node, it is much more convenient to install it everywhere with the ceph-deploy tool. Log in to master:
# yum install ceph-deploy -y
Create a ceph working directory; all subsequent operations are run from this directory:
[root@master ~]# mkdir ~/ceph-cluster
[root@master ~]# cd ceph-cluster/
Initialize the cluster and tell ceph-deploy which nodes are monitor nodes. After the command succeeds, ceph.conf, ceph.log, ceph.mon.keyring and related files are generated in the ceph-cluster directory:
[root@master ceph-cluster]# ceph-deploy new master
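The generated ceph.conf should look roughly like the following (the fsid is a randomly generated UUID and will differ; the monitor address comes from /etc/hosts):
[global]
fsid = ...
mon_initial_members = master
mon_host = 192.168.63.128
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx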
Install Ceph on every node:
[root@master ceph-cluster]# ceph-deploy install master slave1 slave2
An error similar to the following may appear here:
[master][ERROR ] File "/usr/lib64/python2.7/subprocess.py", line 1327, in _execute_child
[master][ERROR ] raise child_exception
[master][ERROR ] OSError: [Errno 2] No such file or directory
[master][ERROR ]
[master][ERROR ]
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: ceph --version
The fix is to run the following command on the node that reported the error:
[root@master ceph-cluster]# yum install *argparse* -y
Initialize the monitor node:
[root@master ceph-cluster]# ceph-deploy mon create-initial
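Once this succeeds, the keyrings that ceph-deploy needs for the next steps (ceph.client.admin.keyring, ceph.bootstrap-osd.keyring, and so on) should appear in the working directory:
[root@master ceph-cluster]# ls *.keyring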
Check the disks on the Ceph storage nodes:
[root@master ceph-cluster]# ceph-deploy disk list slave1
[root@master ceph-cluster]# ceph-deploy disk list slave2
Zap (initialize) the Ceph disks, then create the OSDs. The argument format is node:data-disk:journal-partition, with one journal partition per data disk:
Create the slave1 OSD:
[root@master ceph-cluster]# ceph-deploy disk zap slave1:sdb
[root@master ceph-cluster]# ceph-deploy osd create slave1:sdb:/dev/sdc1
Create the slave2 OSD:
[root@master ceph-cluster]# ceph-deploy disk zap slave2:sdb
[root@master ceph-cluster]# ceph-deploy osd create slave2:sdb:/dev/sdc1
Finally, push the generated configuration from master out to the other nodes so that every node has an identical ceph configuration:
[root@master ceph-cluster]# ceph-deploy --overwrite-conf admin master slave1 slave2
Testing
Check whether the setup succeeded. If ceph health does not yet report HEALTH_OK with the default pool settings, adjust the rbd pool's replication and PG counts:
[root@master ceph-cluster]# ceph health
[root@master ceph-cluster]# ceph osd pool set rbd size 2
set pool 0 size to 2
[root@master ceph-cluster]# ceph osd pool set rbd min_size 2
set pool 0 min_size to 2
[root@master ceph-cluster]# ceph osd pool set rbd pg_num 256
set pool 0 pg_num to 256
[root@master ceph-cluster]# ceph osd pool set rbd pgp_num 256
set pool 0 pgp_num to 256
[root@master ceph-cluster]# ceph health
HEALTH_OK
If setting a pool option fails, you can simply delete the pool and create it again.
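With the jewel tooling, one way to do that is roughly the following (the pool name must be given twice for the delete, and newer Ceph releases may additionally require mon_allow_pool_delete to be enabled):
[root@master ceph-cluster]# ceph osd pool delete rbd rbd --yes-i-really-really-mean-it
[root@master ceph-cluster]# ceph osd pool create rbd 256 256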
For more detail:
[root@master ceph-cluster]# ceph -s
cluster 38a7726b-6018-41f4-83c2-911b325116df
health HEALTH_OK
monmap e1: 1 mons at {ceph-mon=192.168.128.131:6789/0}
election epoch 2, quorum 0 ceph-mon
osdmap e46: 8 osds: 8 up, 8 in
pgmap v72: 256 pgs, 1 pools, 0 bytes data, 0 objects
276 MB used, 159 GB / 159 GB avail
256 active+clean
If everything works, remember to put these settings into the ceph.conf file and push it out to all the deployed nodes:
[root@master ceph-cluster]# echo "osd pool default size = 2" >> ~/ceph-cluster/ceph.conf
[root@master ceph-cluster]# echo "osd pool default min size = 2" >> ~/ceph-cluster/ceph.conf
[root@master ceph-cluster]# echo "osd pool default pg num = 256" >> ~/ceph-cluster/ceph.conf
[root@master ceph-cluster]# echo "osd pool default pgp num = 256" >> ~/ceph-cluster/ceph.conf
[root@master ceph-cluster]# ceph-deploy --overwrite-conf admin master slave1 slave2
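If you only want to distribute ceph.conf without re-pushing the admin keyring, ceph-deploy also has a config push subcommand:
[root@master ceph-cluster]# ceph-deploy --overwrite-conf config push master slave1 slave2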
If you need to start over
If anything strange goes wrong during deployment and you cannot resolve it, you can simply wipe everything and start again from scratch:
[root@master ceph-cluster]# ceph-deploy purge master slave1 slave2
[root@master ceph-cluster]# ceph-deploy purgedata master slave1 slave2
[root@master ceph-cluster]# ceph-deploy forgetkeys
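The purge commands do not touch the files in the local working directory; for a truly clean slate, the generated files there (ceph.conf, logs, keyrings) can be removed as well:
[root@master ceph-cluster]# rm -f ceph.*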
Troubleshooting
If you run into any network problems, first make sure the nodes can ssh to each other without a password, and that the firewall on every node is either disabled or has the necessary rules added.
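If you would rather keep firewalld running than disable it, the ports Ceph normally uses are 6789/tcp for the monitor and 6800-7300/tcp for the OSD daemons; a sketch of opening them with firewalld:
# firewall-cmd --zone=public --add-port=6789/tcp --permanent
# firewall-cmd --zone=public --add-port=6800-7300/tcp --permanent
# firewall-cmd --reload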