ceph cluster for k8s

Background

Since k8s needs storage that can be shared by every node, we decided to set up a Ceph cluster and expose it to k8s via RBD.

Operating system preparation

CentOS 7.2
centos-base.repo and epel.repo configured
NTP time sync configured (chrony.conf)
Internal-network servers, so SELinux and the firewall are disabled for now
Optional: DNS; if skipped, use /etc/hosts instead (commands sketched below)
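A minimal sketch of these preparation steps, run on every host; the chronyd/firewalld usage and the 10.9.5.x addresses are assumptions, substitute your own NTP setup and IPs:

# time sync
sudo yum install -y chrony
sudo systemctl enable chronyd && sudo systemctl start chronyd

# disable SELinux and the firewall (acceptable here because the servers are internal only)
sudo setenforce 0
sudo sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
sudo systemctl stop firewalld && sudo systemctl disable firewalld

# /etc/hosts entries instead of DNS (IPs are placeholders)
cat <<EOF | sudo tee -a /etc/hosts
10.9.5.11 cloud4ourself-c1
10.9.5.12 cloud4ourself-c2
10.9.5.13 cloud4ourself-c3
EOF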

Cluster plan

Hostname             Role
cloud4ourself-c1     mon
cloud4ourself-c2     osd
cloud4ourself-c3     osd

The names here must match the output of hostname -s.
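If a node's short hostname does not match, one way to fix it (the value below is just an example):

sudo hostnamectl set-hostname cloud4ourself-c1
hostname -s   # should now print cloud4ourself-c1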

Installation steps

1. Create the ceph user (on all hosts)

useradd ceph
echo 'ceph ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers
sed -i 's/Defaults requiretty/#Defaults requiretty/' /etc/sudoers
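Note that useradd creates the ceph user without a password, so switch to it with sudo su - ceph (or set a password first) before generating SSH keys in the next step. Appending to /etc/sudoers directly works, but a drop-in file under /etc/sudoers.d is a slightly safer equivalent; a sketch (the file name is arbitrary):

echo 'ceph ALL=(ALL) NOPASSWD:ALL' | sudo tee /etc/sudoers.d/ceph
sudo chmod 0440 /etc/sudoers.d/ceph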

2. Configure passwordless SSH between all hosts (as the ceph user)

ssh-keygen
cat ~/.ssh/id_rsa.pub # collect this line from each of the three servers into one text file
touch ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
vim ~/.ssh/authorized_keys # paste the three public-key lines collected above into this file
ssh cloud4ourself-c1
ssh cloud4ourself-c2
ssh cloud4ourself-c3 # all three should log in without a password prompt
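An equivalent shortcut, assuming password login is temporarily available for the ceph user, is ssh-copy-id; a ~/.ssh/config entry also keeps ssh (and ceph-deploy) defaulting to the ceph account:

ssh-copy-id ceph@cloud4ourself-c2
ssh-copy-id ceph@cloud4ourself-c3

cat <<EOF >> ~/.ssh/config
Host cloud4ourself-c1 cloud4ourself-c2 cloud4ourself-c3
    User ceph
EOF
chmod 600 ~/.ssh/config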

3. Configure the cluster (on cloud4ourself-c1)

mkdir k8s && cd k8s
sudo yum install http://download.ceph.com/rpm-hammer/el7/noarch/ceph-release-1-1.el7.noarch.rpm -y
sudo yum install ceph-deploy -y

[ceph@cloud4ourself-c1 ~]$ ceph-deploy new cloud4ourself-c1

[ceph@cloud4ourself-c1 k8s]$ ls
ceph.conf  ceph.log  ceph.mon.keyring

echo "osd pool default size = 2" >> ceph.conf

4. Install Ceph packages

ceph-deploy install cloud4ourself-c1 cloud4ourself-c2 cloud4ourself-c3
The command above installs the latest Ceph release; to install a specific version from a specific mirror instead:
ceph-deploy install cloud4ourself-c1 --repo-url=http://mirrors.aliyun.com/ceph/rpm-hammer/el7/
If you hit an error like the following:

[c2][WARNIN] Error: Package: 1:ceph-selinux-10.2.6-0.el7.x86_64 (Ceph)
[c2][WARNIN]            Requires: selinux-policy-base >= 3.13.1-102.el7_3.13
[c2][WARNIN]            Installed: selinux-policy-targeted-3.13.1-60.el7.noarch (@anaconda)
[c2][WARNIN]                selinux-policy-base = 3.13.1-60.el7
[c2][WARNIN]            Available: selinux-policy-minimum-3.13.1-102.el7.noarch (base)
[c2][WARNIN]                selinux-policy-base = 3.13.1-102.el7
[c2][WARNIN]            Available: selinux-policy-minimum-3.13.1-102.el7_3.4.noarch (updates)
[c2][WARNIN]                selinux-policy-base = 3.13.1-102.el7_3.4
[c2][WARNIN]            Available: selinux-policy-minimum-3.13.1-102.el7_3.7.noarch (updates)
[c2][WARNIN]                selinux-policy-base = 3.13.1-102.el7_3.7
[c2][WARNIN]            Available: selinux-policy-mls-3.13.1-102.el7.noarch (base)
[c2][WARNIN]                selinux-policy-base = 3.13.1-102.el7
[c2][WARNIN]            Available: selinux-policy-mls-3.13.1-102.el7_3.4.noarch (updates)
[c2][WARNIN]                selinux-policy-base = 3.13.1-102.el7_3.4
[c2][WARNIN]            Available: selinux-policy-mls-3.13.1-102.el7_3.7.noarch (updates)
[c2][WARNIN]                selinux-policy-base = 3.13.1-102.el7_3.7
[c2][WARNIN]            Available: selinux-policy-targeted-3.13.1-102.el7.noarch (base)
[c2][WARNIN]                selinux-policy-base = 3.13.1-102.el7
[c2][WARNIN]            Available: selinux-policy-targeted-3.13.1-102.el7_3.4.noarch (updates)
[c2][WARNIN]                selinux-policy-base = 3.13.1-102.el7_3.4
[c2][WARNIN]            Available: selinux-policy-targeted-3.13.1-102.el7_3.7.noarch (updates)
[c2][WARNIN]                selinux-policy-base = 3.13.1-102.el7_3.7
[c2][DEBUG ]  You could try running: rpm -Va --nofiles --nodigest
[c2][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: yum -y install ceph ceph-radosgw

This usually means the node's base/updates repos are stale, so the installed selinux-policy-base is older than the version ceph-selinux requires.
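One fix that often works, assuming the node can reach the CentOS updates repo, is to refresh the selinux-policy packages on the failing host and rerun the install:

sudo yum clean all
sudo yum update -y selinux-policy selinux-policy-targeted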

If you hit an error like the following:

[c1][DEBUG ] Retrieving https://download.ceph.com/rpm-jewel/el7/noarch/ceph-release-1-0.el7.noarch.rpm
[c1][WARNIN]    file /etc/yum.repos.d/ceph.repo from install of ceph-release-1-1.el7.noarch conflicts with file from package ceph-release-1-1.el7.noarch
[c1][DEBUG ] Preparing...                          ########################################
[c1][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: rpm -Uvh --replacepkgs https://download.ceph.com/rpm-jewel/el7/noarch/ceph-release-1-0.el7.noarch.rpm

Remove the existing ceph-release package and retry:

sudo rpm -e ceph-release

5. Initialize the monitor

[ceph@cloud4ourself-c1 k8s]$ ceph-deploy mon create-initial
[ceph@cloud4ourself-c1 k8s]$ ls 
ceph.bootstrap-mds.keyring  ceph.bootstrap-osd.keyring  ceph.bootstrap-rgw.keyring  ceph.client.admin.keyring  ceph.conf  ceph-deploy-ceph.log  ceph.mon.keyring

6. Add OSDs

On cloud4ourself-c2 and cloud4ourself-c3, create a directory to hold the OSD data:

sudo mkdir -p /var/local/cephfs
sudo chown ceph:ceph  /var/local/cephfs

On cloud4ourself-c1 (since these are virtual machines, a directory on the existing XFS filesystem is used in place of a dedicated disk):

ceph-deploy osd prepare  cloud4ourself-c2:/var/local/cephfs cloud4ourself-c3:/var/local/cephfs
ceph-deploy osd activate cloud4ourself-c2:/var/local/cephfs cloud4ourself-c3:/var/local/cephfs
sudo chmod +r /etc/ceph/ceph.client.admin.keyring
ceph -s
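With the cluster reporting healthy, a pool and a cephx user can be prepared for k8s RBD volumes. A sketch; the pool name kube, the pg_num of 64, and the client.kube capabilities are my own choices, not part of the original setup:

# pool for k8s volumes
ceph osd pool create kube 64

# dedicated client for kubelet / the provisioner
ceph auth get-or-create client.kube mon 'allow r' osd 'allow rwx pool=kube'

# quick smoke test
rbd create kube/test-image --size 1024
rbd info kube/test-image
rbd rm kube/test-image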

Follow-up

Add two more monitors.
Edit ceph.conf and add the public network:

public_network=10.9.5.0/24

ceph-deploy --overwrite-conf mon add cloud4ourself-c2
ceph-deploy --overwrite-conf mon add cloud4ourself-c3
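To confirm the new monitors have joined the quorum:

ceph mon stat
ceph quorum_status --format json-pretty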

While testing with k8s, we found that after shutting down a node the RBD image used by its pod remained locked:

Mar  8 17:53:18 cloud4ourself-mytest2 kubelet: E0308 17:53:18.391278    2162 nestedpendingoperations.go:262] Operation for "\"kubernetes.io/rbd/a539e906-03d7-11e7-9826-fa163eec323b-pvc-a55f4a04-0317-11e7-9826-fa163eec323b\" (\"a539e906-03d7-11e7-9826-fa163eec323b\")" failed. No retries permitted until 2017-03-08 17:55:18.391251562 +0800 CST (durationBeforeRetry 2m0s). Error: MountVolume.SetUp failed for volume "kubernetes.io/rbd/a539e906-03d7-11e7-9826-fa163eec323b-pvc-a55f4a04-0317-11e7-9826-fa163eec323b" (spec.Name: "pvc-a55f4a04-0317-11e7-9826-fa163eec323b") pod "a539e906-03d7-11e7-9826-fa163eec323b" (UID: "a539e906-03d7-11e7-9826-fa163eec323b") with: rbd: image kubernetes-dynamic-pvc-52f919af-0321-11e7-b778-fa163eec323b is locked by other nodes

The fix is to remove the lock manually with rbd lock remove.
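A sketch of the manual unlock; the pool name kube is an assumption, and the lock id and locker are read from the list output rather than invented here:

# list locks on the image to get <lock-id> and <locker>
rbd lock list kube/kubernetes-dynamic-pvc-52f919af-0321-11e7-b778-fa163eec323b

# remove the stale lock
rbd lock remove kube/kubernetes-dynamic-pvc-52f919af-0321-11e7-b778-fa163eec323b <lock-id> <locker>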

References:
http://docs.ceph.org.cn/start/
http://www.cnblogs.com/clouding/p/6115447.html
http://tonybai.com/2017/02/17/temp-fix-for-pod-unable-mount-cephrbd-volume/
http://tonybai.com/2016/11/07/integrate-kubernetes-with-ceph-rbd/
