I recently set up two open-source distributed file systems for distributed storage on Linux: GlusterFS and Ceph.
Preface: You can search GitHub, read the source, or go through the official documentation to learn more about Ceph; I will not cover the underlying principles or the abstract, conceptual layers here. Instead, I will describe the deployment process in detail, along with the problems I ran into and how I solved them. If you work in this area and have mature production solutions or suggestions, please share them here so we can learn from each other; anyone interested is also welcome to add me on WeChat (please note your reason and where you found this) to keep in touch.
Now for the Ceph walkthrough itself. Problems are resolved as they come up, and note in particular that unless a step says otherwise, it is performed on the same admin-node.
1. First, check whether the machine can reach the external network. Most of these machines are virtual machines and need an HTTP proxy. The proxy must be configured in /etc/environment, otherwise it will not take effect; configuring a proxy in yum.conf also works. I chose /etc/environment.
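A minimal sketch of the proxy settings, assuming a hypothetical proxy at proxy.example.com:8080 (replace with whatever your environment actually uses):

# /etc/environment -- applies to login sessions; log out and back in for it to take effect
http_proxy=http://proxy.example.com:8080
https_proxy=http://proxy.example.com:8080
no_proxy=localhost,127.0.0.1

# alternatively, in /etc/yum.conf under [main]:
#   proxy=http://proxy.example.com:8080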
2. Check /etc/hosts and the hostname
[root@vm-10-112-178-135 gadmin]# vi /etc/hosts
[root@vm-10-112-178-135 gadmin]# vi /etc/hostname
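Every node should be able to resolve every other node by name. A sketch of the /etc/hosts entries for this cluster, using the IPs and hostnames that appear in the transcripts below:

10.112.178.135  vm-10-112-178-135
10.112.178.141  vm-10-112-178-141
10.112.178.142  vm-10-112-178-142
10.112.178.143  vm-10-112-178-143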
3. Install EPEL
[root@vm-10-112-178-135 gadmin]# yum install -y epel-release
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
Resolving Dependencies
--> Running transaction check
---> Package epel-release.noarch 0:7-9 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
================================================================================
 Package             Arch          Version        Repository      Size
================================================================================
Installing:
 epel-release        noarch        7-9            epel            14 k
Transaction Summary
================================================================================
Install  1 Package
Total download size: 14 k
Installed size: 24 k
Downloading packages:
warning: /var/cache/yum/x86_64/7/epel/packages/epel-release-7-9.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID 352c64e5: NOKEY
Public key for epel-release-7-9.noarch.rpm is not installed
epel-release-7-9.noarch.rpm                              |  14 kB  00:00:00
Retrieving key from http://file.idc.pub/os/epel/RPM-GPG-KEY-EPEL-7
Importing GPG key 0x352C64E5:
 Userid      : "Fedora EPEL (7)"
 Fingerprint : 91e9 7d7c 4a5e 96f1 7f3e 888f 6a2f aea2 352c 64e5
 From        : http://file.idc.pub/os/epel/RPM-GPG-KEY-EPEL-7
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
 Installing : epel-release-7-9.noarch                                      1/1
 Verifying  : epel-release-7-9.noarch                                      1/1
Installed:
 epel-release.noarch 0:7-9
Complete!
[root@vm-10-112-178-135 gadmin]# yum info epel-release
Loaded plugins: fastestmirror
Repository epel-testing is listed more than once in the configuration
Repository epel-testing-debuginfo is listed more than once in the configuration
Repository epel-testing-source is listed more than once in the configuration
Repository epel is listed more than once in the configuration
Repository epel-debuginfo is listed more than once in the configuration
Repository epel-source is listed more than once in the configuration
Loading mirror speeds from cached hostfile
Installed Packages
Name        : epel-release
Arch        : noarch
Version     : 7
Release     : 9
Size        : 24 k
Repo        : installed
From repo   : epel
Summary     : Extra Packages for Enterprise Linux repository configuration
URL         : http://download.fedoraproject.org/pub/epel
License     : GPLv2
Description : This package contains the Extra Packages for Enterprise Linux (EPEL) repository
            : GPG key as well as configuration for yum.
4. Manually add a ceph.repo yum repository file; the Ceph packages are downloaded from the repositories defined here.
[root@vm-10-112-178-135 gadmin]# vi /etc/yum.repos.d/ceph.repo
Copy the following content into the file:
[ceph]
name=Ceph packages for $basearch
baseurl=http://download.ceph.com/rpm-jewel/el7/$basearch
enabled=1
priority=1
gpgcheck=1
type=rpm-md
gpgkey=http://download.ceph.com/keys/release.asc
[ceph-noarch]
name=Ceph noarch packages
baseurl=http://download.ceph.com/rpm-jewel/el7/noarch
enabled=1
priority=1
gpgcheck=1
type=rpm-md
gpgkey=http://download.ceph.com/keys/release.asc
[ceph-x86_64]
name=Ceph x86_64 packages
baseurl=http://download.ceph.com/rpm-jewel/el7/x86_64
enabled=0
priority=1
gpgcheck=1
type=rpm-md
gpgkey=http://download.ceph.com/keys/release.asc
[ceph-aarch64]
name=Ceph aarch64 packages
baseurl=http://download.ceph.com/rpm-jewel/el7/aarch64
enabled=0
priority=1
gpgcheck=1
type=rpm-md
gpgkey=http://download.ceph.com/keys/release.asc
[ceph-source]
name=Ceph source packages
baseurl=http://download.ceph.com/rpm-jewel/el7/SRPMS
enabled=0
priority=1
gpgcheck=1
type=rpm-md
gpgkey=http://download.ceph.com/keys/release.asc
[apache2-ceph-noarch]
name=Apache noarch packages for Ceph
baseurl=http://gitbuilder.ceph.com/ceph-rpm-centos7-x86_64-basic/ref/master/SRPMS
enabled=1
priority=2
gpgcheck=1
type=rpm-md
gpgkey=http://download.ceph.com/keys/autobuild.asc
[apache2-ceph-source]
name=Apache source packages for Ceph
baseurl=http://gitbuilder.ceph.com/ceph-rpm-centos7-x86_64-basic/ref/master/SRPMS
enabled=0
priority=2
gpgcheck=1
type=rpm-md
gpgkey=http://download.ceph.com/keys/autobuild.asc
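With the repo file in place, you can (optionally; this step is not in the original transcript) refresh the yum metadata so the new repositories are picked up right away:

[root@vm-10-112-178-135 gadmin]# yum clean all && yum makecache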
5. Install the ceph-deploy tool
ceph-deploy is the official Ceph cluster deployment tool, written in Python.
[root@vm-10-112-178-135 gadmin]# sudo yum update && sudo yum install ceph-deploy
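As a quick sanity check (not shown in the original transcript), you can ask the tool for its version; any jewel-era 1.5.x release is fine:

[root@vm-10-112-178-135 gadmin]# ceph-deploy --version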
6. Check whether the NTP service is installed. The nodes constantly talk to each other, so their clocks must stay in sync; NTP keeps the time consistent across machines. A sketch of the setup follows below.
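A minimal sketch of installing and enabling NTP on CentOS 7 (run on every node); which upstream time servers to use depends on your environment:

sudo yum install -y ntp ntpdate
sudo systemctl enable ntpd
sudo systemctl start ntpd
# verify that the daemon is synchronizing with its peers
ntpq -p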
7. Install openssh-server and set up passwordless SSH. Note that this is done on all nodes.
Install openssh-server on all of the nodes:
[root@vm-10-112-178-135 gadmin]# sudo yum install openssh-server
[root@vm-10-112-178-135 gadmin]#
Create the deployment user:
[root@vm-10-112-178-135 gadmin]# sudo useradd -d /home/cephadmin -m cephadmin
[root@vm-10-112-178-135 gadmin]# passwd cephadmin
Give the user passwordless sudo (root) privileges:
[root@vm-10-112-178-135 gadmin]# echo "cephadmin ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/cephadmin
cephadmin ALL = (root) NOPASSWD:ALL
[root@vm-10-112-178-135 gadmin]# sudo chmod 0440 /etc/sudoers.d/cephadmin
[cephadmin@vm-10-112-178-135 ~]$ ssh-keygen
[cephadmin@vm-10-112-178-135 .ssh]$ cd ~/.ssh/
[cephadmin@vm-10-112-178-135 .ssh]$ ll
total 8
-rw------- 1 cephadmin cephadmin 1679 Jul  5 20:19 id_rsa
-rw-r--r-- 1 cephadmin cephadmin  409 Jul  5 20:19 id_rsa.pub
[cephadmin@vm-10-112-178-135 .ssh]$ vi config
Copy the key to each Ceph node, replacing {username} with the user name you created in the Create a Ceph Deploy User step.
ssh-copy-id {username}@node1
ssh-copy-id {username}@node2
ssh-copy-id {username}@node3
(Recommended) Modify the ~/.ssh/config file of your ceph-deploy admin node so that ceph-deploy can log in to Ceph nodes as the user you created without requiring you to specify --username {username} each time you execute ceph-deploy. This has the added benefit of streamlining ssh and scp usage. Replace {username} with the user name you created:
Host node1
Hostname node1
User {username}
Host node2
Hostname node2
User {username}
Host node3
Hostname node3
User {username}
Enable Networking On Bootup
Ceph OSDs peer with each other and report to Ceph Monitors over the network. If networking is off by default, the Ceph cluster cannot come online during bootup until you enable networking.
The default configuration on some distributions (e.g., CentOS) has the networking interface(s) off by default. Ensure that, during boot up, your network interface(s) turn(s) on so that your Ceph daemons can communicate over the network. For example, on Red Hat and CentOS, navigate to /etc/sysconfig/network-scripts and ensure that the ifcfg-{iface} file has ONBOOT set to yes.
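For example, a minimal ifcfg file for a hypothetical interface eth0 (your interface name will likely differ) looks like this:

# /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
BOOTPROTO=static
ONBOOT=yes    # bring the interface up at boot so the Ceph daemons can communicate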
[cephadmin@vm-10-112-178-135 my-cluster]$ cd ~/.ssh/
[cephadmin@vm-10-112-178-135 .ssh]$ ll
total 12
-rw-rw-r-- 1 cephadmin cephadmin  171 Jul  5 20:26 config
-rw------- 1 cephadmin cephadmin 1679 Jul  5 20:19 id_rsa
-rw-r--r-- 1 cephadmin cephadmin  409 Jul  5 20:19 id_rsa.pub
[cephadmin@vm-10-112-178-135 .ssh]$ vi config
[cephadmin@vm-10-112-178-135 .ssh]$ cat config
Host 10.112.178.141
Hostname vm-10-112-178-141
User cephadmin
Host 10.112.178.142
Hostname vm-10-112-178-142
User cephadmin
Host 10.112.178.143
Hostname vm-10-112-178-143
User cephadmin
[cephadmin@vm-10-112-178-135 .ssh]$
[cephadmin@vm-10-112-178-135 .ssh]$ sudo ssh-copy-id -i ~/.ssh/id_rsa.pub cephadmin@vm-10-112-178-141
The authenticity of host 'vm-10-112-178-141 (10.112.178.141)' can't be established.
ECDSA key fingerprint is af:7a:19:1d:76:b7:9b:51:0c:88:3a:4c:33:d0:f8:a5.
Are you sure you want to continue connecting (yes/no)? yes
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
cephadmin@vm-10-112-178-141's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'cephadmin@vm-10-112-178-141'"
and check to make sure that only the key(s) you wanted were added.
[cephadmin@vm-10-112-178-135 .ssh]$ sudo ssh-copy-id -i ~/.ssh/id_rsa.pub cephadmin@vm-10-112-178-142
The authenticity of host 'vm-10-112-178-142 (10.112.178.142)' can't be established.
ECDSA key fingerprint is af:7a:19:1d:76:b7:9b:51:0c:88:3a:4c:33:d0:f8:a5.
Are you sure you want to continue connecting (yes/no)? yes
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
cephadmin@vm-10-112-178-142's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'cephadmin@vm-10-112-178-142'"
and check to make sure that only the key(s) you wanted were added.
[cephadmin@vm-10-112-178-135 .ssh]$ sudo ssh-copy-id -i ~/.ssh/id_rsa.pub cephadmin@vm-10-112-178-143
The authenticity of host 'vm-10-112-178-143 (10.112.178.143)' can't be established.
ECDSA key fingerprint is af:7a:19:1d:76:b7:9b:51:0c:88:3a:4c:33:d0:f8:a5.
Are you sure you want to continue connecting (yes/no)? yes
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
cephadmin@vm-10-112-178-143's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'cephadmin@vm-10-112-178-143'"
and check to make sure that only the key(s) you wanted were added.
[cephadmin@vm-10-112-178-135 .ssh]$ ssh cephadmin@vm-10-112-178-141
Bad owner or permissions on /home/cephadmin/.ssh/config
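This error is about the permissions of ~/.ssh/config itself: ssh refuses to use a config file that is group- or world-writable. The standard fix (not shown in the original transcript) is to tighten the permissions and retry the login:

[cephadmin@vm-10-112-178-135 .ssh]$ chmod 600 ~/.ssh/config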
[cephadmin@vm-10-112-178-135 .ssh]$ sudo firewall-cmd --zone=public --add-service=ceph-mon --permanent
FirewallD is not running
This means firewalld is already stopped, so no firewall rules need to be adjusted here.
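Had firewalld been running, the monitor and OSD traffic would have to be allowed through. A sketch, to be adapted to your zones and nodes (the Ceph monitor listens on 6789, the other daemons on 6800-7300):

# on the monitor node
sudo firewall-cmd --zone=public --add-service=ceph-mon --permanent
# on OSD/MDS nodes
sudo firewall-cmd --zone=public --add-service=ceph --permanent
sudo firewall-cmd --reload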
[cephadmin@vm-10-112-178-135 ~]$ mkdir my-cluster
[cephadmin@vm-10-112-178-135 ~]$ cd my-cluster/
[cephadmin@vm-10-112-178-135 my-cluster]$
[cephadmin@vm-10-112-178-135 my-cluster]$ ceph-deploy new vm-10-112-178-135
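ceph-deploy new writes a ceph.conf, a monitor keyring (ceph.mon.keyring), and a log file into the my-cluster directory. Since this cluster ends up with only two OSDs (osd0 and osd1 below), it is common, per the upstream quick start, to lower the default replica count in ceph.conf before deploying; this adjustment is my own suggestion and is not in the original transcript:

# appended to my-cluster/ceph.conf under [global]
osd pool default size = 2

The failure below appeared during the subsequent ceph-deploy install run, while packages were being downloaded on the nodes.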
[vm-10-112-178-135][DEBUG ] Total download size: 59 M
[vm-10-112-178-135][DEBUG ] Installed size: 218 M
[vm-10-112-178-135][DEBUG ] Downloading packages:
[vm-10-112-178-135][WARNIN] No data was received after 300 seconds, disconnecting...
[vm-10-112-178-135][INFO  ] Running command: sudo ceph --version
[vm-10-112-178-135][ERROR ] Traceback (most recent call last):
[vm-10-112-178-135][ERROR ]   File "/usr/lib/python2.7/site-packages/ceph_deploy/lib/vendor/remoto/process.py", line 119, in run
[vm-10-112-178-135][ERROR ]     reporting(conn, result, timeout)
[vm-10-112-178-135][ERROR ]   File "/usr/lib/python2.7/site-packages/ceph_deploy/lib/vendor/remoto/log.py", line 13, in reporting
[vm-10-112-178-135][ERROR ]     received = result.receive(timeout)
[vm-10-112-178-135][ERROR ]   File "/usr/lib/python2.7/site-packages/ceph_deploy/lib/vendor/remoto/lib/vendor/execnet/gateway_base.py", line 704, in receive
[vm-10-112-178-135][ERROR ]     raise self._getremoteerror() or EOFError()
[vm-10-112-178-135][ERROR ] RemoteError: Traceback (most recent call last):
[vm-10-112-178-135][ERROR ]   File "/usr/lib/python2.7/site-packages/ceph_deploy/lib/vendor/remoto/lib/vendor/execnet/gateway_base.py", line 1036, in executetask
[vm-10-112-178-135][ERROR ]     function(channel, **kwargs)
[vm-10-112-178-135][ERROR ]   File "", line 12, in _remote_run
[vm-10-112-178-135][ERROR ] File "/usr/lib64/python2.7/subprocess.py", line 711, in __init__
[vm-10-112-178-135][ERROR ] errread, errwrite)
[vm-10-112-178-135][ERROR ] File "/usr/lib64/python2.7/subprocess.py", line 1327, in _execute_child
[vm-10-112-178-135][ERROR ] raise child_exception
[vm-10-112-178-135][ERROR ] OSError: [Errno 2] No such file or directory
[vm-10-112-178-135][ERROR ]
[vm-10-112-178-135][ERROR ]
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: ceph --version
The cause: the network was slow enough that the 300-second (5 minute) timeout was hit.
Workarounds:
1. Pre-install Ceph on each node first with sudo yum -y install ceph.
2. If there are many nodes, simply re-run the ceph-deploy install command a few times until it gets through.
3. The best option is to set up a local package mirror (see the sketch after the commands below).
[cephadmin@vm-10-112-178-135 my-cluster]$ sudo yum install -y ceph
[cephadmin@vm-10-112-178-135 my-cluster]$ sudo yum -y install ceph-radosgw
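A rough sketch of workaround 3, the local mirror. The idea is to download the jewel RPMs once (for example into /opt/ceph-repo, the path passed to --local-mirror further down), generate repo metadata for that directory, and let ceph-deploy install from it; how you fetch the RPMs in the first place depends on your environment, so treat this as an outline rather than the author's exact procedure:

# on the admin node, assuming the jewel RPMs have already been copied into /opt/ceph-repo
sudo yum -y install createrepo
sudo createrepo /opt/ceph-repo
# then install from the mirror instead of download.ceph.com:
#   ceph-deploy install {ceph-node} --local-mirror=/opt/ceph-repo --no-adjust-repos --release=jewel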
[ceph_deploy.cli][INFO ] dev : master
[ceph_deploy.cli][INFO ] nogpgcheck : False
[ceph_deploy.cli][INFO ] local_mirror : None
[ceph_deploy.cli][INFO ] release : None
[ceph_deploy.cli][INFO ] install_mon : False
[ceph_deploy.cli][INFO ] gpg_url : None
[ceph_deploy.install][DEBUG ] Installing stable version jewel on cluster ceph hosts vm-10-112-178-141
[ceph_deploy.install][DEBUG ] Detecting platform for host vm-10-112-178-141 ...
[vm-10-112-178-141][DEBUG ] connection detected need for sudo
sudo: sorry, you must have a tty to run sudo
[ceph_deploy][ERROR ] RuntimeError: connecting to host: vm-10-112-178-141 resulted in errors: IOError cannot send (already closed?)
[cephadmin@vm-10-112-178-135 my-cluster]$
Solution:
[Summary] Running a sudo command on a remote server over SSH fails with:
sudo: sorry, you must have a tty to run sudo
The fix is to edit /etc/sudoers and comment out the "Defaults requiretty" line.
sudo vi /etc/sudoers
#Defaults requiretty    # comment out the "Defaults requiretty" line
Or do it non-interactively (on every node that ceph-deploy connects to):
sudo sed -i 's/Defaults requiretty/#Defaults requiretty/g' /etc/sudoers
sudo cat /etc/sudoers | grep requiretty
----- Because the HTTP proxy was too slow, the only option was to install Ceph manually on each virtual machine node with the following commands -----
sudo yum -y install ceph
sudo yum -y install ceph-radosgw
[cephadmin@vm-10-112-178-135 my-cluster]$ ceph-deploy install vm-10-112-178-135 vm-10-112-178-141 vm-10-112-178-142 vm-10-112-178-143
When a local mirror is available (workaround 3 above), the general form is:
ceph-deploy install {ceph-node} [{ceph-node} ...] --local-mirror=/opt/ceph-repo --no-adjust-repos --release=jewel
[cephadmin@vm-10-112-178-135 my-cluster]$ ceph-deploy osd prepare vm-10-112-178-142:/var/local/osd0 vm-10-112-178-143:/var/local/osd1
[vm-10-112-178-142][WARNIN] 2017-07-06 16:19:33.928816 7f2940a11800 -1 ** ERROR: error creating empty object store in /var/local/osd0: (13) Permission denied
[vm-10-112-178-142][WARNIN]
[vm-10-112-178-142][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /var/local/osd0
Solution:
The /var/local/osd0 and /var/local/osd1 directories on the OSD nodes are not writable by the Ceph daemons; grant permissions on them with chmod 777 (crude, but fine for a test cluster). A sketch of the commands follows.
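Run from the admin node over SSH (the paths come from the osd prepare command above; adjust if yours differ):

[cephadmin@vm-10-112-178-135 my-cluster]$ ssh vm-10-112-178-142 sudo chmod 777 /var/local/osd0
[cephadmin@vm-10-112-178-135 my-cluster]$ ssh vm-10-112-178-143 sudo chmod 777 /var/local/osd1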
[cephadmin@vm-10-112-178-135 my-cluster]$ ceph-deploy osd activate vm-10-112-178-142:/var/local/osd0 vm-10-112-178-143:/var/local/osd1