Setting Up a Ceph Cluster on Linux -- A Detailed Walkthrough

I recently set up two open-source distributed file systems for distributed storage on Linux: GlusterFS and Ceph.

Preface ---- for Ceph's underlying principles and design, search GitHub for the source code or read the official documentation; I won't repeat that background here. Instead, this post walks through the full deployment process and the problems I ran into along the way. If you have a production-grade setup or suggestions of your own, please share them here so we can learn from each other; interested readers are also welcome to add me on WeChat (please note your reason and where you found this) to discuss further.

Now for the detailed Ceph walkthrough. Problems are resolved as they come up, and unless otherwise noted, every step below is performed on the same admin-node.

1. First, check whether the machine can reach the Internet. Many of these nodes are virtual machines and need an HTTP proxy. The proxy must be configured in /etc/environment, or it will not take effect; configuring it in yum.conf also works. I chose /etc/environment.
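A minimal sketch of what /etc/environment might contain (proxy.example.com:8080 is a placeholder; substitute your own proxy address):

# /etc/environment -- picked up system-wide at login
http_proxy=http://proxy.example.com:8080
https_proxy=http://proxy.example.com:8080

Log out and back in for the change to take effect.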

2. Check the /etc/hosts entries and the hostname

[root@vm-10-112-178-135 gadmin]# vi /etc/hosts

[root@vm-10-112-178-135 gadmin]# vi /etc/hostname
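For reference, these are the /etc/hosts entries this cluster needs on every node, using the IPs and hostnames that appear throughout this walkthrough:

10.112.178.135  vm-10-112-178-135
10.112.178.141  vm-10-112-178-141
10.112.178.142  vm-10-112-178-142
10.112.178.143  vm-10-112-178-143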

3. Install EPEL

[root@vm-10-112-178-135 gadmin]# yum install -y epel-release

Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
Resolving Dependencies
--> Running transaction check
---> Package epel-release.noarch 0:7-9 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

============================================================================
 Package               Arch           Version         Repository       Size
============================================================================
Installing:
 epel-release          noarch         7-9             epel             14 k

Transaction Summary
============================================================================
Install  1 Package

Total download size: 14 k
Installed size: 24 k
Downloading packages:
warning: /var/cache/yum/x86_64/7/epel/packages/epel-release-7-9.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID 352c64e5: NOKEY
Public key for epel-release-7-9.noarch.rpm is not installed
epel-release-7-9.noarch.rpm                              |  14 kB  00:00:00
Retrieving key from http://file.idc.pub/os/epel/RPM-GPG-KEY-EPEL-7
Importing GPG key 0x352C64E5:
 Userid     : "Fedora EPEL (7)"
 Fingerprint: 91e9 7d7c 4a5e 96f1 7f3e 888f 6a2f aea2 352c 64e5
 From       : http://file.idc.pub/os/epel/RPM-GPG-KEY-EPEL-7

Running transaction check

Running transaction test

Transaction test succeeded

Running transaction

  Installing : epel-release-7-9.noarch                                  1/1
  Verifying  : epel-release-7-9.noarch                                  1/1

Installed:

epel-release.noarch 0:7-9

Complete!

[root@vm-10-112-178-135 gadmin]# yum info epel-release

Loaded plugins: fastestmirror

Repository epel-testing is listed more than once in the configuration

Repository epel-testing-debuginfo is listed more than once in the configuration

Repository epel-testing-source is listed more than once in the configuration

Repository epel is listed more than once in the configuration

Repository epel-debuginfo is listed more than once in the configuration

Repository epel-source is listed more than once in the configuration

Loading mirror speeds from cached hostfile

Installed Packages

Name        : epel-release

Arch        : noarch

Version     : 7

Release     : 9

Size        : 24 k

Repo        : installed

From repo   : epel

Summary     : Extra Packages for Enterprise Linux repository configuration

URL         : http://download.fedoraproject.org/pub/epel

License     : GPLv2

Description : This package contains the Extra Packages for Enterprise Linux (EPEL) repository

            : GPG key as well as configuration for yum.

4. Manually add a ceph.repo yum repository file; the Ceph packages will be downloaded from these repositories.

[root@vm-10-112-178-135 gadmin]# vi /etc/yum.repos.d/ceph.repo

Copy the following content into the file:

[ceph]

name=Ceph packages for $basearch

baseurl=http://download.ceph.com/rpm-jewel/el7/$basearch

enabled=1

priority=1

gpgcheck=1

type=rpm-md

gpgkey=http://download.ceph.com/keys/release.asc

[ceph-noarch]

name=Ceph noarch packages

baseurl=http://download.ceph.com/rpm-jewel/el7/noarch

enabled=1

priority=1

gpgcheck=1

type=rpm-md

gpgkey=http://download.ceph.com/keys/release.asc

[ceph-x86_64]

name=Ceph x86_64 packages

baseurl=http://download.ceph.com/rpm-jewel/el7/x86_64

enabled=0

priority=1

gpgcheck=1

type=rpm-md

gpgkey=http://download.ceph.com/keys/release.asc

[ceph-aarch64]

name=Ceph aarch64 packages

baseurl=http://download.ceph.com/rpm-jewel/el7/aarch64

enabled=0

priority=1

gpgcheck=1

type=rpm-md

gpgkey=http://download.ceph.com/keys/release.asc

[ceph-source]

name=Ceph source packages

baseurl=http://download.ceph.com/rpm-jewel/el7/SRPMS

enabled=0

priority=1

gpgcheck=1

type=rpm-md

gpgkey=http://download.ceph.com/keys/release.asc

[apache2-ceph-noarch]

name=Apache noarch packages for Ceph

baseurl=http://gitbuilder.ceph.com/ceph-rpm-centos7-x86_64-basic/ref/master/SRPMS

enabled=1

priority=2

gpgcheck=1

type=rpm-md

gpgkey=http://download.ceph.com/keys/autobuild.asc

[apache2-ceph-source]

name=Apache source packages for Ceph

baseurl=http://gitbuilder.ceph.com/ceph-rpm-centos7-x86_64-basic/ref/master/SRPMS

enabled=0

priority=2

gpgcheck=1

type=rpm-md

gpgkey=http://download.ceph.com/keys/autobuild.asc
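After saving the file, refresh the yum metadata and confirm the new repositories are visible (standard yum commands):

sudo yum clean all
sudo yum makecache
yum repolist enabled | grep -i ceph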

5. Install the ceph-deploy tool

ceph-deploy is the official Ceph cluster deployment tool, written in Python.

[root@vm-10-112-178-135 gadmin]# sudo yum update && sudo yum install ceph-deploy

6. Check that the NTP service is installed. The nodes communicate with one another constantly, so their clocks must stay in sync; Ceph monitors in particular are sensitive to clock skew.
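A sketch of installing and enabling NTP on CentOS 7, following what the official Ceph docs recommend (run on all nodes):

sudo yum install -y ntp ntpdate ntp-doc
sudo systemctl enable ntpd
sudo systemctl start ntpd
ntpq -p    # verify the daemon is syncing against its peers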

7. Install openssh-server and configure passwordless SSH. Note that this applies to all nodes.

Install openssh-server on all nodes:

[root@vm-10-112-178-135 gadmin]# sudo yum install openssh-server

[root@vm-10-112-178-135 gadmin]#

Create the deployment user:

[root@vm-10-112-178-135 gadmin]# sudo useradd -d /home/cephadmin -m cephadmin

[root@vm-10-112-178-135 gadmin]# passwd cephadmin

Grant the user passwordless sudo (root) privileges:

[root@vm-10-112-178-135 gadmin]# echo "cephadmin ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/cephadmin

cephadmin ALL = (root) NOPASSWD:ALL

[root@vm-10-112-178-135 gadmin]# sudo chmod 0440 /etc/sudoers.d/cephadmin

[cephadmin@vm-10-112-178-135 ~]$ ssh-keygen

[cephadmin@vm-10-112-178-135 .ssh]$ cd ~/.ssh/

[cephadmin@vm-10-112-178-135 .ssh]$ ll

total 8

-rw------- 1 cephadmin cephadmin 1679 Jul  5 20:19 id_rsa

-rw-r--r-- 1 cephadmin cephadmin  409 Jul  5 20:19 id_rsa.pub

[cephadmin@vm-10-112-178-135 .ssh]$ vi config

Copy the key to each Ceph Node, replacing {username} with the user name you created with Create a Ceph Deploy User.

ssh-copy-id {username}@node1

ssh-copy-id {username}@node2

ssh-copy-id {username}@node3

(Recommended) Modify the ~/.ssh/config file of your ceph-deploy admin node so that ceph-deploy can log in to Ceph nodes as the user you created without requiring you to specify --username {username} each time you execute ceph-deploy. This has the added benefit of streamlining ssh and scp usage. Replace {username} with the user name you created:

Host node1

Hostname node1

User {username}

Host node2

Hostname node2

User {username}

Host node3

Hostname node3

User {username}

Enable Networking On Bootup

Ceph OSDs peer with each other and report to Ceph Monitors over the network. If networking is off by default, the Ceph cluster cannot come online during bootup until you enable networking.

The default configuration on some distributions (e.g., CentOS) has the networking interface(s) off by default. Ensure that, during boot up, your network interface(s) turn(s) on so that your Ceph daemons can communicate over the network. For example, on Red Hat and CentOS, navigate to /etc/sysconfig/network-scripts and ensure that the ifcfg-{iface} file has ONBOOT set to yes.
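A quick way to check this on each node (eth0 is an assumed interface name; substitute whatever ip addr shows):

grep ONBOOT /etc/sysconfig/network-scripts/ifcfg-eth0
# Expect ONBOOT=yes; if it reads "no", edit the file, then restart
# networking or bring the interface up with: sudo ifup eth0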

[cephadmin@vm-10-112-178-135 my-cluster]$ cd ~/.ssh/

[cephadmin@vm-10-112-178-135 .ssh]$ ll

total 12

-rw-rw-r-- 1 cephadmin cephadmin  171 Jul  5 20:26 config

-rw------- 1 cephadmin cephadmin 1679 Jul  5 20:19 id_rsa

-rw-r--r-- 1 cephadmin cephadmin  409 Jul  5 20:19 id_rsa.pub

[cephadmin@vm-10-112-178-135 .ssh]$ vi config

[cephadmin@vm-10-112-178-135 .ssh]$ cat config

Host 10.112.178.141

Hostname vm-10-112-178-141

User cephadmin

Host 10.112.178.142

Hostname vm-10-112-178-142

User cephadmin

Host 10.112.178.143

Hostname vm-10-112-178-143

User cephadmin

[cephadmin@vm-10-112-178-135 .ssh]$

[cephadmin@vm-10-112-178-135 .ssh]$ sudo ssh-copy-id -i ~/.ssh/id_rsa.pub  cephadmin@vm-10-112-178-141

The authenticity of host 'vm-10-112-178-141 (10.112.178.141)' can't be established.

ECDSA key fingerprint is af:7a:19:1d:76:b7:9b:51:0c:88:3a:4c:33:d0:f8:a5.

Are you sure you want to continue connecting (yes/no)? yes

/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed

/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys

cephadmin@vm-10-112-178-141's password:

Number of key(s) added: 1

Now try logging into the machine, with:  "ssh 'cephadmin@vm-10-112-178-141'"

and check to make sure that only the key(s) you wanted were added.

[cephadmin@vm-10-112-178-135 .ssh]$ sudo ssh-copy-id -i ~/.ssh/id_rsa.pub  cephadmin@vm-10-112-178-142

The authenticity of host 'vm-10-112-178-142 (10.112.178.142)' can't be established.

ECDSA key fingerprint is af:7a:19:1d:76:b7:9b:51:0c:88:3a:4c:33:d0:f8:a5.

Are you sure you want to continue connecting (yes/no)? yes

/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed

/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys

cephadmin@vm-10-112-178-142's password:

Number of key(s) added: 1

Now try logging into the machine, with:  "ssh 'cephadmin@vm-10-112-178-142'"

and check to make sure that only the key(s) you wanted were added.

[cephadmin@vm-10-112-178-135 .ssh]$ sudo ssh-copy-id -i ~/.ssh/id_rsa.pub  cephadmin@vm-10-112-178-143

The authenticity of host 'vm-10-112-178-143 (10.112.178.143)' can't be established.

ECDSA key fingerprint is af:7a:19:1d:76:b7:9b:51:0c:88:3a:4c:33:d0:f8:a5.

Are you sure you want to continue connecting (yes/no)? yes

/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed

/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys

cephadmin@vm-10-112-178-143's password:

Number of key(s) added: 1

Now try logging into the machine, with:  "ssh 'cephadmin@vm-10-112-178-143'"

and check to make sure that only the key(s) you wanted were added.

[cephadmin@vm-10-112-178-135 .ssh]$ ssh [email protected]

Bad owner or permissions on /home/cephadmin/.ssh/config
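This error means ~/.ssh/config has permissions that are too open; ssh refuses to use a config file that other users can write to. The standard fix:

chmod 600 ~/.ssh/config

After tightening the permissions, ssh will use the Host aliases in the file normally.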

[cephadmin@vm-10-112-178-135 .ssh]$ sudo firewall-cmd --zone=public --add-service=ceph-mon --permanent

FirewallD is not running

This output shows that firewalld is not running, so there is no firewall to worry about here.
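For completeness: had firewalld been running, the standard Ceph ports would need opening on each node, roughly like this (6789/tcp for monitors, 6800-7300/tcp for OSDs):

sudo firewall-cmd --zone=public --add-port=6789/tcp --permanent
sudo firewall-cmd --zone=public --add-port=6800-7300/tcp --permanent
sudo firewall-cmd --reload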

[cephadmin@vm-10-112-178-135 ~]$ mkdir my-cluster

[cephadmin@vm-10-112-178-135 ~]$ cd my-cluster/

[cephadmin@vm-10-112-178-135 my-cluster]$

[cephadmin@vm-10-112-178-135 my-cluster]$ ceph-deploy new vm-10-112-178-135
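ceph-deploy new writes the initial cluster files into the current directory: a ceph.conf, a monitor keyring, and a deployment log. A listing should look roughly like this (the log file name varies by ceph-deploy version):

[cephadmin@vm-10-112-178-135 my-cluster]$ ls
ceph.conf  ceph-deploy-ceph.log  ceph.mon.keyring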

The subsequent package installation step (ceph-deploy installing Ceph on the node) then failed with a timeout:

[vm-10-112-178-135][DEBUG ] Total download size: 59 M
[vm-10-112-178-135][DEBUG ] Installed size: 218 M
[vm-10-112-178-135][DEBUG ] Downloading packages:
[vm-10-112-178-135][WARNIN] No data was received after 300 seconds, disconnecting...
[vm-10-112-178-135][INFO  ] Running command: sudo ceph --version
[vm-10-112-178-135][ERROR ] Traceback (most recent call last):
[vm-10-112-178-135][ERROR ]   File "/usr/lib/python2.7/site-packages/ceph_deploy/lib/vendor/remoto/process.py", line 119, in run
[vm-10-112-178-135][ERROR ]     reporting(conn, result, timeout)
[vm-10-112-178-135][ERROR ]   File "/usr/lib/python2.7/site-packages/ceph_deploy/lib/vendor/remoto/log.py", line 13, in reporting
[vm-10-112-178-135][ERROR ]     received = result.receive(timeout)
[vm-10-112-178-135][ERROR ]   File "/usr/lib/python2.7/site-packages/ceph_deploy/lib/vendor/remoto/lib/vendor/execnet/gateway_base.py", line 704, in receive
[vm-10-112-178-135][ERROR ]     raise self._getremoteerror() or EOFError()
[vm-10-112-178-135][ERROR ] RemoteError: Traceback (most recent call last):
[vm-10-112-178-135][ERROR ]   File "/usr/lib/python2.7/site-packages/ceph_deploy/lib/vendor/remoto/lib/vendor/execnet/gateway_base.py", line 1036, in executetask
[vm-10-112-178-135][ERROR ]     function(channel, **kwargs)
[vm-10-112-178-135][ERROR ]   File "", line 12, in _remote_run

[vm-10-112-178-135][ERROR ]  File "/usr/lib64/python2.7/subprocess.py", line 711, in __init__

[vm-10-112-178-135][ERROR ]    errread, errwrite)

[vm-10-112-178-135][ERROR ]  File "/usr/lib64/python2.7/subprocess.py", line 1327, in _execute_child

[vm-10-112-178-135][ERROR ]    raise child_exception

[vm-10-112-178-135][ERROR ] OSError: [Errno 2] No such file or directory

[vm-10-112-178-135][ERROR ]

[vm-10-112-178-135][ERROR ]

[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: ceph --version

Cause: the network was slow enough to hit ceph-deploy's 300-second (5-minute) timeout.

Workarounds:

1. Pre-install Ceph on each node first: sudo yum -y install ceph

2. If there are many nodes, rerun the command a few times until it succeeds.

3. The best solution is to set up a local package mirror (see the sketch below).
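A sketch of option 3, assuming the yum-utils and createrepo packages (the /opt/ceph-repo path matches the --local-mirror example used later; everything else here is illustrative, not from the original post):

# On a machine that can reach download.ceph.com:
sudo yum -y install yum-utils createrepo
reposync --repoid=ceph --download_path=/opt/ceph-repo   # pulls the [ceph] repo RPMs
createrepo /opt/ceph-repo/ceph                          # generates repo metadata
# Then, on every node, point baseurl in ceph.repo at the mirror:
#   baseurl=file:///opt/ceph-repo/ceph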

[cephadmin@vm-10-112-178-135 my-cluster]$ sudo yum install -y ceph

[cephadmin@vm-10-112-178-135 my-cluster]$ sudo yum -y install ceph-radosgw

[ceph_deploy.cli][INFO  ]  dev                          : master

[ceph_deploy.cli][INFO  ]  nogpgcheck                    : False

[ceph_deploy.cli][INFO  ]  local_mirror                  : None

[ceph_deploy.cli][INFO  ]  release                      : None

[ceph_deploy.cli][INFO  ]  install_mon                  : False

[ceph_deploy.cli][INFO  ]  gpg_url                      : None

[ceph_deploy.install][DEBUG ] Installing stable version jewel on cluster ceph hosts vm-10-112-178-141

[ceph_deploy.install][DEBUG ] Detecting platform for host vm-10-112-178-141 ...

[vm-10-112-178-141][DEBUG ] connection detected need for sudo

sudo: sorry, you must have a tty to run sudo

[ceph_deploy][ERROR ] RuntimeError: connecting to host: vm-10-112-178-141 resulted in errors: IOError cannot send (already closed?)

[cephadmin@vm-10-112-178-135 my-cluster]$

解决方案:

[Summary] Running a sudo command on a remote server over ssh fails with:

sudo: sorry, you must have a tty to run sudo

A quick search shows the fix is to edit /etc/sudoers and comment out the "Defaults requiretty" line:

sudo vi /etc/sudoers

#Defaults    requiretty    # comment out the "Defaults requiretty" line

Or do it in one step:

sudo sed -i 's/Defaults    requiretty/#Defaults    requiretty/g' /etc/sudoers

sudo cat /etc/sudoers | grep requiretty

----- Because the HTTP proxy was too slow, I had to install Ceph manually on each VM node with the following commands -----

sudo yum -y install ceph

sudo yum -y install ceph-radosgw

[cephadmin@vm-10-112-178-135 my-cluster]$ ceph-deploy install vm-10-112-178-135 vm-10-112-178-141 vm-10-112-178-142 vm-10-112-178-143

The general form, here pointing at a local mirror and telling ceph-deploy not to rewrite the repo files:

ceph-deploy install {ceph-node}[{ceph-node} ...] --local-mirror=/opt/ceph-repo --no-adjust-repos --release=jewel

[cephadmin@vm-10-112-178-135 my-cluster]$ ceph-deploy osd prepare vm-10-112-178-142:/var/local/osd0 vm-10-112-178-143:/var/local/osd1

[vm-10-112-178-142][WARNIN] 2017-07-06 16:19:33.928816 7f2940a11800 -1  ** ERROR: error creating empty object store in /var/local/osd0: (13) Permission denied

[vm-10-112-178-142][WARNIN]

[vm-10-112-178-142][ERROR ] RuntimeError: command returned non-zero exit status: 1

[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /var/local/osd0

Solution:

The directories /var/local/osd0 and /var/local/osd1 on those nodes lacked the permissions the OSD daemons need; grant them with chmod 777 (commands below).
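Concretely, on the respective OSD nodes (a sketch; on Jewel, where the daemons run as the ceph user, chown ceph:ceph would be the tighter alternative to chmod 777):

# On vm-10-112-178-142:
sudo chmod 777 /var/local/osd0
# On vm-10-112-178-143:
sudo chmod 777 /var/local/osd1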

[cephadmin@vm-10-112-178-135 my-cluster]$ ceph-deploy osd activate vm-10-112-178-142:/var/local/osd0 vm-10-112-178-143:/var/local/osd1
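With the OSDs activated, you would normally push the config and admin keyring to the nodes and check cluster health. These follow-up commands are standard ceph-deploy/Ceph usage, not part of the original transcript:

ceph-deploy admin vm-10-112-178-135 vm-10-112-178-141 vm-10-112-178-142 vm-10-112-178-143
sudo chmod +r /etc/ceph/ceph.client.admin.keyring
ceph -s          # overall cluster status; should eventually report HEALTH_OK
ceph osd tree    # both OSDs should show "up" and "in"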
