Ceph Setup: Error Handling

1. When formatting the disk with # mkfs.xfs /dev/sdb1, the system reports:

 mkfs.xfs error: command not found.

The system is missing the XFS userspace tools; installing them fixes it:

 apt-get -y install xfsprogs

2. While installing the cluster, the log shows:

Reading package lists... Done E: Problem executing scripts APT::Update::Post-Invoke-Success
'if /usr/bin/test -w /var/cache/app-info -a -e /usr/bin/appstreamcli; then appstreamcli refresh > /dev/null; fi' E: Sub-process returned an error code

The installed libappstream3 is outdated; to fix, kill appstreamcli and install newer packages:

sudo pkill -KILL appstreamcli

wget -P /tmp https://launchpad.net/ubuntu/+archive/primary/+files/appstream_0.9.4-1ubuntu1_amd64.deb https://launchpad.net/ubuntu/+archive/primary/+files/libappstream3_0.9.4-1ubuntu1_amd64.deb

 sudo dpkg -i /tmp/appstream_0.9.4-1ubuntu1_amd64.deb /tmp/libappstream3_0.9.4-1ubuntu1_amd64.deb

3. ceph-deploy install is very slow. As an early workaround, point it at a domestic mirror (later, switch the apt sources permanently):

export CEPH_DEPLOY_REPO_URL=http://mirrors.163.com/ceph/debian-jewel 

export CEPH_DEPLOY_GPG_URL=http://mirrors.163.com/ceph/keys/release.asc
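With those two variables exported, ceph-deploy pulls packages from the mirror on the next install; a sketch (the node names below are hypothetical):

```shell
# Point ceph-deploy at the 163 mirror (jewel release assumed)
export CEPH_DEPLOY_REPO_URL=http://mirrors.163.com/ceph/debian-jewel
export CEPH_DEPLOY_GPG_URL=http://mirrors.163.com/ceph/keys/release.asc

# ceph-deploy reads the URLs from the environment; node names are hypothetical
if command -v ceph-deploy >/dev/null 2>&1; then
    ceph-deploy install ceph-admin ceph-osd1 ceph-osd2
fi
```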

4. The following error appears during the install/activate step:

[ceph_deploy][ERROR ] ExecutableNotFound: Could not locate executable 'ceph-volume' make sure it is installed and available on .........

This happens because the ceph-deploy version is too new; uninstall it and install an older version (on the admin node):

pip uninstall ceph-deploy

Then download the 1.5-series tarball; I used this address: https://pypi.python.org/pypi/ceph-deploy/1.5.39

Then build and install it with setup.py:

wget https://files.pythonhosted.org/packages/63/59/c2752952b7867faa2d63ba47c47da96e2f43f5124029975b579020df3665/ceph-deploy-1.5.39.tar.gz

tar -zxvf ceph-deploy-1.5.39.tar.gz

cd ceph-deploy-1.5.39

python setup.py build 

python setup.py install

The following also seems to work:

pip install ceph-deploy==1.5.39
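Whichever route you take, it seems worth confirming the downgrade before retrying; a sketch:

```shell
PINNED=1.5.39
pip uninstall -y ceph-deploy          # remove the newer version first
pip install "ceph-deploy==$PINNED"    # pin the 1.5 series
ceph-deploy --version                 # should now print 1.5.39
```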

5. Problem:

[mlv4-VirtualBox][WARNIN] ceph_disk.main.FilesystemTypeError: Cannot discover filesystem type: device /dev/sdb: Line is truncated:
[mlv4-VirtualBox][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command:
/usr/sbin/ceph-disk -v activate --mark-init systemd --mount /dev/sdb

First check the state: in my case the disk was already activated after the prepare step. You can check with ceph-deploy disk list ceph-osd1 ceph-osd2. If it is not activated, give ownership of sdb to the ceph user:

sudo chown ceph:ceph /dev/sdb

My actual problem was that activating sdb failed, but sdb1 (the data partition) could be activated with:

ceph-deploy osd activate ceph-osd1:/dev/sdb1 ceph-osd2:/dev/sdb1

which resolved the problem.
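Note that chown on a device node is lost at reboot. A udev rule can make the ownership persistent; a hedged sketch, where the rule file name and the sdb match are assumptions:

```shell
# Persist ceph:ceph ownership of /dev/sdb* across reboots (hypothetical rule file)
RULE='KERNEL=="sdb*", OWNER="ceph", GROUP="ceph"'
echo "$RULE" | sudo tee /etc/udev/rules.d/89-ceph-sdb.rules
sudo udevadm trigger   # apply the rule without rebooting
```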

6. Problem:

cephuser@mlv1-VirtualBox:~/cluster$ ceph osd tree
2018-04-11 14:41:40.987041 7f1e42a57700 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
2018-04-11 14:41:40.987511 7f1e42a57700 -1 monclient(hunting): ERROR: missing keyring, cannot use cephx for authentication
2018-04-11 14:41:40.987619 7f1e42a57700 0 librados: client.admin initialization error (2) No such file or directory
Error connecting to cluster: ObjectNotFound

This problem usually should not occur, but when it does, the simplest fix is to copy the key files (all files) from the cluster deploy directory to /etc/ceph on every node.
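The copy can be scripted. A hedged sketch, where the node names and the ~/cluster deploy directory are assumptions; it only prints the commands for review (delete the echo to execute):

```shell
# Hypothetical node names and deploy directory; adjust to your cluster
NODES="ceph-osd1 ceph-osd2"
CLUSTER_DIR="$HOME/cluster"

for node in $NODES; do
    # Dry run: print the copy command; remove 'echo' to actually run it
    echo scp "$CLUSTER_DIR"/*.keyring "$node:/etc/ceph/"
done
```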

7. After the gateway was set up, auth would not pass no matter what. I never found the root cause, but a reinstall is a good option; purge every node:

ceph-deploy purge <ceph-node1> [<ceph-node2>]
ceph-deploy purgedata <ceph-node1> [<ceph-node2>]

8. After all objects were deleted, the cluster still held the space; it was not released promptly.

The official explanation is that deleting an object does not trigger garbage collection immediately; the GC behavior is configurable via the following options:

"rgw_gc_max_objs": "521",
"rgw_gc_obj_min_wait": "1200",
"rgw_gc_processor_max_time": "600",
"rgw_gc_processor_period": "600",
