Troubleshooting: OSD activation permission error when deploying a Ceph Jewel cluster

Reposted from: http://blog.51cto.com/michaelkang/1786298

While deploying a Ceph cluster on the Jewel release, OSD activation fails with a permission error.


During deployment, the OSD activation step was run as follows:

sudo ceph-deploy osd activate ceph21:/dev/sdc1:/dev/sdj3


The error output was:

[ceph21][WARNIN] got monmap epoch 1

[ceph21][WARNIN] command_check_call: Running command: /usr/bin/ceph-osd --cluster ceph --mkfs --mkkey -i 1 --monmap /var/lib/ceph/tmp/mnt.u8rJvJ/activate.monmap --osd-data /var/lib/ceph/tmp/mnt.u8rJvJ --osd-journal /var/lib/ceph/tmp/mnt.u8rJvJ/journal --osd-uuid f7e5f270-5c59-4c22-83cd-bbe191f40b72 --keyring /var/lib/ceph/tmp/mnt.u8rJvJ/keyring --setuser ceph --setgroup ceph

[ceph21][WARNIN] 2016-05-19 15:54:36.539645 7f5eb857b800 -1 filestore(/var/lib/ceph/tmp/mnt.u8rJvJ) mkjournal error creating journal on /var/lib/ceph/tmp/mnt.u8rJvJ/journal: (13) Permission denied

[ceph21][WARNIN] 2016-05-19 15:54:36.539664 7f5eb857b800 -1 OSD::mkfs: ObjectStore::mkfs failed with error -13

[ceph21][WARNIN] 2016-05-19 15:54:36.539719 7f5eb857b800 -1  ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.u8rJvJ: (13) Permission denied

[ceph21][WARNIN] mount_activate: Failed to activate

[ceph21][WARNIN] unmount: Unmounting /var/lib/ceph/tmp/mnt.u8rJvJ

[ceph21][WARNIN] command_check_call: Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.u8rJvJ

[ceph21][WARNIN] Traceback (most recent call last):

[ceph21][WARNIN]   File "/usr/sbin/ceph-disk", line 9, in <module>

[ceph21][WARNIN]     load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()

[ceph21][WARNIN]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 4964, in run

[ceph21][WARNIN]     main(sys.argv[1:])

[ceph21][WARNIN]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 4915, in main

[ceph21][WARNIN]     args.func(args)

[ceph21][WARNIN]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3269, in main_activate

[ceph21][WARNIN]     reactivate=args.reactivate,

[ceph21][WARNIN]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3026, in mount_activate

[ceph21][WARNIN]     (osd_id, cluster) = activate(path, activate_key_template, init)

[ceph21][WARNIN]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3202, in activate

[ceph21][WARNIN]     keyring=keyring,

[ceph21][WARNIN]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2695, in mkfs

[ceph21][WARNIN]     '--setgroup', get_ceph_group(),

[ceph21][WARNIN]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 439, in command_check_call

[ceph21][WARNIN]     return subprocess.check_call(arguments)

[ceph21][WARNIN]   File "/usr/lib64/python2.7/subprocess.py", line 542, in check_call

[ceph21][WARNIN]     raise CalledProcessError(retcode, cmd)

[ceph21][WARNIN] subprocess.CalledProcessError: Command '['/usr/bin/ceph-osd', '--cluster', 'ceph', '--mkfs', '--mkkey', '-i', '1', '--monmap', '/var/lib/ceph/tmp/mnt.u8rJvJ/activate.monmap', '--osd-data', '/var/lib/ceph/tmp/mnt.u8rJvJ', '--osd-journal', '/var/lib/ceph/tmp/mnt.u8rJvJ/journal', '--osd-uuid', 'f7e5f270-5c59-4c22-83cd-bbe191f40b72', '--keyring', '/var/lib/ceph/tmp/mnt.u8rJvJ/keyring', '--setuser', 'ceph', '--setgroup', 'ceph']' returned non-zero exit status 1

[ceph21][ERROR ] RuntimeError: command returned non-zero exit status: 1

[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /dev/sdb1



Solution:

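The fix is simple: change the owner and group of every disk partition the Ceph cluster uses to ceph. As the command in the log shows, ceph-osd is started with --setuser ceph --setgroup ceph, so mkjournal fails with "Permission denied" when the device node belongs to another user. For example, for one partition: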


chown ceph:ceph /dev/sdd1
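
Before and after running chown it helps to confirm who actually owns the device node. The helper below is a minimal sketch, assuming GNU stat is available; the function names owner_of and owned_by are made up for illustration, and /dev/sdd1 is just the device from the example above.

```shell
#!/bin/sh
# Print "user:group" for a device node or file (GNU stat syntax).
owner_of() {
    stat -c '%U:%G' "$1"
}

# Return 0 if the path is already owned by the expected user:group.
owned_by() {
    [ "$(owner_of "$1")" = "$2" ]
}

# Example (device name is a placeholder, adjust to your layout):
#   owned_by /dev/sdd1 ceph:ceph || chown ceph:ceph /dev/sdd1
```

Running the check first makes it obvious whether a failed activation is really caused by ownership or by something else.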


Follow-up:

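This fix does not survive a reboot: the device permissions are reset when the system starts, so the OSD service fails to come up. This permission issue is a nasty trap. As a workaround, I added a for loop to rc.local that resets the disk ownership on every boot: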


for i in a b c d e f g h i j k l; do chown ceph:ceph /dev/sd"$i"*; done
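
The one-liner above touches every disk from sda to sdl, which may include a system disk, and it prints an error for any letter that has no matching device. A slightly more defensive variant could look like the sketch below. The function name fix_osd_perms and the OSD_OWNER variable are assumptions made for this example, and the device list has to be adapted to the disks Ceph really uses.

```shell
#!/bin/sh
# Sketch of the rc.local loop with guards. OSD_OWNER and the device list
# are assumptions; adjust them to the disks Ceph actually uses.
OSD_OWNER="${OSD_OWNER:-ceph:ceph}"

# chown every existing block device passed as an argument; skip anything
# else (unmatched globs, regular files) instead of failing.
fix_osd_perms() {
    for dev in "$@"; do
        if [ -b "$dev" ]; then
            chown "$OSD_OWNER" "$dev"
        else
            echo "skip: $dev is not a block device" >&2
        fi
    done
}

# Example call for rc.local (placeholder device names):
#   fix_osd_perms /dev/sdc1 /dev/sdj3
```

Listing the partitions explicitly keeps the script from changing ownership on disks that have nothing to do with Ceph.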


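Searching the Ceph documentation and issue tracker showed that this is actually a bug, which the community had not yet fixed at the time of writing.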
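
An alternative to rc.local, not part of the original fix, is to let udev set the ownership whenever the device node is created. The sketch below only prints a rule line; the function name ceph_udev_rule, the rules file name, and the kernel name sdj3 are assumptions for illustration. Kernel names such as sdj3 can change between boots, so treat this as a starting point rather than a tested solution.

```shell
#!/bin/sh
# Emit a udev rule that assigns a partition to ceph:ceph when the device
# node is created. The kernel name passed in is a placeholder.
ceph_udev_rule() {
    printf 'SUBSYSTEM=="block", KERNEL=="%s", OWNER="ceph", GROUP="ceph", MODE="0660"\n' "$1"
}

# Example (file name is hypothetical):
#   ceph_udev_rule sdj3 > /etc/udev/rules.d/99-ceph-journal.rules
#   udevadm control --reload-rules && udevadm trigger
```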

Reference:

http://tracker.ceph.com/issues/13833

