Problem Description
A production Ceph cluster raised a health warning. Checking the cluster state with ceph -s showed the following:
[root@node1 ~]#
[root@node1 ~]# ceph -s
cluster 1dd73c00-c1ad-4e18-ae80-51bc1fc2e0c5
health HEALTH_WARN
too many PGs per OSD (576 > max 300)
monmap e2: 3 mons at {node1=10.0.10.201:6789/0,node2=10.0.10.202:6789/0,node3=10.0.10.203:6789/0}
election epoch 40, quorum 0,1,2 node1,node2,node3
fsmap e11: 1/1/1 up {0=node2=up:active}, 1 up:standby
osdmap e61: 6 osds: 6 up, 6 in
flags sortbitwise,require_jewel_osds
pgmap v286: 1728 pgs, 5 pools, 16032 bytes data, 20 objects
223 MB used, 91870 MB / 92093 MB avail
1728 active+clean
[root@node1 ~]#
This post documents how the "too many PGs per OSD" warning was handled.
Problem Analysis
- The warning says the OSDs are carrying too many PGs. First, look at how PGs are distributed across the OSDs in the cluster with the command ceph osd df:
[root@node1 ~]#
[root@node1 ~]# ceph osd df
ID WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS
0 0.01500 1.00000 15348M 38328k 15311M 0.24 1.01 629
1 0.01500 0.89510 15348M 37632k 15312M 0.24 0.99 473
2 0.01500 0.97151 15348M 38876k 15311M 0.25 1.02 632
3 0.01500 0.98936 15348M 37572k 15312M 0.24 0.99 539
4 0.01500 1.00000 15348M 38608k 15311M 0.25 1.01 640
5 0.01500 0.99223 15348M 37560k 15312M 0.24 0.99 543
TOTAL 92093M 223M 91870M 0.24
MIN/MAX VAR: 0.99/1.02 STDDEV: 0.00
[root@node1 ~]#
Looking at the last column (PGS), the per-OSD PG counts are indeed well above the threshold.
- The warning also reads "> max 300". Why 300, and not some other number? According to the official documentation, this threshold comes from the parameter mon_pg_warn_max_per_osd, whose default value is 300. We can verify what it is set to in this environment with the command ceph --show-config | grep "mon_pg_warn_max_per_osd":
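As a sanity check, the 576 in the warning can be reproduced from the numbers above: it is the total number of PG replicas divided by the number of OSDs. A minimal sketch, assuming every pool in this cluster uses replication size 2 (consistent with the PGS column summing to 3456 = 1728 × 2):

```shell
# Reproduce the "576" from the warning message.
# Assumption: all 5 pools use replication size 2 -- the PGS column
# above sums to 3456, which is 1728 PGs x 2 replicas.
total_pgs=1728        # "1728 pgs" from ceph -s
replication_size=2    # assumed pool size
num_osds=6            # "6 osds" from ceph -s
echo $(( total_pgs * replication_size / num_osds ))   # prints 576
```

If the pools use different sizes, sum pg_num × size over all pools instead (ceph osd pool ls detail shows both per pool).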
[root@node1 ~]# ceph version
ceph version 10.2.5 (c461ee19ecbc0c5c330aca20f7392c9a00730367)
[root@node1 ~]#
[root@node1 ~]# ceph --show-config | grep "mon_pg_warn_max_per_osd"
mon_pg_warn_max_per_osd = 300
[root@node1 ~]#
[root@node1 ~]#
- We can therefore clear the warning by raising this threshold.
Solution
1. Edit ceph.conf and set mon_pg_warn_max_per_osd to a higher value; note that mon_pg_warn_max_per_osd must go under the [global] section.
2. Push the change to the other nodes in the cluster:
ceph-deploy --overwrite-conf config push {hostname}
3. Restart the ceph-mon service so the change takes effect:
systemctl restart ceph-mon.target
4. Verify that the new value is in effect:
ceph --show-config | grep "mon_pg_warn_max_per_osd"
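As an illustration of step 1, the edit amounts to adding a single line to the existing [global] section of ceph.conf (800 here matches the value verified later in this post; choose a threshold that fits your cluster):

```ini
[global]
# raise the per-OSD PG warning threshold (default 300)
mon_pg_warn_max_per_osd = 800
```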
The steps in practice:
[root@node1 ceph]# ls
ceph.client.admin.keyring ceph.conf ceph.conf_back ceph-deploy-ceph.log rbdmap tmp_BvVxU
[root@node1 ceph]#
[root@node1 ceph]# vim ceph.conf
[root@node1 ceph]#
[root@node1 ceph]# ceph-deploy --overwrite-conf config push node2 node3
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.36): /usr/bin/ceph-deploy --overwrite-conf config push node2 node3
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : True
[ceph_deploy.cli][INFO ] subcommand : push
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf :
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] client : ['node2', 'node3']
[ceph_deploy.cli][INFO ] func :
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.config][DEBUG ] Pushing config to node2
[node2][DEBUG ] connected to host: node2
[node2][DEBUG ] detect platform information from remote host
[node2][DEBUG ] detect machine type
[node2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.config][DEBUG ] Pushing config to node3
[node3][DEBUG ] connected to host: node3
[node3][DEBUG ] detect platform information from remote host
[node3][DEBUG ] detect machine type
[node3][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[root@node1 ceph]# systemctl restart ceph-mon.target
[root@node1 ceph]#
[root@node1 ceph]# ceph --show-config | grep "mon_pg_warn_max_per_osd"
mon_pg_warn_max_per_osd = 800
[root@node1 ceph]#
[root@node1 ~]# ceph -s
cluster 1dd73c00-c1ad-4e18-ae80-51bc1fc2e0c5
health HEALTH_OK
monmap e2: 3 mons at {node1=10.0.10.201:6789/0,node2=10.0.10.202:6789/0,node3=10.0.10.203:6789/0}
election epoch 60, quorum 0,1,2 node1,node2,node3
fsmap e11: 1/1/1 up {0=node2=up:active}, 1 up:standby
osdmap e61: 6 osds: 6 up, 6 in
flags sortbitwise,require_jewel_osds
pgmap v291: 1728 pgs, 5 pools, 16032 bytes data, 20 objects
223 MB used, 91870 MB / 92093 MB avail
1728 active+clean
[root@node1 ~]#
Worth Noting
Since Ceph Luminous (v12.2.x), the parameter mon_pg_warn_max_per_osd has been replaced by mon_max_pg_per_osd, its default has changed from 300 to 200, and applying a change now requires restarting the ceph-mgr service rather than ceph-mon. Note also that mon_max_pg_per_osd is a hard limit as well as a warning threshold: the cluster will refuse to create new PGs on an OSD that is already past it, so raise it deliberately.
The fix:
1. Edit ceph.conf and set mon_max_pg_per_osd to a higher value; note that mon_max_pg_per_osd must go under the [global] section.
2. Push the change to the other nodes in the cluster:
ceph-deploy --overwrite-conf config push {hostname}
3. Restart the ceph-mgr service so the change takes effect:
systemctl restart ceph-mgr.target
4. Verify that the running value matches what was set in ceph.conf:
ceph --show-config | grep "mon_max_pg_per_osd"
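As with the Jewel case above, step 1 is a one-line addition to the existing [global] section of ceph.conf (800 matches the value used in the transcript below):

```ini
[global]
# Luminous renamed the option; the default is now 200
mon_max_pg_per_osd = 800
```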
In practice:
- Reproduce the warning scenario on Ceph Luminous v12.2.8; the warning looks like:
[root@node81 ~]#
[root@node81 ~]# ceph -s
cluster:
id: 3b7d0bb9-83b3-431c-8a00-77b34e3ffb08
health: HEALTH_WARN
too many PGs per OSD (228 > max 200)
services:
mon: 3 daemons, quorum node81,node82,node85
mgr: node81(active), standbys: node85, node82
osd: 6 osds: 6 up, 6 in
data:
pools: 8 pools, 680 pgs
objects: 0 objects, 0B
usage: 6.09GiB used, 29.9GiB / 36.0GiB avail
pgs: 680 active+clean
[root@node81 ~]# ceph --show-config | grep "mon_max_pg_per_osd"
mon_max_pg_per_osd = 200
[root@node81 ~]#
- Check the Ceph version and the default value of mon_max_pg_per_osd:
[root@node81 ceph]# ceph version
ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable)
[root@node81 ceph]#
[root@node81 ceph]#
[root@node81 ceph]# ceph --show-config | grep "mon_max_pg_per_osd"
mon_max_pg_per_osd = 200
[root@node81 ceph]#
[root@node81 ceph]#
- Fix
[root@node81 ceph]#
[root@node81 ceph]# cp ceph.conf ceph.conf_back
[root@node81 ceph]#
[root@node81 ceph]# vim ceph.conf
[root@node81 ceph]#
[root@node81 ceph]# ceph-deploy --overwrite-conf config push node82 node85
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.0): /usr/bin/ceph-deploy --overwrite-conf config push node82 node85
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : True
[ceph_deploy.cli][INFO ] subcommand : push
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf :
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] client : ['node82', 'node85']
[ceph_deploy.cli][INFO ] func :
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.config][DEBUG ] Pushing config to node82
[node82][DEBUG ] connected to host: node82
[node82][DEBUG ] detect platform information from remote host
[node82][DEBUG ] detect machine type
[node82][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.config][DEBUG ] Pushing config to node85
[node85][DEBUG ] connected to host: node85
[node85][DEBUG ] detect platform information from remote host
[node85][DEBUG ] detect machine type
[node85][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[root@node81 ceph]#
[root@node81 ceph]# systemctl restart ceph-mgr.target
[root@node81 ceph]#
[root@node81 ceph]# ceph --show-config | grep "mon_max_pg_per_osd"
mon_max_pg_per_osd = 800
[root@node81 ceph]#
- Check the cluster status
[root@node81 ceph]# ceph -s
cluster:
id: 3b7d0bb9-83b3-431c-8a00-77b34e3ffb08
health: HEALTH_OK
services:
mon: 3 daemons, quorum node81,node82,node85
mgr: node81(active), standbys: node85, node82
osd: 6 osds: 6 up, 6 in
data:
pools: 8 pools, 680 pgs
objects: 0 objects, 0B
usage: 6.09GiB used, 29.9GiB / 36.0GiB avail
pgs: 680 active+clean
[root@node81 ceph]#
References
Ceph Luminous release changes