A First Look at GlusterFS: Usage Notes and FAQ
2017/2/7
Note: a 3-way replica (1x3) is recommended. A 1x2 replica is risky, and so is 2x2; both are prone to split-brain.

I. Configuration

1. Quick walkthrough: creating a volume and putting it into service

[Partition the data disks]

1) ext4

~]# fdisk /dev/sdb <<_EOF
n
p
1


p
w
_EOF

(The two blank lines accept the default first and last sectors.)

~]# mkfs.ext4 /dev/sdb1
~]# mkdir /data
~]# cat <<_EOF >>/etc/fstab
UUID=$(blkid /dev/sdb1 |cut -d'"' -f2) /data ext4 defaults 0 0
_EOF
~]# mount -a

2) xfs

If the device is already mounted, unmount it and wipe the existing filesystem first.

yum install lvm2 xfsprogs -y
pvcreate /dev/sdb
vgcreate vg0 /dev/sdb
lvcreate -l 100%FREE -n lv01 vg0
mkfs.xfs -f -i size=512 /dev/vg0/lv01
mkdir /data
cat <<_EOF >>/etc/fstab
UUID=$(blkid /dev/vg0/lv01 |cut -d'"' -f2) /data xfs defaults 0 0
_EOF
mount -a

# df -h |grep data
/dev/mapper/vg0-lv01  16T  33M  16T  1% /data

[Set up the service] (using 10.60.200.11 as the example)

yum install glusterfs-server
service glusterd start
chkconfig glusterd on

[Build the trusted pool]

gluster peer probe 10.60.200.12
gluster peer probe 10.60.200.13

Create the brick directory on every node in the pool:

mkdir /data/gv1/brick1 -p

[Provide the data domain]

Create volume gv1 as the master data domain:

# gluster volume create gv1 replica 3 transport tcp \
10.60.200.11:/data/gv1/brick1 \
10.60.200.12:/data/gv1/brick1 \
10.60.200.13:/data/gv1/brick1

[Start it]

# gluster volume start gv1

[Check the result]

# gluster volume info

Volume Name: gv1
Type: Replicate
Volume ID: 32b1866c-1743-4dd9-9429-6ecfdfa168a2
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.60.200.11:/data/gv1/brick1
Brick2: 10.60.200.12:/data/gv1/brick1
Brick3: 10.60.200.13:/data/gv1/brick1

Mount test:

mount -t glusterfs 10.60.200.11:/gv1 /mnt/test
~]# df -h /mnt/test/
Filesystem         Size  Used  Avail  Use%  Mounted on
10.60.200.11:/gv1  5.4T  1.8T   3.4T   34%  /mnt/test

Add it to fstab:

10.60.200.11:/gv1 /mnt/test glusterfs defaults,_netdev,backup-volfile-servers=10.60.200.12:10.60.200.13 0 0

2. Mount options for naming backup servers

The legacy option, still accepted by 3.6 and later for backward compatibility:

mount -t glusterfs -o backupvolfile-server=10.60.200.12 10.60.200.11:/gv1 /mnt/test

The option recommended from 3.6 onward:

mount -t glusterfs -o backup-volfile-servers=10.60.200.12 10.60.200.11:/gv1 /mnt/test

Several backup servers, colon-separated:

mount -t glusterfs -o backup-volfile-servers=10.60.200.12:10.60.200.13 10.60.200.11:/gv1 /mnt/test

Note in particular: mount -t glusterfs and mount.glusterfs are NOT invoked the same way (see the sketches at the end of Part I).

3. Volume options

Example: the tuning applied for oVirt. After tuning, the volume options read:

---
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
storage.owner-gid: 36
storage.owner-uid: 36
cluster.server-quorum-type: server
cluster.quorum-type: auto
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
auth.allow: *
user.cifs: enable
nfs.disable: off
performance.readdir-ahead: on
---

The commands, using gv1 as the example:

gluster volume set gv1 diagnostics.count-fop-hits on
gluster volume set gv1 diagnostics.latency-measurement on
gluster volume set gv1 storage.owner-gid 36
gluster volume set gv1 storage.owner-uid 36
gluster volume set gv1 cluster.server-quorum-type server
gluster volume set gv1 cluster.quorum-type auto
gluster volume set gv1 network.remote-dio enable
gluster volume set gv1 cluster.eager-lock enable
gluster volume set gv1 performance.stat-prefetch off
gluster volume set gv1 performance.io-cache off
gluster volume set gv1 performance.read-ahead off
gluster volume set gv1 performance.quick-read off
gluster volume set gv1 auth.allow \*
gluster volume set gv1 user.cifs enable
gluster volume set gv1 nfs.disable off
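To apply the whole list from I.3 without typing each command, the pairs can be fed to gluster volume set from a loop. This is a convenience sketch, not from the original notes; the volume name gv1 and the option list are taken from above:

# read "option value" pairs line by line and apply each one to the volume
vol=gv1
while read -r opt val; do
    gluster volume set "$vol" "$opt" "$val"
done <<'_EOF'
diagnostics.count-fop-hits on
diagnostics.latency-measurement on
storage.owner-gid 36
storage.owner-uid 36
cluster.server-quorum-type server
cluster.quorum-type auto
network.remote-dio enable
cluster.eager-lock enable
performance.stat-prefetch off
performance.io-cache off
performance.read-ahead off
performance.quick-read off
auth.allow *
user.cifs enable
nfs.disable off
_EOF

# verify afterwards: "gluster volume info gv1" lists the changes under "Options Reconfigured"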
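On the caution in I.2 above: the original notes do not spell out the difference, but one visible difference is argument order. The sketch below assumes mount.glusterfs follows the standard mount(8) helper convention (device and mount point first, options after); that convention is an assumption here, not something stated in the notes.

# through the mount front-end, the options come before the device:
mount -t glusterfs -o backup-volfile-servers=10.60.200.12:10.60.200.13 10.60.200.11:/gv1 /mnt/test

# calling the helper script directly, the device and mount point come first:
mount.glusterfs 10.60.200.11:/gv1 /mnt/test -o backup-volfile-servers=10.60.200.12:10.60.200.13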
II. Adjustments

1. Growing a distributed-replicate volume by a brick pair

1) Initial state

[root@n11 ~]# gluster volume info

Volume Name: gv1
Type: Replicate
Volume ID: 32b1866c-1743-4dd9-9429-6ecfdfa168a2
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.60.200.11:/data/gv1/brick1
Brick2: 10.60.200.12:/data/gv1/brick1

2) Add one pair of bricks. Note: with replica=2, bricks must be added in pairs. Existing data will not move onto the new pair by itself; see the rebalance sketch after this part.

[root@n11 ~]# gluster volume add-brick gv1 replica 2 10.60.200.21:/data/gv1/brick1 10.60.200.22:/data/gv1/brick1
[root@n11 ~]# gluster volume info

Volume Name: gv1
Type: Distributed-Replicate
Volume ID: 32b1866c-1743-4dd9-9429-6ecfdfa168a2
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.60.200.11:/data/gv1/brick1
Brick2: 10.60.200.12:/data/gv1/brick1
Brick3: 10.60.200.21:/data/gv1/brick1
Brick4: 10.60.200.22:/data/gv1/brick1

2. Raising the replica count of a replicate volume

1) Initial state

[root@n11 ~]# gluster volume info

Volume Name: ovirt-data
Type: Replicate
Volume ID: 6a2f6489-bb8f-43d9-b46b-6258d95c7571
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.60.200.31:/data/ovirt/restore
Brick2: 10.60.200.12:/data/ovirt/data

2) Add a replica, going from 2 copies to 3. The syntax is:

gluster volume add-brick [volume name] replica [new replica count] [new brick path]

[root@n11 ~]# gluster volume add-brick ovirt-data replica 3 10.60.200.11:/data/ovirt/restore
volume add-brick: failed: One or more nodes do not support the required op-version. Cluster op-version must atleast be 30600.

The operation fails; here is how to resolve it:

[root@n11 ~]# gluster volume set ovirt-data op-version 30600
volume set: failed: Option "cluster.op-version" is not valid for a single volume

So op-version is a cluster-wide option and cannot be set for a single volume.

[root@n11 ~]# gluster volume set all op-version 30600
volume set: success

Retry:

[root@n11 ~]# gluster volume add-brick ovirt-data replica 3 10.60.200.11:/data/ovirt/restore
volume add-brick: success
[root@n11 ~]# gluster volume info ovirt-data

Volume Name: ovirt-data
Type: Replicate
Volume ID: 6a2f6489-bb8f-43d9-b46b-6258d95c7571
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.60.200.31:/data/ovirt/restore
Brick2: 10.60.200.12:/data/ovirt/data
Brick3: 10.60.200.11:/data/ovirt/restore
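A follow-up to II.1, as a sketch rather than part of the original notes: on a distributed volume, files already written stay on the old bricks until a rebalance runs, so only new files land on the added pair. The usual sequence after add-brick is:

gluster volume rebalance gv1 start
gluster volume rebalance gv1 status    # repeat until every node reports completed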
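And a note on II.2: before raising the op-version it helps to see what the cluster currently runs. A sketch, assuming the stock glusterd state-file location; recent releases can also query it with "gluster volume get all cluster.op-version":

# each node records its operating version in glusterd's info file
grep operating-version /var/lib/glusterd/glusterd.info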
III. Failure handling

1. Repair example: in a 2-replica volume, one brick host dies and cannot be brought back online. What now?

1) Current state

gluster> volume info

Volume Name: gv1
Type: Replicate
Volume ID: 32b1866c-1743-4dd9-9429-6ecfdfa168a2
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.60.200.11:/data/gv1/brick1
Brick2: 10.60.200.12:/data/gv1/brick1

2) Assume 10.60.200.11 is dead

Prepare a replacement brick: 10.60.200.31:/data/gv1/brick1

3) Procedure

On 10.60.200.12, probe the new node into the pool:

gluster> peer probe 10.60.200.31

Run replace-brick:

gluster> volume replace-brick gv1 10.60.200.11:/data/gv1/brick1 10.60.200.31:/data/gv1/brick1 commit force
volume replace-brick: success: replace-brick commit successful

Check the status:

gluster> volume status
Status of volume: gv1
Gluster process                        Port   Online  Pid
------------------------------------------------------------------------------
Brick 10.60.200.31:/data/gv1/brick1    49153  Y       9188
Brick 10.60.200.12:/data/gv1/brick1    49152  Y       23944
NFS Server on localhost                N/A    N       N/A
Self-heal Daemon on localhost          N/A    Y       14839
NFS Server on 10.60.200.31             N/A    N       N/A
Self-heal Daemon on 10.60.200.31       N/A    N       9199

Task Status of Volume gv1
------------------------------------------------------------------------------
There are no active volume tasks

gluster> volume info

Volume Name: gv1
Type: Replicate
Volume ID: 32b1866c-1743-4dd9-9429-6ecfdfa168a2
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.60.200.31:/data/gv1/brick1
Brick2: 10.60.200.12:/data/gv1/brick1

Check healing progress:

gluster> volume heal gv1 info
Brick 10.60.200.31:/data/gv1/brick1
Number of entries: 0

Brick 10.60.200.12:/data/gv1/brick1
/09cb8372-a68d-47dc-962e-70b5225be6bc/images/1d04adc8-9921-4e1c-b192-84fba64224db/bd3c9ecf-8100-42db-9db5-038dd95a4d54 - Possibly undergoing heal
/09cb8372-a68d-47dc-962e-70b5225be6bc/images/1a719266-2b0e-4f6a-9586-825e5998c67b/74b85a40-847e-43ca-a74e-a09d5274339f
/09cb8372-a68d-47dc-962e-70b5225be6bc/images/2d7f2fee-c796-4458-a7c2-86cb2e227834/2ef4b3fa-1656-4d15-8bd4-3356eab2d37e - Possibly undergoing heal
/09cb8372-a68d-47dc-962e-70b5225be6bc/dom_md/outbox
/09cb8372-a68d-47dc-962e-70b5225be6bc/master/vms
/09cb8372-a68d-47dc-962e-70b5225be6bc/master/tasks
/09cb8372-a68d-47dc-962e-70b5225be6bc/images/48722b60-7ce6-4714-942a-4cace1076c84/cba01bf2-5011-43d2-b724-d6bca8d49477
/09cb8372-a68d-47dc-962e-70b5225be6bc/images/6b4c4ca2-8fab-476c-9321-f23872618837
/09cb8372-a68d-47dc-962e-70b5225be6bc/images/b89e79cf-98bb-44a5-b9a0-33bbff2e4e22
/09cb8372-a68d-47dc-962e-70b5225be6bc/images/560a7b8c-fbfe-4495-b315-e46593b4523e
/09cb8372-a68d-47dc-962e-70b5225be6bc/images/e4c55db2-b33a-4c0d-8bc0-dc0c421ca302
/09cb8372-a68d-47dc-962e-70b5225be6bc/images/e18d032d-c70b-4104-98b5-a3f7fb8d2e27
/09cb8372-a68d-47dc-962e-70b5225be6bc/images/f5e408f3-eded-4bd2-93fd-205cbcd41714
/09cb8372-a68d-47dc-962e-70b5225be6bc/images/2a629e0c-0029-480e-87f4-ac5f417a4448
/09cb8372-a68d-47dc-962e-70b5225be6bc/images/48722b60-7ce6-4714-942a-4cace1076c84/cba01bf2-5011-43d2-b724-d6bca8d49477.meta
/09cb8372-a68d-47dc-962e-70b5225be6bc/images/48722b60-7ce6-4714-942a-4cace1076c84/cba01bf2-5011-43d2-b724-d6bca8d49477.lease
/09cb8372-a68d-47dc-962e-70b5225be6bc/images/5f11ddd0-9f74-4845-80ed-d3e5dc081d80/6c090128-6a01-4c0e-a3e2-df4a95e30894
/09cb8372-a68d-47dc-962e-70b5225be6bc/images/2d7f2fee-c796-4458-a7c2-86cb2e227834/2ef4b3fa-1656-4d15-8bd4-3356eab2d37e.meta
/09cb8372-a68d-47dc-962e-70b5225be6bc/images/2d7f2fee-c796-4458-a7c2-86cb2e227834/2ef4b3fa-1656-4d15-8bd4-3356eab2d37e.lease
/09cb8372-a68d-47dc-962e-70b5225be6bc/images/5f11ddd0-9f74-4845-80ed-d3e5dc081d80/bc95b575-99fe-4cdf-a341-aeeed053e2b3
/09cb8372-a68d-47dc-962e-70b5225be6bc/images/5f11ddd0-9f74-4845-80ed-d3e5dc081d80/bc95b575-99fe-4cdf-a341-aeeed053e2b3.meta
/09cb8372-a68d-47dc-962e-70b5225be6bc/images/5f11ddd0-9f74-4845-80ed-d3e5dc081d80/bc95b575-99fe-4cdf-a341-aeeed053e2b3.lease
/09cb8372-a68d-47dc-962e-70b5225be6bc/images/5f11ddd0-9f74-4845-80ed-d3e5dc081d80/6c090128-6a01-4c0e-a3e2-df4a95e30894.meta
/09cb8372-a68d-47dc-962e-70b5225be6bc/images/5f11ddd0-9f74-4845-80ed-d3e5dc081d80/6c090128-6a01-4c0e-a3e2-df4a95e30894.lease
/09cb8372-a68d-47dc-962e-70b5225be6bc/images/1d04adc8-9921-4e1c-b192-84fba64224db/bd3c9ecf-8100-42db-9db5-038dd95a4d54.meta
/09cb8372-a68d-47dc-962e-70b5225be6bc/images/1d04adc8-9921-4e1c-b192-84fba64224db/bd3c9ecf-8100-42db-9db5-038dd95a4d54.lease
/09cb8372-a68d-47dc-962e-70b5225be6bc/images/1a719266-2b0e-4f6a-9586-825e5998c67b/74b85a40-847e-43ca-a74e-a09d5274339f.meta
/09cb8372-a68d-47dc-962e-70b5225be6bc/images/1a719266-2b0e-4f6a-9586-825e5998c67b/74b85a40-847e-43ca-a74e-a09d5274339f.lease
Number of entries: 28

After healing completes:

gluster> volume heal gv1 info
Brick 10.60.200.31:/data/gv1/brick1
Number of entries: 0

Brick 10.60.200.12:/data/gv1/brick1
Number of entries: 0
2. What to watch when adding or removing a node in the pool

Current state: a and b form a 1x2 replicate volume.
-----------------------------
Number of Bricks: 1 x 2 = 2
Bricks:
Brick1: a.test.com:/data/test
Brick2: b.test.com:/data/test

Add c and d, making it a 1x4 replicate volume:

gluster volume add-brick gv1 replica 4 c.test.com:/data/test d.test.com:/data/test
-----------------------------
Number of Bricks: 1 x 4 = 4
Bricks:
Brick1: a.test.com:/data/test
Brick2: b.test.com:/data/test
Brick3: c.test.com:/data/test
Brick4: d.test.com:/data/test

Remove c, making it a 1x3 replicate volume:

gluster volume remove-brick gv1 replica 3 c.test.com:/data/test force
-----------------------------
Number of Bricks: 1 x 3 = 3
Bricks:
Brick1: a.test.com:/data/test
Brick2: b.test.com:/data/test
Brick3: d.test.com:/data/test

This step can lead to data errors: a, b and d now heal against each other, and d, having just been added, may not yet hold a complete copy.

The way out here is to remove d as well and fall back to a 1x2 volume. Since a and b hold the copies closest to consistent, the heal finishes fastest:

gluster volume remove-brick gv1 replica 2 d.test.com:/data/test force
-----------------------------
Number of Bricks: 1 x 2 = 2
Bricks:
Brick1: a.test.com:/data/test
Brick2: b.test.com:/data/test

(Once a node no longer hosts any brick, it can also be dropped from the trusted pool with gluster peer detach.)

ZYXW. References

1. [Gluster-users] Replacing a failed brick
https://www.gluster.org/pipermail/gluster-users/2013-August/014000.html
2. Expanding bricks / adding a node
http://avid.force.com/pkb/articles/en_US/how_to/Adding-a-node-to-a-cluster-using-gluster
https://bugzilla.redhat.com/show_bug.cgi?id=1168897
http://comments.gmane.org/gmane.comp.file-systems.gluster.user/20225
3. Resolving Peer Rejected
http://gluster.readthedocs.io/en/latest/Administrator%20Guide/Resolving%20Peer%20Rejected/
4. KVM virtualization open-source HA solution (7): GlusterFS setup and common failure handling
http://xiaoli110.blog.51cto.com/1724/1071106
5. Accessing Data - Setting Up GlusterFS Client
http://gluster.readthedocs.io/en/latest/Administrator%20Guide/Setting%20Up%20Clients/