The previous post covered installing GlusterFS on CentOS, creating a simple volume, and mounting it with the native FUSE client. Today we focus on the types of volumes that can be created on GlusterFS.
1. First, the lab environment. Five virtual machines are used today: four act as servers:
servera.lab.example.com
serverb.lab.example.com
serverc.lab.example.com
serverd.lab.example.com
and one acts as the client:
workstation.lab.example.com
The firewalls on all five machines have been opened so they can reach one another, all hosts can resolve each other's hostnames, and the GlusterFS packages are already installed on the four servers, which together form a trusted storage pool:
[root@servera ~]# gluster pool list
UUID                                    Hostname                        State
d61aaee4-efe5-4f60-9216-c65fdb0b65f8    serverb.lab.example.com         Connected
be6e1fe5-ae7d-40d1-901e-8b58fd0a4be3    serverc.lab.example.com         Connected
25dda180-3285-47f6-8595-3cb9bdaab92f    serverd.lab.example.com         Connected
860cd46c-390b-430f-8a01-d3433fa2775c    localhost                       Connected
2. Create nine bricks on each of the four servers and mount them under /bricks/thinvol$n, running the following commands in turn (unless otherwise noted, every command below is executed on all four servers; servera is shown as the example):
a. In the lab environment, an LVM volume group vg_bricks has been created on each server in advance:
[root@servera ~]# vgs
  VG        #PV #LV #SN Attr   VSize  VFree
  vg_bricks   1  10   0 wz--n- 20.00g 9.97g
b. Create the thin pools and thin logical volumes. Note that `lvcreate -L 2G -T vg_bricks/pool$i` only creates the thin pool; a second lvcreate with -V is needed to carve out the thin volumes thinvol$i that the later commands format and mount (re-running the first loop simply reports that the pools already exist):
[root@servera ~]# for i in {1..9}; do lvcreate -L 2G -T vg_bricks/pool$i; done
  Logical volume "pool1" created.
  Logical volume "pool2" created.
  Logical volume "pool3" created.
  Logical volume "pool4" created.
  Logical volume "pool5" created.
  Logical volume "pool6" created.
  Logical volume "pool7" created.
  Logical volume "pool8" created.
  Logical volume "pool9" created.
[root@servera ~]# for i in {1..9}; do lvcreate -V 2G -T vg_bricks/pool$i -n thinvol$i; done
c. Format the logical volumes and mount them on the target directories. In production, add the mounts to /etc/fstab so the bricks come back automatically at boot:
[root@servera ~]# for i in {1..9}; do mkfs -t xfs -i size=512 /dev/mapper/vg_bricks-thinvol$i; done
[root@servera ~]# mkdir -p /bricks/thinvol{1..9}
[root@servera ~]# for i in {1..9}; do mount -t xfs /dev/mapper/vg_bricks-thinvol$i /bricks/thinvol$i; done
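As a minimal sketch of the fstab advice above (hypothetical: it writes to a scratch file instead of the real /etc/fstab, and assumes the nine vg_bricks-thinvol$n devices created earlier):

```shell
#!/bin/sh
# Sketch: generate the nine fstab entries for the brick mounts.
# FSTAB points at a scratch file here; on a real server this would be
# appended to /etc/fstab with >>, keeping the existing entries intact.
FSTAB=${FSTAB:-/tmp/fstab.bricks}
: > "$FSTAB"
for i in 1 2 3 4 5 6 7 8 9; do
    printf '/dev/mapper/vg_bricks-thinvol%s /bricks/thinvol%s xfs defaults 0 0\n' \
        "$i" "$i" >> "$FSTAB"
done
cat "$FSTAB"
```

After a reboot, `mount -a` (or the boot process itself) then remounts all nine bricks.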
d. Create the brick directories and set the SELinux context (SELinux is assumed to be enabled here; note that chcon does not survive a filesystem relabel, so in production semanage fcontext plus restorecon is more durable):
[root@servera ~]# for i in {1..9}; do mkdir /bricks/thinvol$i/brick; done
[root@servera ~]# chcon -R -t glusterd_brick_t /bricks/thinvol{1..9}
That completes the preparation.
3. Create a distributed volume (Distribute)
[root@servera ~]# gluster volume create Test01 \
> servera.lab.example.com:/bricks/thinvol1/brick \
> serverb.lab.example.com:/bricks/thinvol1/brick \
> serverc.lab.example.com:/bricks/thinvol1/brick \
> serverd.lab.example.com:/bricks/thinvol1/brick
volume create: Test01: success: please start the volume to access data
Check the volume status; right after creation the volume is in the "not started" state:
[root@servera ~]# gluster volume status Test01
Volume Test01 is not started
Start the volume:
[root@servera ~]# gluster volume start Test01
volume start: Test01: success
Querying the volume details now shows that the volume is started and that its type is Distribute:
[root@servera ~]# gluster volume info Test01
Volume Name: Test01
Type: Distribute
Volume ID: f40beb82-81ae-42d2-bd1a-a7b9a24abe63
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: servera.lab.example.com:/bricks/thinvol1/brick
Brick2: serverb.lab.example.com:/bricks/thinvol1/brick
Brick3: serverc.lab.example.com:/bricks/thinvol1/brick
Brick4: serverd.lab.example.com:/bricks/thinvol1/brick
Options Reconfigured:
performance.readdir-ahead: on
Mount volume Test01 on workstation.lab.example.com:
[root@workstation ~]# yum install -y glusterfs-fuse
[root@workstation ~]# mkdir /mnt/Test01
[root@workstation ~]# mount -t glusterfs servera.lab.example.com:Test01 /mnt/Test01
[root@workstation ~]# df -h | grep Test01
servera.lab.example.com:Test01  8.0G  131M  7.9G   2% /mnt/Test01
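If the client mount should survive a reboot, an /etc/fstab entry along these lines can be added (a config sketch; `_netdev` defers the mount until the network is up):

```
# /etc/fstab (client side)
servera.lab.example.com:Test01  /mnt/Test01  glusterfs  defaults,_netdev  0 0
```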
This shows a key property of a distributed volume: much like RAID 0, the volume's capacity is the sum of its four bricks (each brick is 2 GB, so 2 GB x 4 = 8 GB in total).
Create 100 test files under /mnt/Test01; the files end up "randomly" distributed across the bricks on the four servers:
[root@workstation ~]# cd /mnt/Test01/
[root@workstation Test01]# touch {1..100}.file
[root@servera ~]# ls /bricks/thinvol1/brick/
100.file  13.file  18.file  29.file  32.file  37.file  40.file  47.file  61.file  6.file   76.file  7.file   97.file
12.file   17.file  28.file  2.file   34.file  39.file  44.file  54.file  62.file  75.file  77.file  94.file
[root@serverb ~]# ls /bricks/thinvol1/brick/
10.file  1.file   25.file  38.file  42.file  52.file  56.file  59.file  66.file  80.file  86.file  98.file
14.file  24.file  26.file  41.file  50.file  53.file  57.file  5.file   74.file  83.file  87.file
...
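The "random" placement is actually deterministic: the distribute translator hashes each file name and places the file on the brick that owns that hash range. A toy model of the idea, using cksum instead of Gluster's real 32-bit elastic hash (so the per-brick counts will not match the listings above, but the roughly even split does):

```shell
#!/bin/sh
# Toy model of distribute-style placement: hash each file name and take
# it modulo the brick count. This is NOT Gluster's actual DHT (which
# assigns hash *ranges* to bricks); it only illustrates why the split is
# roughly even, and stable for a given file name.
bricks=4
c0=0; c1=0; c2=0; c3=0
i=1
while [ "$i" -le 100 ]; do
    h=$(printf '%s' "$i.file" | cksum | cut -d' ' -f1)
    case $((h % bricks)) in
        0) c0=$((c0+1)) ;;
        1) c1=$((c1+1)) ;;
        2) c2=$((c2+1)) ;;
        3) c3=$((c3+1)) ;;
    esac
    i=$((i+1))
done
echo "brick0=$c0 brick1=$c1 brick2=$c2 brick3=$c3 total=$((c0+c1+c2+c3))"
```

Because placement depends only on the name, renaming a file can change which brick it hashes to, which is why Gluster leaves a link-to pointer behind rather than moving data immediately.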
4. Create a replicated volume (Replicate)
[root@servera ~]# gluster volume create Test02 replica 4 \
> servera.lab.example.com:/bricks/thinvol2/brick \
> serverb.lab.example.com:/bricks/thinvol2/brick \
> serverc.lab.example.com:/bricks/thinvol2/brick \
> serverd.lab.example.com:/bricks/thinvol2/brick
volume create: Test02: success: please start the volume to access data
Start volume Test02 and query its details; the volume type is Replicate, i.e. a replicated volume:
[root@servera ~]# gluster volume start Test02
volume start: Test02: success
[root@servera ~]# gluster volume info Test02
Volume Name: Test02
Type: Replicate
Volume ID: 305f2c2f-8550-40e0-9d75-a2bc8149a333
Status: Started
Number of Bricks: 1 x 4 = 4
Transport-type: tcp
Bricks:
Brick1: servera.lab.example.com:/bricks/thinvol2/brick
Brick2: serverb.lab.example.com:/bricks/thinvol2/brick
Brick3: serverc.lab.example.com:/bricks/thinvol2/brick
Brick4: serverd.lab.example.com:/bricks/thinvol2/brick
Options Reconfigured:
performance.readdir-ahead: on
Mount Test02 at /mnt/Test02 on the client:
[root@workstation Test01]# mkdir /mnt/Test02
[root@workstation Test01]# mount -t glusterfs servera.lab.example.com:Test02 /mnt/Test02
[root@workstation Test02]# df -h | grep Test02
servera.lab.example.com:Test02  2.0G   33M  2.0G   2% /mnt/Test02
Test02's mounted size is 2 GB, the same as each individual brick that composes it; its behavior is analogous to RAID 1.
Create 100 test files under /mnt/Test02; each server's brick receives a full copy of all 100 files:
[root@workstation Test02]# touch {1..100}.file
[root@servera ~]# ls /bricks/thinvol2/brick/
100.file  16.file  22.file  29.file  35.file  41.file  48.file  54.file  60.file  67.file  73.file  7.file   86.file  92.file  99.file
10.file   17.file  23.file  2.file   36.file  42.file  49.file  55.file  61.file  68.file  74.file  80.file  87.file  93.file  9.file
11.file   18.file  24.file  30.file  37.file  43.file  4.file   56.file  62.file  69.file  75.file  81.file  88.file  94.file
12.file   19.file  25.file  31.file  38.file  44.file  50.file  57.file  63.file  6.file   76.file  82.file  89.file  95.file
13.file   1.file   26.file  32.file  39.file  45.file  51.file  58.file  64.file  70.file  77.file  83.file  8.file   96.file
14.file   20.file  27.file  33.file  3.file   46.file  52.file  59.file  65.file  71.file  78.file  84.file  90.file  97.file
15.file   21.file  28.file  34.file  40.file  47.file  53.file  5.file   66.file  72.file  79.file  85.file  91.file  98.file
[root@serverb ~]# ls /bricks/thinvol2/brick/ | wc -w
100
[root@serverc ~]# ls /bricks/thinvol2/brick/ | wc -w
100
[root@serverd ~]# ls /bricks/thinvol2/brick/ | wc -w
100
5. Create a dispersed volume (Disperse)
[root@servera ~]# for BRICKNUM in {3..5}; do
> for node in {a..d}; do
> echo server$node.lab.example.com:/bricks/thinvol$BRICKNUM/brick
> done
> done > /tmp/Test03.txt
[root@servera ~]# cat /tmp/Test03.txt
servera.lab.example.com:/bricks/thinvol3/brick
serverb.lab.example.com:/bricks/thinvol3/brick
serverc.lab.example.com:/bricks/thinvol3/brick
serverd.lab.example.com:/bricks/thinvol3/brick
servera.lab.example.com:/bricks/thinvol4/brick
serverb.lab.example.com:/bricks/thinvol4/brick
serverc.lab.example.com:/bricks/thinvol4/brick
serverd.lab.example.com:/bricks/thinvol4/brick
servera.lab.example.com:/bricks/thinvol5/brick
serverb.lab.example.com:/bricks/thinvol5/brick
serverc.lab.example.com:/bricks/thinvol5/brick
serverd.lab.example.com:/bricks/thinvol5/brick
[root@servera ~]# gluster volume create Test03 disperse-data 4 redundancy 2 $(cat /tmp/Test03.txt)
This first create attempt failed. The error message explains that in a disperse volume, placing multiple bricks on the same server is discouraged (losing one server would then cost more fragments than the redundancy level can cover). We are constrained by the lab environment here, though, so to insist on using multiple bricks from the same server, the command must be rerun with a trailing force option.
[root@servera ~]# gluster volume create Test03 disperse-data 4 redundancy 2 $(cat /tmp/Test03.txt) force
[root@servera ~]# gluster volume start Test03
volume start: Test03: success
[root@servera ~]# gluster volume info Test03
Volume Name: Test03
Type: Distributed-Disperse
Volume ID: b4d90a15-9a2b-46ec-a90f-e6f96b48d77b
Status: Started
Number of Bricks: 2 x (4 + 2) = 12
Transport-type: tcp
Bricks:
Brick1: servera.lab.example.com:/bricks/thinvol3/brick
Brick2: serverb.lab.example.com:/bricks/thinvol3/brick
Brick3: serverc.lab.example.com:/bricks/thinvol3/brick
Brick4: serverd.lab.example.com:/bricks/thinvol3/brick
Brick5: servera.lab.example.com:/bricks/thinvol4/brick
Brick6: serverb.lab.example.com:/bricks/thinvol4/brick
Brick7: serverc.lab.example.com:/bricks/thinvol4/brick
Brick8: serverd.lab.example.com:/bricks/thinvol4/brick
Brick9: servera.lab.example.com:/bricks/thinvol5/brick
Brick10: serverb.lab.example.com:/bricks/thinvol5/brick
Brick11: serverc.lab.example.com:/bricks/thinvol5/brick
Brick12: serverd.lab.example.com:/bricks/thinvol5/brick
Options Reconfigured:
performance.readdir-ahead: on
With the commands above we have created and started the dispersed volume Test03; strictly speaking, it is a 2 x (4 + 2) distributed-disperse volume.
### Red Hat's recommended disperse configurations (bricks = data + redundancy): a. 6 = 4 + 2; b. 11 = 8 + 3; c. 12 = 8 + 4
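For a single disperse set, the usable fraction of raw space is data/(data + redundancy). A quick check of the three recommended layouts from the note above:

```shell
#!/bin/sh
# Usable capacity of a disperse set is data/(data+redundancy) of raw space.
# Check the three recommended brick counts.
for cfg in "4 2" "8 3" "8 4"; do
    set -- $cfg
    data=$1; red=$2
    total=$((data + red))
    pct=$((100 * data / total))   # integer percent of raw space that is usable
    echo "$total bricks = $data data + $red redundancy: ${pct}% usable"
done
```

So 4+2 and 8+4 give two-thirds efficiency while tolerating the loss of 2 and 4 bricks respectively; 8+3 is the most space-efficient of the three.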
Mount the dispersed volume Test03 at /mnt/Test03 on the client:
[root@workstation Test02]# mkdir /mnt/Test03
[root@workstation Test02]# mount -t glusterfs servera.lab.example.com:Test03 /mnt/Test03
[root@workstation Test03]# df -h | grep Test03
servera.lab.example.com:Test03   16G  361M   16G   3% /mnt/Test03
Test03's usable space is 16 GB, the size of 8 of the 12 bricks (2 subvolumes x 4 data bricks x 2 GB); the raw-space efficiency is 16/(2 x 12) = 2/3.
Create a 100 MB test file under /mnt/Test03:
[root@workstation Test03]# dd if=/dev/zero of=/mnt/Test03/100M.file bs=1M count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 21.1374 s, 5.0 MB/s
[root@workstation Test03]# ls -lh
total 100M
-rw-r--r--. 1 root root 100M Jul 29 19:44 100M.file
Check brick usage on each server:
[root@servera ~]# ls -lh /bricks/thinvol{3..5}/brick
/bricks/thinvol3/brick:
total 25M
-rw-r--r--. 2 root root 25M Jul 29 19:44 100M.file

/bricks/thinvol4/brick:
total 25M
-rw-r--r--. 2 root root 25M Jul 29 19:44 100M.file

/bricks/thinvol5/brick:
total 0
[root@serverb ~]# ls -lh /bricks/thinvol{3..5}/brick
/bricks/thinvol3/brick:
total 25M
-rw-r--r--. 2 root root 25M Jul 29 19:44 100M.file

/bricks/thinvol4/brick:
total 25M
-rw-r--r--. 2 root root 25M Jul 29 19:44 100M.file

/bricks/thinvol5/brick:
total 0
[root@serverc ~]# ls -lh /bricks/thinvol{3..5}/brick
/bricks/thinvol3/brick:
total 25M
-rw-r--r--. 2 root root 25M Jul 29 19:44 100M.file

/bricks/thinvol4/brick:
total 0

/bricks/thinvol5/brick:
total 0
[root@serverd ~]# ls -lh /bricks/thinvol{3..5}/brick
/bricks/thinvol3/brick:
total 25M
-rw-r--r--. 2 root root 25M Jul 29 19:44 100M.file

/bricks/thinvol4/brick:
total 0

/bricks/thinvol5/brick:
total 0
The bricks in use are:
Brick1: servera.lab.example.com:/bricks/thinvol3/brick
Brick2: serverb.lab.example.com:/bricks/thinvol3/brick
Brick3: serverc.lab.example.com:/bricks/thinvol3/brick
Brick4: serverd.lab.example.com:/bricks/thinvol3/brick
Brick5: servera.lab.example.com:/bricks/thinvol4/brick
Brick6: serverb.lab.example.com:/bricks/thinvol4/brick
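That listing matches the disperse math: the 100 MB file lands on one 4+2 subvolume, each of the four data bricks stores a 100/4 = 25 MB fragment, and the two redundancy bricks each hold a 25 MB erasure-coded fragment, consuming 150 MB of raw space in total. A sketch of that arithmetic:

```shell
#!/bin/sh
# Fragment sizes for one file on a 4+2 disperse subvolume.
filesize_mb=100
data=4
redundancy=2
frag=$((filesize_mb / data))          # MB stored on every brick of the set
raw=$((frag * (data + redundancy)))   # MB of raw space consumed in total
echo "fragment=${frag}MB raw=${raw}MB"
```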
6. Create a combined volume
A combined volume combines the basic types above: it can be distributed-replicated or distributed-dispersed, but there does not appear to be a replicated-dispersed combination.
[root@servera ~]# gluster volume create Test04 replica 3 \
> servera.lab.example.com:/bricks/thinvol6/brick \
> serverb.lab.example.com:/bricks/thinvol6/brick \
> serverc.lab.example.com:/bricks/thinvol6/brick \
> servera.lab.example.com:/bricks/thinvol7/brick \
> serverb.lab.example.com:/bricks/thinvol7/brick \
> serverc.lab.example.com:/bricks/thinvol7/brick
volume create: Test04: success: please start the volume to access data
[root@servera ~]# gluster volume start Test04
volume start: Test04: success
[root@servera ~]# gluster volume info Test04
Volume Name: Test04
Type: Distributed-Replicate
Volume ID: 45df998d-2d14-4af2-83c7-e497bd5a8dd0
Status: Started
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: servera.lab.example.com:/bricks/thinvol6/brick
Brick2: serverb.lab.example.com:/bricks/thinvol6/brick
Brick3: serverc.lab.example.com:/bricks/thinvol6/brick
Brick4: servera.lab.example.com:/bricks/thinvol7/brick
Brick5: serverb.lab.example.com:/bricks/thinvol7/brick
Brick6: serverc.lab.example.com:/bricks/thinvol7/brick
Options Reconfigured:
performance.readdir-ahead: on
The output shows that volume Test04's type is Distributed-Replicate.
Mount Test04 on the client; its usable space is 4 GB:
[root@workstation Test03]# mkdir /mnt/Test04
[root@workstation Test03]# mount -t glusterfs servera.lab.example.com:Test04 /mnt/Test04
[root@workstation Test03]# df -h | grep Test04
servera.lab.example.com:Test04  4.0G   66M  4.0G   2% /mnt/Test04
Create 100 test files under /mnt/Test04 and look at how the data is distributed:
[root@workstation Test04]# touch {1..100}.file
[root@servera ~]# ls /bricks/thinvol6/brick/ | wc -w
48
[root@servera ~]# ls /bricks/thinvol7/brick/ | wc -w
52
[root@serverb ~]# ls /bricks/thinvol6/brick/ | wc -w
48
[root@serverb ~]# ls /bricks/thinvol7/brick/ | wc -w
52
[root@serverc ~]# ls /bricks/thinvol6/brick/ | wc -w
48
[root@serverc ~]# ls /bricks/thinvol7/brick/ | wc -w
52
We can see that
Brick1: servera.lab.example.com:/bricks/thinvol6/brick
Brick2: serverb.lab.example.com:/bricks/thinvol6/brick
Brick3: serverc.lab.example.com:/bricks/thinvol6/brick
form one 3-way replica set, each brick holding identical data, and
Brick4: servera.lab.example.com:/bricks/thinvol7/brick
Brick5: serverb.lab.example.com:/bricks/thinvol7/brick
Brick6: serverc.lab.example.com:/bricks/thinvol7/brick
form the second 3-way replica set, again with identical data on each brick.
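The same kind of arithmetic explains the 4 GB seen in df: six 2 GB bricks at replica count 3 give 6 x 2 / 3 = 4 GB usable:

```shell
#!/bin/sh
# Usable capacity of the 2 x 3 distributed-replicate volume Test04:
# raw space divided by the replica count.
bricks=6
brick_gb=2
replica=3
usable=$((bricks * brick_gb / replica))
echo "usable=${usable}GB"
```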
Of course, because of the constraints of the lab environment, the examples here take two bricks from each server; in production this should be avoided wherever possible. OK, that's it for today.