Common GlusterFS Operations with Intel CAS Acceleration

Intel CAS is short for Intel Cache Acceleration Software.
Official website: http://www.intel.com/support/go/cas

1. Using Intel CAS

Intel CAS ships with built-in help documentation.
When paired with Intel's own flash products, the software imposes no limit on cache capacity.
With other vendors' flash, each cache is capped at 200 GB.

Here is a look at the configuration tool, casadm:
[root@node1 ~]# casadm -H
WARNING: Intel(R) CAS is running on a non-validated OS!
Intel(R) Cache Acceleration Software Linux

Usage: casadm <command> [option...]

Available commands:
   -S  --start-cache              Start new cache instance or load using metadata
   -T  --stop-cache               Stop cache instance
   -X  --set-param                Set various runtime parameters
   -G  --get-param                Get various runtime parameters
   -Q  --set-cache-mode           Set cache mode
   -A  --add-core                 Add core device to cache instance
   -R  --remove-core              Remove core device from cache instance
       --remove-detached          Remove core device from core pool
   -L  --list-caches              List all cache instances and core devices
   -P  --stats                    Print statistics for cache instance
   -Z  --reset-counters           Reset cache statistics for core device within cache instance
   -F  --flush-cache              Flush all dirty data from the caching device to core devices
   -E  --flush-core               Flush dirty data of a given core from the caching device to this core device
   -D  --flush-parameters         Change flush thread parameters (OBSOLETE)
   -C  --io-class                 Manage IO classes
   -N  --nvme                     Manage NVMe namespace
   -V  --version                  Print Intel(R) CAS version
   -H  --help                     Print help

For detailed help on the above commands use --help after the command.
e.g.
   casadm --start-cache --help
For more information, please refer to manual, Admin Guide (man casadm)
or go to support page <http://www.intel.com/support/go/cas>.
[root@node1 ~]# 

2. Installing Intel CAS on Linux

The installer is typically named:

installer-Intel-CAS-xxxxxxxxxxxxxxxxxxx.run

Transfer it to the host (e.g., via FTP), make it executable, and run it directly.

The latest version at the time of writing is 3.6, which supports RHEL 6.8/6.9/7.3/7.4 and CentOS 6.8/6.9/7.3/7.4.
With OEL, switch the kernel to the RHEL-compatible kernel first.
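
A minimal sketch of the installation steps (the exact .run filename is release-specific):

# transfer the installer to the host, then:
chmod +x installer-Intel-CAS-xxxxxxxxxxxxxxxxxxx.run
./installer-Intel-CAS-xxxxxxxxxxxxxxxxxxx.run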

3. Configuring Intel CAS

1. The core configuration file is /etc/intelcas/intelcas.conf:
[root@node1 intelcas]# ls
intelcas.conf  ioclass-config.csv  validated_configurations.csv
[root@node1 intelcas]# pwd
/etc/intelcas
[root@node1 intelcas]# 


2. Configuration file walkthrough

This file is generated automatically during installation. Initially every line is commented out, and the comments contain explanations and examples.

Uncomment the relevant lines and edit them for your environment:

## Caches configuration section
[caches]
## Cache ID Cache device Cache mode Extra fields (optional)
## Uncomment and edit the below line for cache configuration
1 /dev/disk/by-id/nvme-INTEL_SSDPEDMW012T4_CVCQ5146005Q1P2BGN WB

## Core devices configuration
[cores]
## Cache ID Core ID Core device
## Uncomment and edit the below line for core configuration
1 1 /dev/disk/by-id/wwn-0x5000c5008f07fcaf

Note: to keep device names from drifting across reboots, always reference disks by ID.

To look up the IDs:

[root@node1 intelcas]# ls -l /dev/disk/by-id/
total 0
lrwxrwxrwx 1 root root  9 Dec  3 08:56 ata-QEMU_DVD-ROM_QM00001 -> ../../sr0
lrwxrwxrwx 1 root root  9 Dec  5 16:31 virtio-49ac23b4-7bc0-4878-a -> ../../vdf
lrwxrwxrwx 1 root root  9 Dec  5 16:31 virtio-61d85d0a-6b60-4e90-9 -> ../../vdb
lrwxrwxrwx 1 root root  9 Dec  5 16:31 virtio-63623f20-e2d3-4c68-8 -> ../../vdh
lrwxrwxrwx 1 root root  9 Dec  5 16:31 virtio-7af8271a-500d-463d-8 -> ../../vdd
lrwxrwxrwx 1 root root  9 Dec  5 16:31 virtio-83930fe5-3ab3-418b-b -> ../../vdg
lrwxrwxrwx 1 root root  9 Dec  5 16:31 virtio-c3349770-d5e3-494d-b -> ../../vde
lrwxrwxrwx 1 root root  9 Dec  5 16:31 virtio-c8e6629c-d6e5-43ce-9 -> ../../vdc

Additional notes:

• <Cache ID> is a number between 1 and 16,384 (a valid cache instance number).
• <Core ID> is a number between 0 and 4095 (a valid core instance number).
• Cache and core devices must point to existing SSD and HDD devices, ideally referenced by by-id names (ls -l /dev/disk/by-id). Core devices should use WWN identifiers, while cache devices should use model and serial number. Alternatively, devices can be referenced by plain device name (e.g., when running casadm commands).
• The mode field selects the cache mode: write-through, write-back, write-around, or pass-through.
• Optional extra fields allow additional per-device settings, comma-separated:
  • ioclass_file loads a file specifying the IO class policy for this cache.
  • cleaning_policy selects the cleaning policy for this cache: acp, alru, or nop.
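
For example, a [caches] line using both optional extra fields might look like this (a sketch combining the two flags above):

1 /dev/disk/by-id/nvme-INTEL_SSDPEDMW012T4_CVCQ5146005Q1P2BGN WB ioclass_file=/etc/intelcas/ioclass-config.csv,cleaning_policy=alru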

3. Field descriptions

The [caches] section has three main columns: cache ID, device name, and cache mode.

Cache IDs start at 1; with multiple PCIe SSDs serving as caches, number them sequentially. For the device name, use the alias under /dev/disk/by-id/. The example above uses an Intel PCIe SSD with cache mode WB (write-back); WT (write-through) and PT (pass-through) are also available.

The [cores] section lists the HDDs to be accelerated, also in three columns. The cache ID tells the system which cache device to use (the cache 1 defined above, in this example); core IDs likewise start at 1 and increase sequentially. The core device is the HDD device name, again preferably the alias under /dev/disk/by-id/.

4. Initializing CAS

[root@node1 intelcas]#  intelcas init


If the flash disk already contains data or a filesystem, force the initialization:

[root@node1 intelcas]#  intelcas init  -force


CAS starts automatically after initialization, and the cache-backed devices become visible.

The cache ID and core ID together form the cached device name; in the example above, the accelerated HDD appears as /dev/intelcas1-1.

If the HDD is large and needs partitioning (say, for Oracle), partition /dev/intelcas1-1, not the underlying HDD. In the example above, that means do not partition /dev/disk/by-id/wwn-0x5000c5008f07fcaf (i.e., do not partition the real /dev/sdX behind it).

The resulting partitions are /dev/intelcas1-1p1, /dev/intelcas1-1p2, and so on.
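
A sketch of partitioning the cached device with parted (the sizes are illustrative only):

parted /dev/intelcas1-1 mklabel gpt
parted /dev/intelcas1-1 mkpart primary xfs 0% 50%    # becomes /dev/intelcas1-1p1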

5. Starting and stopping CAS

Once /etc/intelcas/intelcas.conf is configured, CAS starts automatically after a system reboot; no manual intervention is needed.

To stop and start it manually:
[root@node1 intelcas]# intelcas stop
[root@node1 intelcas]# intelcas start

6. Configuration walkthrough with the casadm tool

Environment:

Host     HDDs                           SSD
node1    /dev/vdb, /dev/vdc, /dev/vdd   /dev/vde
node2    /dev/vdb, /dev/vdc, /dev/vdd   /dev/vde
node3    /dev/vdb, /dev/vdc, /dev/vdd   /dev/vde

For example:

[root@node1 intelcas]# lsblk 
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sr0     11:0    1  478K  0 rom  
vda    253:0    0   50G  0 disk 
├─vda1 253:1    0  200M  0 part /boot
└─vda2 253:2    0 49.8G  0 part /
vdb    253:16   0   50G  0 disk 
vdc    253:32   0   50G  0 disk 
vdd    253:48   0   50G  0 disk 
vde    253:64   0   20G  0 disk 

[root@node2 ~]# lsblk 
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sr0     11:0    1  478K  0 rom  
vda    253:0    0   50G  0 disk 
├─vda1 253:1    0  200M  0 part /boot
└─vda2 253:2    0 49.8G  0 part /
vdb    253:16   0   50G  0 disk 
vdc    253:32   0   50G  0 disk 
vdd    253:48   0   50G  0 disk 
vde    253:64   0   20G  0 disk


[root@node3 ~]# lsblk 
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sr0     11:0    1  478K  0 rom  
vda    253:0    0   50G  0 disk 
├─vda1 253:1    0  200M  0 part /boot
└─vda2 253:2    0 49.8G  0 part /
vdb    253:16   0   50G  0 disk 
vdc    253:32   0   50G  0 disk 
vdd    253:48   0   50G  0 disk 
vde    253:64   0   20G  0 disk

(Note: after configuring via casadm, remember to write the configuration into /etc/intelcas/intelcas.conf as well.)

(1) First, check the Intel CAS status:

[root@node1 intelcas]# casadm -L
WARNING: Intel(R) CAS is running on a non-validated OS!
No caches running
[root@node1 intelcas]# 


(2) Now configure CAS (only node1 is shown here for brevity; in practice all three nodes need the same steps).

First, start the cache device:

[root@node1 intelcas]#  casadm -S -i 1 -d /dev/vde 
WARNING: Intel(R) CAS is running on a non-validated OS!
Successfully added cache instance 1
[root@node1 intelcas]# 

Next, add the core devices, each bound to the cache instance just created:

[root@node1 intelcas]# casadm -A -i 1 -j 1 -d /dev/vdb 
WARNING: Intel(R) CAS is running on a non-validated OS!
Successfully added core 1 to cache instance 1
[root@node1 intelcas]# casadm -A -i 1 -j 2 -d /dev/vdc 
WARNING: Intel(R) CAS is running on a non-validated OS!
Successfully added core 2 to cache instance 1
[root@node1 intelcas]# casadm -A -i 1 -j 3 -d /dev/vdd 
WARNING: Intel(R) CAS is running on a non-validated OS!
Successfully added core 3 to cache instance 1
[root@node1 intelcas]# 

Verify the result (one cache backing three cores):

[root@node1 intelcas]# casadm -L
WARNING: Intel(R) CAS is running on a non-validated OS!
type    id   disk       status    write policy   device
cache   1    /dev/vde   Running   wt             -
+core   1    /dev/vdb   Active    -              /dev/intelcas1-1
+core   2    /dev/vdc   Active    -              /dev/intelcas1-2
+core   3    /dev/vdd   Active    -              /dev/intelcas1-3
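
At this point per-cache statistics (hit ratio, dirty blocks, and so on) can be inspected with the stats command from the casadm help, assuming -i selects the cache instance as in the commands above:

[root@node1 intelcas]# casadm -P -i 1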

Change the cache mode:

[root@node1 intelcas]# casadm -Q -c wb -i 1 
WARNING: Intel(R) CAS is running on a non-validated OS!
[root@node1 intelcas]# casadm -L
WARNING: Intel(R) CAS is running on a non-validated OS!
type    id   disk       status    write policy   device
cache   1    /dev/vde   Running   wb             -
+core   1    /dev/vdb   Active    -              /dev/intelcas1-1
+core   2    /dev/vdc   Active    -              /dev/intelcas1-2
+core   3    /dev/vdd   Active    -              /dev/intelcas1-3

Then look up each device's ID with ls -l /dev/disk/by-id/ and write the entries into the configuration file:


# Intel(R) CAS configuration file - for reference on syntax
# of this file please refer to appropriate documentation

# NOTES:
# 1) It is highly recommended to specify cache/core device using path
# that is constant across reboots - e.g. disk device links in
# /dev/disk/by-id/, preferably those using device WWN if available:
#   /dev/disk/by-id/wwn-0x123456789abcdef0
# Referencing devices via /dev/sd* may result in cache misconfiguration after
# system reboot due to change(s) in drive order.

## Caches configuration section
[caches]
## Cache ID     Cache device                            Cache mode      Extra fields (optional)
## Uncomment and edit the below line for cache configuration
#1              /dev/disk/by-id/nvme-INTEL_SSDP..       WT
1               /dev/disk/by-id/virtio-c3349770-d5e3-494d-b     WB

## Core devices configuration
[cores]
## Cache ID     Core ID         Core device
## Uncomment and edit the below line for core configuration
#1              1               /dev/disk/by-id/wwn-0x123456789abcdef0
1               1               /dev/disk/by-id/virtio-61d85d0a-6b60-4e90-9
1               2               /dev/disk/by-id/virtio-c8e6629c-d6e5-43ce-9
1               3               /dev/disk/by-id/virtio-7af8271a-500d-463d-8

## To specify use of the IO Classification file, place content of the following line in the
## Caches configuration section under Extra fields (optional)
## ioclass_file=/etc/intelcas/ioclass-config.csv

(3) Mount the devices.
First create the mount points (under /data):

[root@node1 data]# mkdir intelcas1-{1..3}
[root@node1 data]# ls
intelcas1-1  intelcas1-2  intelcas1-3
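
(The cached devices need a filesystem before they can be mounted; a sketch, assuming XFS as in the fstab entries below:)

[root@node1 data]# mkfs.xfs /dev/intelcas1-1
[root@node1 data]# mkfs.xfs /dev/intelcas1-2
[root@node1 data]# mkfs.xfs /dev/intelcas1-3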

Then add the mount entries to /etc/fstab:


[root@node1 ~]# vi /etc/fstab

#
# /etc/fstab
# Created by anaconda on Mon Mar 18 04:25:44 2019
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=eb5e1ffc-7d7f-4ed3-bd90-0c7d23eeb7c2 /                       xfs     defaults        0 0
UUID=5025c7e6-262c-4930-a2be-e32ffddbdf42 /boot                   xfs     defaults        0 0
#/dev/vdb /data/brick1 xfs defaults 0 0
#/dev/vdc /data/brick2 xfs defaults 0 0
#/dev/vdd /data/brick3 xfs defaults 0 0
/dev/intelcas1-1       /data/intelcas1-1       xfs defaults 0 0
/dev/intelcas1-2       /data/intelcas1-2       xfs defaults 0 0
/dev/intelcas1-3       /data/intelcas1-3       xfs defaults 0 0

Then run the mount command to mount them:

[root@node1 data]#  mount   -a 
[root@node1 data]# df -h
Filesystem        Size  Used Avail Use% Mounted on
/dev/vda2          50G   43G  7.2G  86% /
devtmpfs          1.9G     0  1.9G   0% /dev
tmpfs             1.9G     0  1.9G   0% /dev/shm
tmpfs             1.9G  143M  1.8G   8% /run
tmpfs             1.9G     0  1.9G   0% /sys/fs/cgroup
/dev/vda1         197M  113M   85M  58% /boot
tmpfs             380M     0  380M   0% /run/user/0
/dev/intelcas1-1   50G   33M   50G   1% /data/intelcas1-1
/dev/intelcas1-2   50G   33M   50G   1% /data/intelcas1-2
/dev/intelcas1-3   50G   33M   50G   1% /data/intelcas1-3

---------------------- All of the above steps must be performed on every node ----------------------

The result on node3 is shown below (node2 is configured identically):

[root@node3 ~]# lsblk 
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sr0     11:0    1  478K  0 rom  
vda    253:0    0   50G  0 disk 
├─vda1 253:1    0  200M  0 part /boot
└─vda2 253:2    0 49.8G  0 part /
vdb    253:16   0   50G  0 disk 
vdc    253:32   0   50G  0 disk 
vdd    253:48   0   50G  0 disk 
vde    253:64   0   20G  0 disk 
vdf    253:80   0   50G  0 disk 
vdg    253:96   0   50G  0 disk 
[root@node3 ~]# casadm -S -i 1 -d  /dev/vde 
WARNING: Intel(R) CAS is running on a non-validated OS!
Successfully added cache instance 1
[root@node3 ~]# casadm -A -i 1   -j 1 -d  /dev/vdb 
WARNING: Intel(R) CAS is running on a non-validated OS!
Successfully added core 1 to cache instance 1
[root@node3 ~]# casadm -A -i 1   -j 2 -d  /dev/vdc 
WARNING: Intel(R) CAS is running on a non-validated OS!
Successfully added core 2 to cache instance 1
[root@node3 ~]# casadm -A -i 1   -j 3 -d  /dev/vdd 
WARNING: Intel(R) CAS is running on a non-validated OS!
Successfully added core 3 to cache instance 1
[root@node3 ~]# casadm -L
WARNING: Intel(R) CAS is running on a non-validated OS!
type    id   disk       status    write policy   device
cache   1    /dev/vde   Running   wt             -
+core   1    /dev/vdb   Active    -              /dev/intelcas1-1
+core   2    /dev/vdc   Active    -              /dev/intelcas1-2
+core   3    /dev/vdd   Active    -              /dev/intelcas1-3
[root@node3 ~]# 
[root@node3 ~]# df -h        
Filesystem        Size  Used Avail Use% Mounted on
/dev/vda2          50G  3.1G   47G   7% /
devtmpfs          1.9G     0  1.9G   0% /dev
tmpfs             1.9G     0  1.9G   0% /dev/shm
tmpfs             1.9G   57M  1.8G   3% /run
tmpfs             1.9G     0  1.9G   0% /sys/fs/cgroup
/dev/vda1         197M  113M   85M  58% /boot
tmpfs             380M     0  380M   0% /run/user/0
/dev/intelcas1-1   50G   33M   50G   1% /data/intelcas1-1
/dev/intelcas1-2   50G   33M   50G   1% /data/intelcas1-2
/dev/intelcas1-3   50G   33M   50G   1% /data/intelcas1-3

7. Starting GlusterFS and creating a volume

GlusterFS offers five volume types:

 - Distributed: files are placed across the bricks of the volume by hash.
 - Replicated: similar to RAID 1; the replica count must equal the number of bricks in the volume; provides high availability.
 - Striped: similar to RAID 0; the stripe count must equal the number of bricks; files are split into chunks stored round-robin across the bricks; concurrency is at chunk granularity, so large files perform well.
 - Distributed Striped: the number of bricks must be a multiple (>= 2x) of the stripe count; combines distribution with striping.
 - Distributed Replicated: the number of bricks must be a multiple (>= 2x) of the replica count; combines distribution with replication.
 - In a distributed-replicated volume, brick order determines file placement: in general, consecutive bricks form a replica set, and the replica sets are then distributed.
 - Enterprises generally use the last two types, most often distributed-replicated (usable capacity = total capacity / replica count). Since all data travels over the network, 10 GbE switches and NICs are recommended to recover some of the performance cost.

[root@node1 data]# systemctl status glusterd
● glusterd.service - GlusterFS, a clustered file-system server
   Loaded: loaded (/usr/lib/systemd/system/glusterd.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2019-12-04 11:54:00 CST; 4 days ago
 Main PID: 8289 (glusterd)
   CGroup: /system.slice/glusterd.service
           └─8289 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO

Dec 04 11:53:59 node1 systemd[1]: Starting GlusterFS, a clustered file-system server...
Dec 04 11:54:00 node1 systemd[1]: Started GlusterFS, a clustered file-system server.
[root@node1 data]# 


List the volumes:
[root@node1 data]# gluster volume list
No volumes present in cluster
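
(This walkthrough assumes the trusted storage pool is already formed; if it is not, probe the other peers from node1 first:)

[root@node1 data]# gluster peer probe node2
[root@node1 data]# gluster peer probe node3
[root@node1 data]# gluster peer status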

Create a volume (several volume types exist; choose based on your needs):


[root@node1 data]# gluster volume create 

Usage:
volume create <NEW-VOLNAME> [stripe <COUNT>] [replica <COUNT> [arbiter <COUNT>]] [disperse [<COUNT>]] [disperse-data <COUNT>] [redundancy <COUNT>] [transport <tcp|rdma|tcp,rdma>] <NEW-BRICK>?<vg_name>... [force]

[root@node1 data]# 
[root@node1 ~]# gluster volume create gv1 replica 3 node1:/data/intelcas1-1/ node2:/data/intelcas1-1 node3:/data/intelcas1-1   node1:/data/intelcas1-2 node2:/data/intelcas1-2  node3:/data/intelcas1-2  node1:/data/intelcas1-3 node2:/data/intelcas1-3 node3:/data/intelcas1-3/  force 
volume create: gv1: success: please start the volume to access data
[root@node1 ~]# gluster volume list
gv1
[root@node1 ~]# gluster volume start gv1 
volume start: gv1: success
[root@node1 ~]# gluster volume info gv1
 
Volume Name: gv1
Type: Distributed-Replicate
Volume ID: 4f90131d-bdc6-488e-b86a-ff77a92fedab
Status: Started
Snapshot Count: 0
Number of Bricks: 3 x 3 = 9
Transport-type: tcp
Bricks:
Brick1: node1:/data/intelcas1-1
Brick2: node2:/data/intelcas1-1
Brick3: node3:/data/intelcas1-1
Brick4: node1:/data/intelcas1-2
Brick5: node2:/data/intelcas1-2
Brick6: node3:/data/intelcas1-2
Brick7: node1:/data/intelcas1-3
Brick8: node2:/data/intelcas1-3
Brick9: node3:/data/intelcas1-3
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
[root@node1 ~]# 


Gluster NFS can be enabled here if desired:
[root@node1 ~]# gluster volume set gv1  nfs.disable off
Gluster NFS is being deprecated in favor of NFS-Ganesha Enter "yes" to continue using Gluster NFS (y/n) y
volume set: success
[root@node1 ~]# 
[root@node1 dev]# gluster volume get gv1  nfs.disable  
Option                                  Value                                   
------                                  -----                                   
nfs.disable                             off                                     
[root@node1 dev]# 

[root@node1 ~]# gluster volume info gv1                
 
Volume Name: gv1
Type: Distributed-Replicate
Volume ID: 4f90131d-bdc6-488e-b86a-ff77a92fedab
Status: Started
Snapshot Count: 0
Number of Bricks: 3 x 3 = 9
Transport-type: tcp
Bricks:
Brick1: node1:/data/intelcas1-1
Brick2: node2:/data/intelcas1-1
Brick3: node3:/data/intelcas1-1
Brick4: node1:/data/intelcas1-2
Brick5: node2:/data/intelcas1-2
Brick6: node3:/data/intelcas1-2
Brick7: node1:/data/intelcas1-3
Brick8: node2:/data/intelcas1-3
Brick9: node3:/data/intelcas1-3
Options Reconfigured:
transport.address-family: inet
nfs.disable: off
performance.client-io-threads: off
[root@node1 ~]# 

After mounting, the volume is ready for use:

[root@node1 ~]# mount.glusterfs  node1:/gv1 /mnt/
[root@node1 ~]# cd /mnt/
[root@node1 mnt]# touch {1..10}.txt
[root@node1 mnt]# ls
10.txt  1.txt  2.txt  3.txt  4.txt  5.txt  6.txt  7.txt  8.txt  9.txt

8. Simulating a failed-disk replacement

[root@node1 mnt]# gluster volume status 
Status of volume: gv1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick node1:/data/intelcas1-1               49152     0          Y       16603
Brick node2:/data/intelcas1-1               49155     0          Y       10689
Brick node3:/data/intelcas1-1               49155     0          Y       762  
Brick node1:/data/intelcas1-2               49155     0          Y       16625
Brick node2:/data/intelcas1-2               49156     0          Y       10711
Brick node3:/data/intelcas1-2               49156     0          Y       785  
Brick node1:/data/intelcas1-3               49156     0          Y       16647
Brick node2:/data/intelcas1-3               49157     0          Y       10733
Brick node3:/data/intelcas1-3               49157     0          Y       807  
NFS Server on localhost                     N/A       N/A        N       N/A  
Self-heal Daemon on localhost               N/A       N/A        Y       16670
NFS Server on node3                         N/A       N/A        N       N/A  
Self-heal Daemon on node3                   N/A       N/A        Y       831  
NFS Server on node2                         N/A       N/A        N       N/A  
Self-heal Daemon on node2                   N/A       N/A        Y       10756
 
Task Status of Volume gv1
------------------------------------------------------------------------------
There are no active volume tasks
 
[root@node1 mnt]# 

Assume the disk behind the following brick has failed and must be replaced:

Brick node1:/data/intelcas1-1               49152     0          Y       16603

(1) Take the failed brick offline, either with the reset-brick command or by killing the brick process on its node:

[root@node1 mnt]#  gluster volume reset-brick glustervolume.......
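
For this volume, the start form of that command would be a sketch like the following (see the reset-brick usage later in this section):

[root@node1 mnt]# gluster volume reset-brick gv1 node1:/data/intelcas1-1 start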

Alternatively, use kill:
[root@node1 mnt]# kill -15  16603


[root@node1 mnt]# gluster volume status 
Status of volume: gv1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick node1:/data/intelcas1-1               N/A       N/A        N       N/A   # the brick is now offline (simulated failure)
Brick node2:/data/intelcas1-1               49155     0          Y       10689
Brick node3:/data/intelcas1-1               49155     0          Y       762  
Brick node1:/data/intelcas1-2               49155     0          Y       16625
Brick node2:/data/intelcas1-2               49156     0          Y       10711
Brick node3:/data/intelcas1-2               49156     0          Y       785  
Brick node1:/data/intelcas1-3               49156     0          Y       16647
Brick node2:/data/intelcas1-3               49157     0          Y       10733
Brick node3:/data/intelcas1-3               49157     0          Y       807  
NFS Server on localhost                     N/A       N/A        N       N/A  
Self-heal Daemon on localhost               N/A       N/A        Y       16670
NFS Server on node2                         N/A       N/A        N       N/A  
Self-heal Daemon on node2                   N/A       N/A        Y       10756
NFS Server on node3                         N/A       N/A        N       N/A  
Self-heal Daemon on node3                   N/A       N/A        Y       831  
 
Task Status of Volume gv1
------------------------------------------------------------------------------
There are no active volume tasks
 
[root@node1 mnt]# 

(2) Now provision a new disk of identical size, unmount and detach the failed disk, and replace it with the new one.

[root@node1 mnt]# casadm -L
WARNING: Intel(R) CAS is running on a non-validated OS!
type    id   disk       status    write policy   device
cache   1    /dev/vde   Running   wb             -
+core   1    /dev/vdb   Active    -              /dev/intelcas1-1
+core   2    /dev/vdc   Active    -              /dev/intelcas1-2
+core   3    /dev/vdd   Active    -              /dev/intelcas1-3
[root@node1 mnt]# umount /dev/intelcas1-1 
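
(Because the cache runs in write-back mode, it may still hold dirty data for the core being removed. In a controlled replacement where the old disk is still writable, flush that core first; a sketch using the flush-core command from the casadm help, assuming -i/-j select cache and core as above:)

[root@node1 mnt]# casadm -E -i 1 -j 1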


Replace /dev/vdb with /dev/vdf:
[root@node1 mnt]# lsblk 
NAME          MAJ:MIN  RM  SIZE RO TYPE MOUNTPOINT
sr0            11:0     1  478K  0 rom  
vda           253:0     0   50G  0 disk 
├─vda1        253:1     0  200M  0 part /boot
└─vda2        253:2     0 49.8G  0 part /
vdb           253:16    0   50G  0 disk 
└─intelcas1-1 252:2048  0   50G  0 disk 
vdc           253:32    0   50G  0 disk 
└─intelcas1-2 252:2304  0   50G  0 disk /data/intelcas1-2
vdd           253:48    0   50G  0 disk 
└─intelcas1-3 252:2560  0   50G  0 disk /data/intelcas1-3
vde           253:64    0   20G  0 disk 
vdf           253:80    0   50G  0 disk 
vdg           253:96    0   50G  0 disk 
vdh           253:112   0   50G  0 disk 


Remove the failed core from the cache instance:

[root@node1 mnt]# casadm  -R -i 1 -j 1 
WARNING: Intel(R) CAS is running on a non-validated OS!
[root@node1 mnt]# casadm -L
WARNING: Intel(R) CAS is running on a non-validated OS!
type    id   disk       status    write policy   device
cache   1    /dev/vde   Running   wb             -
+core   2    /dev/vdc   Active    -              /dev/intelcas1-2
+core   3    /dev/vdd   Active    -              /dev/intelcas1-3

Create a filesystem on the new disk, then add it as a core for acceleration.
(Running mkfs.xfs against /dev/vdf directly is faster than running it against /dev/intelcas1-1.)
[root@node1 dev]# mkfs.xfs /dev/vdf 
meta-data=/dev/vdf               isize=512    agcount=4, agsize=3276800 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0, sparse=0
data     =                       bsize=4096   blocks=13107200, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=6400, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
[root@node1 dev]# casadm -A -i 1 -j 1 -d /dev/vdf           
WARNING: Intel(R) CAS is running on a non-validated OS!
Successfully added core 1 to cache instance 1


[root@node1 dev]# mount /dev/intelcas1-1  /data/intelcas1-1/
[root@node1 dev]# 

[root@node1 dev]# df -h
Filesystem        Size  Used Avail Use% Mounted on
/dev/vda2          50G   43G  7.2G  86% /
devtmpfs          1.9G     0  1.9G   0% /dev
tmpfs             1.9G     0  1.9G   0% /dev/shm
tmpfs             1.9G  143M  1.8G   8% /run
tmpfs             1.9G     0  1.9G   0% /sys/fs/cgroup
/dev/vda1         197M  113M   85M  58% /boot
tmpfs             380M     0  380M   0% /run/user/0
/dev/intelcas1-2   50G   33M   50G   1% /data/intelcas1-2
/dev/intelcas1-3   50G   33M   50G   1% /data/intelcas1-3
node1:/gv1        150G  1.6G  149G   2% /mnt
/dev/intelcas1-1   50G   33M   50G   1% /data/intelcas1-1
Then reset the brick in GlusterFS:

[root@node1 dev]# gluster volume list
gv1

The reset-brick usage is: volume reset-brick <VOLNAME> <SOURCE-BRICK> {{start} | {<NEW-BRICK> commit}} - reset-brick operations

[root@node1 dev]# gluster volume reset-brick gv1 node1:/data/intelcas1-1/ node1:/data/intelcas1-1/ commit force
volume reset-brick: success: reset-brick commit force operation successful
[root@node1 dev]# casadm -L
WARNING: Intel(R) CAS is running on a non-validated OS!
type    id   disk       status    write policy   device
cache   1    /dev/vde   Running   wb             -
+core   1    /dev/vdf   Active    -              /dev/intelcas1-1
+core   2    /dev/vdc   Active    -              /dev/intelcas1-2
+core   3    /dev/vdd   Active    -              /dev/intelcas1-3

Check the status:
[root@node1 dev]# gluster volume status
Status of volume: gv1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick node1:/data/intelcas1-1               49152     0          Y       17584 ## successfully replaced
Brick node2:/data/intelcas1-1               49155     0          Y       10689
Brick node3:/data/intelcas1-1               49155     0          Y       762  
Brick node1:/data/intelcas1-2               49155     0          Y       16625
Brick node2:/data/intelcas1-2               49156     0          Y       10711
Brick node3:/data/intelcas1-2               49156     0          Y       785  
Brick node1:/data/intelcas1-3               49156     0          Y       16647
Brick node2:/data/intelcas1-3               49157     0          Y       10733
Brick node3:/data/intelcas1-3               49157     0          Y       807  
NFS Server on localhost                     N/A       N/A        N       N/A  
Self-heal Daemon on localhost               N/A       N/A        Y       17592
NFS Server on node3                         N/A       N/A        N       N/A  
Self-heal Daemon on node3                   N/A       N/A        Y       1591 
NFS Server on node2                         N/A       N/A        N       N/A  
Self-heal Daemon on node2                   N/A       N/A        Y       11526
 
Task Status of Volume gv1
------------------------------------------------------------------------------
There are no active volume tasks
 

[root@node1 dev]# 



Then update the configuration file, replacing the failed disk's ID with /dev/vdf's ID:

[root@node1 mnt]# vi /etc/intelcas/intelcas.conf 
#  '- '     '  '--  '--'  '--      '--  --' ' --'

# Intel(R) CAS configuration file - for reference on syntax
# of this file please refer to appropriate documentation

# NOTES:
# 1) It is highly recommended to specify cache/core device using path
# that is constant across reboots - e.g. disk device links in
# /dev/disk/by-id/, preferably those using device WWN if available:
#   /dev/disk/by-id/wwn-0x123456789abcdef0
# Referencing devices via /dev/sd* may result in cache misconfiguration after
# system reboot due to change(s) in drive order.

## Caches configuration section
[caches]
## Cache ID     Cache device                            Cache mode      Extra fields (optional)
## Uncomment and edit the below line for cache configuration
#1              /dev/disk/by-id/nvme-INTEL_SSDP..       WT
1               /dev/disk/by-id/virtio-c3349770-d5e3-494d-b     WB

## Core devices configuration
[cores]
## Cache ID     Core ID         Core device
## Uncomment and edit the below line for core configuration
#1              1               /dev/disk/by-id/wwn-0x123456789abcdef0
1               1               /dev/disk/by-id/virtio-49ac23b4-7bc0-4878-a   ### replaced with the new disk's ID
1               2               /dev/disk/by-id/virtio-c8e6629c-d6e5-43ce-9
1               3               /dev/disk/by-id/virtio-7af8271a-500d-463d-8

## To specify use of the IO Classification file, place content of the following line in the
## Caches configuration section under Extra fields (optional)
## ioclass_file=/etc/intelcas/ioclass-config.csv
"/etc/intelcas/intelcas.conf" 42L, 1664C written
[root@node1 mnt]# 




(3) Then restart Intel CAS and verify that the device mapping does not drift:
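
(A restart here is just the stop/start pair from section 5, run on the node being checked:)

[root@node1 dev]# intelcas stop
[root@node1 dev]# intelcas start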

[root@node1 dev]# casadm -L
WARNING: Intel(R) CAS is running on a non-validated OS!
type    id   disk       status    write policy   device
cache   1    /dev/vde   Running   wb             -
+core   1    /dev/vdf   Active    -              /dev/intelcas1-1
+core   2    /dev/vdc   Active    -              /dev/intelcas1-2
+core   3    /dev/vdd   Active    -              /dev/intelcas1-3
[root@node1 dev]# 
This confirms that /dev/vdb has been successfully replaced by /dev/vdf.
[root@node1 dev]# cd /mnt/
[root@node1 mnt]# ls
10.txt  1.txt  2.txt  3.txt  4.txt  5.txt  6.txt  7.txt  8.txt  9.txt
[root@node1 mnt]# 
That completes a basic disk-replacement procedure.
