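Redis high availability is built on Sentinel. Several Sentinel processes run alongside the Redis slaves and monitor the Redis master; if the master fails, one slave is promoted to master. Clients connect to Sentinel to obtain the address of the Redis master, and after a failover Sentinel reports the new master's address.
2.1 When a single Sentinel considers the master unavailable, the master is regarded as subjectively down. Once the configured number of Sentinels agree that the master is unavailable, the state becomes objectively down and the failover process starts.
2.2 Once the failover process has started, a certain number of Sentinels must agree to authorize the failover; only then does the failover actually begin.
2.3 When the failover begins, one Sentinel is authorized and obtains the latest configuration version number for the failed master. After the failover, that version number identifies the new configuration.
2.4 After a Sentinel has successfully failed over the master, it broadcasts the new configuration to the other Sentinels, which then update their configuration for that master.
2.5 Once a slave has been elected master and the command has been sent to it, the failover is considered successful even if the other slaves have not yet reconfigured themselves for the new master, and all Sentinels publish the new configuration.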
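The two thresholds in 2.1 and 2.2 can be sketched in a few lines of Python. This is only an illustration, not Sentinel's implementation: the `MasterView` class and both function names are hypothetical, and the rule that the failover leader needs a majority of all Sentinels (and at least `quorum` votes) is taken from the Redis Sentinel documentation rather than from the text above.

```python
from dataclasses import dataclass


@dataclass
class MasterView:
    """Hypothetical model of the Sentinel settings for one master."""
    quorum: int           # Sentinels that must agree before the master is objectively down
    total_sentinels: int  # all Sentinels monitoring this master


def is_objectively_down(view: MasterView, sdown_votes: int) -> bool:
    """Objectively down: at least `quorum` Sentinels see the master as subjectively down."""
    return sdown_votes >= view.quorum


def failover_authorized(view: MasterView, leader_votes: int) -> bool:
    """The leader needs a majority of all Sentinels and at least `quorum` votes."""
    majority = view.total_sentinels // 2 + 1
    return leader_votes >= max(majority, view.quorum)


view = MasterView(quorum=2, total_sentinels=3)
print(is_objectively_down(view, sdown_votes=2))   # two of three Sentinels agree
print(failover_authorized(view, leader_votes=2))  # two votes form a majority of three
```

With three Sentinels and a quorum of 2, one Sentinel alone can neither mark the master objectively down nor start a failover.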
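A group of Sentinels that can communicate with each other eventually converges on the same configuration, the one with the highest version number.
Sentinels discover each other automatically through the master's publish/subscribe mechanism.
Every Sentinel publishes a message to the Pub/Sub channel __sentinel__:hello of every master and slave once every two seconds to announce its presence. Every Sentinel also subscribes to the __sentinel__:hello channel of every master and slave to find Sentinels it does not know yet; when a new Sentinel is detected, it is added to the list kept for that master.
Each message a Sentinel sends contains the latest master configuration it holds. If a Sentinel finds that its own configuration version is lower than the one it received, it replaces its master configuration with the newer one.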
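The version comparison described above amounts to "keep whichever configuration has the higher version number". A minimal sketch, assuming a hypothetical `MasterConfig` record and `merge_hello` function (neither is a Redis API):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MasterConfig:
    """Hypothetical view of the master configuration carried in a hello message."""
    epoch: int  # configuration version number
    host: str
    port: int


def merge_hello(local: MasterConfig, received: MasterConfig) -> MasterConfig:
    """Return the configuration a Sentinel keeps after reading a hello message."""
    if received.epoch > local.epoch:
        return received  # the peer knows a newer configuration: adopt it
    return local         # our configuration is at least as new: keep it


old = MasterConfig(epoch=1, host="10.233.65.74", port=6379)
new = MasterConfig(epoch=2, host="10.233.66.144", port=6379)
print(merge_hello(old, new).host)
```

Because every Sentinel applies the same rule to every message, all Sentinels that can reach each other end up with the highest-versioned configuration.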
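Slave election takes the following into account:
1) How long the slave has been disconnected from the master
2) The slave's priority
3) The replication offset (an estimate of how much of the master's data the slave currently holds)
4) The run ID
Candidate ranking rules:
1) A slave with a lower priority value ranks higher.
2) If priorities are equal, compare replication offsets: the slave that has received more replicated data from the master ranks higher.
3) If both priority and offset are equal, choose the slave with the smaller run ID.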
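The ranking rules above map naturally onto a sort key. The sketch below is an illustration only: the `Slave` class and `pick_candidate` are hypothetical names, the final tie-breaker is modelled as a string ID compared lexicographically, and the rule that a priority of 0 excludes a slave from promotion comes from the Redis documentation, not from the text above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Slave:
    """Hypothetical summary of what Sentinel knows about one slave."""
    addr: str
    priority: int     # lower value ranks higher; 0 means "never promote" (assumption)
    repl_offset: int  # how much of the master's data has been received
    run_id: str       # final tie-breaker, compared lexicographically


def pick_candidate(slaves: List[Slave]) -> Optional[Slave]:
    """Pick the promotion candidate: priority, then offset, then ID."""
    eligible = [s for s in slaves if s.priority != 0]
    if not eligible:
        return None
    # Lower priority first, larger offset first (hence the minus), smaller ID first.
    return min(eligible, key=lambda s: (s.priority, -s.repl_offset, s.run_id))


slaves = [
    Slave("10.233.64.234:6379", priority=100, repl_offset=500, run_id="bbb"),
    Slave("10.233.66.144:6379", priority=100, repl_offset=900, run_id="aaa"),
]
print(pick_candidate(slaves).addr)
```

The disconnection-time check is left out here; it would be one more filter before the sort.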
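Step 1: Connect to the first Sentinel
The client iterates over the list of Sentinel addresses. For each address it tries to connect to the Sentinel with a short timeout. On error or timeout, it tries the next Sentinel address.
If every Sentinel address has been tried without success, an error is returned to the client.
The first Sentinel that replies to the client should be moved to the head of the list, so that on the next reconnection this Sentinel is tried first, which minimizes latency.
Step 2: Ask for the master address
Once the connection to a Sentinel is established, the client should run the following command on it:
SENTINEL get-master-addr-by-name master-name
Note: replace master-name with the real name of the master.
The reply contains two parts:
the ip and the port of the master, as a pair.
If an ip:port pair is received, that is the address to use to connect to the Redis master. Otherwise, if the reply is empty, the client should try the next Sentinel in the list.
Sample result: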
bash-4.4$ redis-cli -p 26379
127.0.0.1:26379> sentinel get-master-addr-by-name mymaster
1) "10.233.65.74"
2) "6379"
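Step 3: Call the ROLE command on the target instance
Once the client has found the master address, it should connect to the master and call the ROLE command to verify that the instance it reached really is a master.
If the instance is not a master, the client should wait a short time and then start again from step 1.
Handling reconnections
After the service name has been resolved to the master address and a connection to the Redis master has been established, the client should resolve the address again through the Sentinels, starting from step 1, every time it needs to reconnect.
Summary:
How redis-ha resolves the master address:
Step 1: Take the list of Sentinel ip:port pairs and iterate over it.
1.1 If the current Sentinel can be reached at its ip:port,
send the command: sentinel get-master-addr-by-name
to get the master's ip address, then go to step 2.
1.2 Otherwise, continue with the next Sentinel ip:port.
Step 2: Connect to the master and run the ROLE command to check whether it really is the master.
2.1 If it is the master, keep the connection.
2.2 Otherwise, go back to step 1.
Whenever a reconnection is needed, for example after a failover, the procedure starts again from step 1 until the master address has been resolved.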
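The procedure summarized above can be sketched as a small resolver function. This is a sketch under stated assumptions, not a real client: `ask_sentinel` and `get_role` are hypothetical callables standing in for `SENTINEL get-master-addr-by-name` and `ROLE`, so the logic can be read and exercised without a running Redis. `max_rounds` and `retry_delay` are likewise illustrative parameters.

```python
import time
from typing import Callable, List, Optional, Tuple

Address = Tuple[str, int]


def resolve_master(
    sentinels: List[Address],
    ask_sentinel: Callable[[Address], Optional[Address]],
    get_role: Callable[[Address], str],
    max_rounds: int = 3,
    retry_delay: float = 0.1,
) -> Address:
    """Resolve the current master through a list of Sentinels.

    ask_sentinel(addr) returns the master address, None for an empty reply,
    or raises OSError when the Sentinel is unreachable or times out.
    get_role(addr) returns the role reported by the instance at addr.
    """
    for _ in range(max_rounds):
        for i, sentinel in enumerate(sentinels):
            try:
                master = ask_sentinel(sentinel)
            except OSError:
                continue  # step 1.2: unreachable, try the next Sentinel
            if master is None:
                continue  # empty reply, try the next Sentinel
            # Move the Sentinel that answered to the head of the list.
            sentinels.insert(0, sentinels.pop(i))
            if get_role(master) == "master":
                return master  # step 2.1: verified master
            break  # step 2.2: stale answer, start again from step 1
        else:
            raise ConnectionError("no Sentinel could be reached")
        time.sleep(retry_delay)  # wait briefly before the next round
    raise RuntimeError("no verified master after %d rounds" % max_rounds)
```

In a real client the two callables would wrap actual connections with a short socket timeout; libraries such as redis-py already implement this loop internally.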
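The community image is available at:
https://quay.io/repository/smile/redis?tag=latest&tab=tags
Note: the image used by the community chart is:
quay.io/smile/redis:4.0.11-r1
The chart is located at:
https://github.com/helm/charts/tree/master/stable/redis-ha
Note: this must be the StatefulSet-based redis-ha chart, not the earlier Deployment-based one.
3.1 In the redis-ha values.yaml, add one line under redis.config:
protected-mode: "no"
The modified file looks like this: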
redis:
  port: 6379
  masterGroupName: mymaster
  config:
    ## Additional redis conf options can be added below
    ## For all available options see http://download.redis.io/redis-stable/redis.conf
    min-slaves-to-write: 1
    min-slaves-max-lag: 5 # Value in seconds
    maxmemory: "0" # Max memory to use for each redis instance. Default is unlimited.
    maxmemory-policy: "volatile-lru" # Max memory policy to use for each redis instance. Default is volatile-lru.
    # Determines if scheduled RDB backups are created. Default is false.
    # Please note that local (on-disk) RDBs will still be created when re-syncing with a new slave. The only way to prevent this is to enable diskless replication.
    save: "900 1"
    # When enabled, directly sends the RDB over the wire to slaves, without using the disk as intermediate storage. Default is false.
    repl-diskless-sync: "yes"
    rdbcompression: "yes"
    rdbchecksum: "yes"
    protected-mode: "no"
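Explanation:
When Redis has neither a bind address nor a password configured, protected mode is active and connections from the other Sentinels are refused, so the failover fails. Protected mode therefore has to be disabled, as done above.
Reference:
https://blog.csdn.net/csdn_ds/article/details/72550898
3.2 In redis-ha-configmap.yaml of redis-ha, find the following line in sentinel.conf:
dir "/data"
and add the following below it:
{{- $protectedMode := index .Values.redis.config "protected-mode" }}
protected-mode {{ $protectedMode }}
The modified file looks like this: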
sentinel.conf: |
{{- if .Values.sentinel.customConfig }}
{{ .Values.sentinel.customConfig | indent 4 }}
{{- else }}
dir "/data"
{{- $protectedMode := index .Values.redis.config "protected-mode" }}
protected-mode {{ $protectedMode }}
{{- $root := . -}}
3.3 Change the storageClass of persistentVolume if needed
The default is:
persistentVolume:
  enabled: true
  ## redis-ha data Persistent Volume Storage Class
  ## If defined, storageClassName:
  ## If set to "-", storageClassName: "", which disables dynamic provisioning
  ## If undefined (the default) or set to null, no storageClassName spec is
  ## set, choosing the default provisioner. (gp2 on AWS, standard on
  ## GKE, AWS & OpenStack)
  ##
  # storageClass: "-"
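List the StorageClasses available in the current environment and set storageClass to one of them. If it is not set, the PV may fail to mount and the redis-ha installation will then fail.
For example: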
[root@node-1 11_6_redis_ha]# kubectl get storageclass
NAME PROVISIONER AGE
ceph-ssd ceph.com/rbd 4d
general ceph.com/rbd 4d
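Explanation:
Persistent Volume:
Meaning: a persistent volume is network storage that belongs to neither a Node nor a Pod but is accessible from every Node. Access mode ReadWriteOnce: read-write, and mountable by a single Node only.
Usage: a PersistentVolumeClaim (a request for a persistent volume) must be defined first.
StorageClass: describes the characteristics and performance of storage resources, so that storage can be grouped into classes.
Reference: kubernetes权威指南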
helm install --name redis-ha redis-ha --namespace openstack
Note: do not change the redis-ha that follows --name (the Release Name) to anything else, otherwise gnocchi will fail to connect to redis-ha.
The reason is that the service name defined in redis-ha/templates/redis-ha-service.yaml is a variable,
as shown here:
apiVersion: v1
kind: Service
metadata:
  name: {{ template "redis-ha.fullname" . }}
and the redis-ha.fullname template is defined in redis-ha/templates/_helpers.tpl as follows:
{{- define "redis-ha.fullname" -}}
{{- $name := default .Chart.Name .Values.nameOverride -}}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" -}}
{{- end -}}
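It follows that the real name of the redis-ha-sentinel service has the form:
Release.Name-redis-ha
Release.Name is whatever follows --name in the helm install command. gnocchi depends on this service and needs its concrete name rather than a variable, so here
Release.Name
is set to redis-ha.
The gnocchi chart can then refer to the dependency by its real service name,
redis-ha-redis-ha
With both sides consistent, gnocchi can connect to redis-ha.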
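To make the naming rule concrete, the template logic shown above (`default`, `printf "%s-%s"`, `trunc 63`, `trimSuffix "-"`) can be mirrored in Python. This is an illustrative re-implementation for reasoning about the result, not something Helm runs; the function name and its parameter names are made up for this sketch.

```python
def redis_ha_fullname(release_name: str, chart_name: str = "redis-ha",
                      name_override: str = "") -> str:
    """Mirror the redis-ha.fullname template for a given release name."""
    # default .Chart.Name .Values.nameOverride: use the override when it is set
    name = name_override or chart_name
    # printf "%s-%s" .Release.Name $name
    full = "%s-%s" % (release_name, name)
    # trunc 63: Kubernetes names are limited to 63 characters
    full = full[:63]
    # trimSuffix "-": drop one trailing dash left over by the truncation
    return full[:-1] if full.endswith("-") else full


print(redis_ha_fullname("redis-ha"))  # release name used in this document
print(redis_ha_fullname("cache"))     # a different release name changes the service name
```

Any release name other than redis-ha yields a different service name, which is why the gnocchi side would no longer match.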
5.1 Verify that the redis-ha pods started successfully
[root@node-1 ark]# kubectl get pods -n openstack -o wide|grep redis-ha
redis-ha-redis-ha-server-0 2/2 Running 0 3h 10.233.66.144 node-3
redis-ha-redis-ha-server-1 2/2 Running 0 3h 10.233.65.74 node-1
redis-ha-redis-ha-server-2 2/2 Running 0 3h 10.233.64.234 node-2
5.2 Verify that the PVs were created
[root@node-1 ark]# kubectl get pv|grep redis
pvc-7fc11ced-e236-11e8-a479-fa163e93c106 10Gi RWO Delete Bound openstack/data-redis-ha-redis-ha-server-0 general 5h
pvc-c080baa7-e194-11e8-a479-fa163e93c106 10Gi RWO Delete Bound openstack/data-redis-ha-redis-ha-server-2 general 1d
pvc-e79fcc8e-e193-11e8-a479-fa163e93c106 10Gi RWO Delete Bound openstack/data-redis-ha-redis-ha-server-1 general 1d
5.3 Verify that the redis-ha StatefulSet was created
[root@node-1 ark]# kubectl get statefulset -n openstack|grep redis
redis-ha-redis-ha-server 3 3 5h
5.4 Verify that the redis-ha service was created
[root@node-1 ark]# kubectl get svc -n openstack|grep redis-ha
redis-ha-redis-ha ClusterIP None
1.1 Enter the sentinel container of any Redis pod
Run:
kubectl exec -it -n openstack redis-ha-redis-ha-server-0 -c sentinel /bin/bash
1.2 Log in to redis-sentinel
Run:
redis-cli -p 26379
1.3 Get the master address
Run:
sentinel get-master-addr-by-name mymaster
Sample output:
1) "10.233.65.74"
2) "6379"
1.4 Find the pod that owns the master address
Run:
kubectl get pods -n openstack -o wide|grep redis-ha|grep 10.233.65.74
Note: replace 10.233.65.74 with the master ip address obtained in 1.3.
Sample output:
redis-ha-redis-ha-server-1 2/2 Running 0 5h 10.233.65.74 node-1
2.1 Enter the redis container of the Redis master pod
With the master pod identified in step 1.4, enter the redis container of that pod:
kubectl exec -it -n openstack redis-ha-redis-ha-server-1 -c redis /bin/bash
2.2 Simulate a hang of the Redis master
Run:
redis-cli -p 6379 DEBUG sleep 30
2.3 Watch the log of the sentinel container in the Redis master pod
Run:
kubectl logs -n openstack redis-ha-redis-ha-server-1 -c sentinel --tail=300 -f --timestamps
Log output similar to the following appears:
2018-11-07T03:08:00.877670363Z 1:X 07 Nov 03:08:00.877 # +switch-master mymaster 10.233.66.144 6379 10.233.65.74 6379
2018-11-07T03:08:00.877676242Z 1:X 07 Nov 03:08:00.877 * +slave slave 10.233.64.234:6379 10.233.64.234 6379 @ mymaster 10.233.65.74 6379
2018-11-07T03:08:00.877679125Z 1:X 07 Nov 03:08:00.877 * +slave slave 10.233.66.144:6379 10.233.66.144 6379 @ mymaster 10.233.65.74 6379
2018-11-07T03:08:10.896177687Z 1:X 07 Nov 03:08:10.895 # +sdown slave 10.233.66.144:6379 10.233.66.144 6379 @ mymaster 10.233.65.74 6379
2018-11-07T03:08:19.612363788Z 1:X 07 Nov 03:08:19.611 # -sdown slave 10.233.66.144:6379 10.233.66.144 6379 @ mymaster 10.233.65.74 6379
2018-11-07T08:19:18.049986807Z 1:X 07 Nov 08:19:18.049 # +sdown master mymaster 10.233.65.74 6379
2018-11-07T08:19:18.11869511Z 1:X 07 Nov 08:19:18.115 # +odown master mymaster 10.233.65.74 6379 #quorum 2/2
2018-11-07T08:19:18.118808386Z 1:X 07 Nov 08:19:18.115 # +new-epoch 2
2018-11-07T08:19:18.118816965Z 1:X 07 Nov 08:19:18.115 # +try-failover master mymaster 10.233.65.74 6379
2018-11-07T08:19:18.167892561Z 1:X 07 Nov 08:19:18.166 # +vote-for-leader 22cf8e8f535db85cd2f1a886991d325f64cfabf7 2
2018-11-07T08:19:18.35194575Z 1:X 07 Nov 08:19:18.351 # f96014b78eb3d4a95bf796a6098c1af001ceb03f voted for 22cf8e8f535db85cd2f1a886991d325f64cfabf7 2
2018-11-07T08:19:18.383625857Z 1:X 07 Nov 08:19:18.383 # f9c78c9fb4676bf6353040ea4bdeb23b36a542d8 voted for 22cf8e8f535db85cd2f1a886991d325f64cfabf7 2
2018-11-07T08:19:18.408188307Z 1:X 07 Nov 08:19:18.407 # +elected-leader master mymaster 10.233.65.74 6379
2018-11-07T08:19:18.408255617Z 1:X 07 Nov 08:19:18.407 # +failover-state-select-slave master mymaster 10.233.65.74 6379
2018-11-07T08:19:18.484098943Z 1:X 07 Nov 08:19:18.483 # +selected-slave slave 10.233.66.144:6379 10.233.66.144 6379 @ mymaster 10.233.65.74 6379
2018-11-07T08:19:18.48415193Z 1:X 07 Nov 08:19:18.483 * +failover-state-send-slaveof-noone slave 10.233.66.144:6379 10.233.66.144 6379 @ mymaster 10.233.65.74 6379
2018-11-07T08:19:18.561209047Z 1:X 07 Nov 08:19:18.560 * +failover-state-wait-promotion slave 10.233.66.144:6379 10.233.66.144 6379 @ mymaster 10.233.65.74 6379
2018-11-07T08:19:19.227931971Z 1:X 07 Nov 08:19:19.227 # +promoted-slave slave 10.233.66.144:6379 10.233.66.144 6379 @ mymaster 10.233.65.74 6379
2018-11-07T08:19:19.227989503Z 1:X 07 Nov 08:19:19.227 # +failover-state-reconf-slaves master mymaster 10.233.65.74 6379
2018-11-07T08:19:19.297924611Z 1:X 07 Nov 08:19:19.297 * +slave-reconf-sent slave 10.233.64.234:6379 10.233.64.234 6379 @ mymaster 10.233.65.74 6379
2018-11-07T08:19:19.49242546Z 1:X 07 Nov 08:19:19.492 # -odown master mymaster 10.233.65.74 6379
2018-11-07T08:19:20.311034165Z 1:X 07 Nov 08:19:20.310 * +slave-reconf-inprog slave 10.233.64.234:6379 10.233.64.234 6379 @ mymaster 10.233.65.74 6379
2018-11-07T08:19:20.311104628Z 1:X 07 Nov 08:19:20.310 * +slave-reconf-done slave 10.233.64.234:6379 10.233.64.234 6379 @ mymaster 10.233.65.74 6379
2018-11-07T08:19:20.381345989Z 1:X 07 Nov 08:19:20.380 # +failover-end master mymaster 10.233.65.74 6379
2018-11-07T08:19:20.381442475Z 1:X 07 Nov 08:19:20.380 # +switch-master mymaster 10.233.65.74 6379 10.233.66.144 6379
2018-11-07T08:19:20.38146031Z 1:X 07 Nov 08:19:20.381 * +slave slave 10.233.64.234:6379 10.233.64.234 6379 @ mymaster 10.233.66.144 6379
2018-11-07T08:19:20.381466083Z 1:X 07 Nov 08:19:20.381 * +slave slave 10.233.65.74:6379 10.233.65.74 6379 @ mymaster 10.233.66.144 6379
2018-11-07T08:19:30.427201725Z 1:X 07 Nov 08:19:30.426 # +sdown slave 10.233.65.74:6379 10.233.65.74 6379 @ mymaster 10.233.66.144 6379
2018-11-07T08:19:37.567222124Z 1:X 07 Nov 08:19:37.566 # -sdown slave 10.233.65.74:6379 10.233.65.74 6379 @ mymaster 10.233.66.144 6379
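The line that matters most in the log above is the +switch-master event, which carries the master name followed by the old and the new address. As a small aid for reading or monitoring such logs, the following sketch extracts those fields with a regular expression; the function name `parse_switch_master` is made up for this example.

```python
import re
from typing import Optional, Tuple

# +switch-master <master name> <old ip> <old port> <new ip> <new port>
SWITCH_MASTER = re.compile(
    r"\+switch-master (?P<name>\S+) (?P<old_ip>\S+) (?P<old_port>\d+) "
    r"(?P<new_ip>\S+) (?P<new_port>\d+)"
)


def parse_switch_master(line: str) -> Optional[Tuple[str, str, str]]:
    """Return (master name, old address, new address), or None for other events."""
    m = SWITCH_MASTER.search(line)
    if m is None:
        return None
    old = "%s:%s" % (m.group("old_ip"), m.group("old_port"))
    new = "%s:%s" % (m.group("new_ip"), m.group("new_port"))
    return m.group("name"), old, new


line = ("2018-11-07T08:19:20.381442475Z 1:X 07 Nov 08:19:20.380 # "
        "+switch-master mymaster 10.233.65.74 6379 10.233.66.144 6379")
print(parse_switch_master(line))
```

Applied to the 08:19:20 line of the log, this reports the move from 10.233.65.74:6379 to 10.233.66.144:6379.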
2.4 Verify that the master has changed
1) Enter the sentinel container of any Redis pod
Run:
kubectl exec -it -n openstack redis-ha-redis-ha-server-0 -c sentinel /bin/bash
2) Log in to redis-sentinel
Run:
redis-cli -p 26379
3) Get the master address
Run:
sentinel get-master-addr-by-name mymaster
Sample output:
127.0.0.1:26379> sentinel get-master-addr-by-name mymaster
1) "10.233.66.144"
2) "6379"
The previous master address was:
127.0.0.1:26379> sentinel get-master-addr-by-name mymaster
1) "10.233.65.74"
2) "6379"
This shows that the failover succeeded.
Reference for failover testing:
https://redis.io/topics/sentinel
gnocchi connects to Redis by parsing redis_url. The relevant code is in
gnocchi/storage/common/redis.py, which defines how the Redis client is instantiated:
def get_client(conf):
    if redis is None:
        raise RuntimeError("python-redis unavailable")
    parsed_url = parse.urlparse(conf.redis_url)
    options = parse.parse_qs(parsed_url.query)

    kwargs = {}
    if parsed_url.hostname:
        kwargs['host'] = parsed_url.hostname
        if parsed_url.port:
            kwargs['port'] = parsed_url.port
    else:
        if not parsed_url.path:
            raise ValueError("Expected socket path in parsed urls path")
        kwargs['unix_socket_path'] = parsed_url.path
    if parsed_url.password:
        kwargs['password'] = parsed_url.password

    for a in CLIENT_ARGS:
        if a not in options:
            continue
        if a in CLIENT_BOOL_ARGS:
            v = strutils.bool_from_string(options[a][-1])
        elif a in CLIENT_LIST_ARGS:
            # Keep every value of a list option (e.g. sentinel_fallback),
            # otherwise the loop below would iterate over a single string.
            v = options[a]
        elif a in CLIENT_INT_ARGS:
            v = int(options[a][-1])
        else:
            v = options[a][-1]
        kwargs[a] = v
    if 'socket_timeout' not in kwargs:
        kwargs['socket_timeout'] = CLIENT_DEFAULT_SOCKET_TO

    # Ask the sentinel for the current master if there is a
    # sentinel arg.
    if 'sentinel' in kwargs:
        sentinel_hosts = [
            tuple(fallback.split(':'))
            for fallback in kwargs.get('sentinel_fallback', [])
        ]
        sentinel_hosts.insert(0, (kwargs['host'], kwargs['port']))
        sentinel_server = sentinel.Sentinel(
            sentinel_hosts,
            socket_timeout=kwargs['socket_timeout'])
        sentinel_name = kwargs['sentinel']
        del kwargs['sentinel']
        if 'sentinel_fallback' in kwargs:
            del kwargs['sentinel_fallback']
        master_client = sentinel_server.master_for(sentinel_name, **kwargs)
        # The master_client is a redis.StrictRedis using a
        # Sentinel managed connection pool.
        return master_client
    return redis.StrictRedis(**kwargs)
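From the code above:
Step 1: gnocchi parses the configured redis url and extracts all the Sentinel information.
Step 2: that list is used to initialize a Sentinel object.
Step 3: the Sentinel object from step 2 is asked for the master by name, which yields the master Redis client that is returned.
So the original redis_url was:
redis://redis.openstack.svc.cluster.local:6379/
To work with redis-ha, the redis_url becomes:
redis://redis-ha-redis-ha.openstack.svc.cluster.local:26379?sentinel=mymaster
Explanation:
sentinel=mymaster is required for the following reason.
See the gnocchi documentation:
https://gnocchi.xyz/stable_4.2/install.html#configuration-file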
# redis://
# sentinel_fallback=
# sentinel_fallback=
# sentinel_fallback=
Given the gnocchi code above, gnocchi needs to know the name of the master in redis-ha, and that name is mymaster; hence the setting above.
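To see what gnocchi extracts from such a URL, the parsing part can be tried on its own with the standard library. The sketch below is a simplified stand-in, not gnocchi's code: `sentinel_settings` is a hypothetical helper, it converts ports to integers (gnocchi keeps the fallback ports as strings), and the second URL with `sentinel_fallback` entries uses made-up host names s1, s2 and s3.

```python
from urllib import parse


def sentinel_settings(redis_url: str):
    """Return ([(host, port), ...], master_name) for a sentinel-style redis URL."""
    parsed = parse.urlparse(redis_url)
    options = parse.parse_qs(parsed.query)
    # The host in the URL itself is the first Sentinel to try.
    hosts = [(parsed.hostname, parsed.port)]
    # Each sentinel_fallback=host:port option adds one more Sentinel.
    for fallback in options.get("sentinel_fallback", []):
        host, _, port = fallback.partition(":")
        hosts.append((host, int(port)))
    # sentinel=<name> is the master name passed to master_for().
    master_name = options.get("sentinel", [None])[-1]
    return hosts, master_name


url = "redis://redis-ha-redis-ha.openstack.svc.cluster.local:26379?sentinel=mymaster"
print(sentinel_settings(url))
```

Without the `sentinel` option the master name is None, which corresponds to the branch in `get_client` that returns a plain `redis.StrictRedis`.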
Modify values.yaml in the gnocchi chart:
endpoints:
  redis:
    name: redis-ha-redis-ha
    hosts:
      default: redis-ha-redis-ha
      public: redis-ha-redis-ha
    host_fqdn_override:
      default: null
    path:
      default: null
    scheme:
      default: 'redis'
    port:
      api:
        default: 26379
        public: 80
The main change is replacing redis with redis-ha-redis-ha.
redis-ha-redis-ha is the service name of redis-ha, which can be checked with:
[root@node-1 ark]# kubectl get svc -n openstack|grep redis-ha
redis-ha-redis-ha ClusterIP None
Update redis_url in gnocchi to:
redis://redis-ha-redis-ha.openstack.svc.cluster.local:26379?sentinel=mymaster
Finally, upgrade gnocchi.
1) Create a virtual machine first
Run:
nova flavor-create min10 1308 64 1 1
nova boot --image b1ffcdc5-efe5-4db8-8771-b7d538f07e50 --flavor 1308 --nic net-id=7ec650d5-82ca-4128-a3a9-d75625d7bec9 chen
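Note:
In the commands above:
b1ffcdc5-efe5-4db8-8771-b7d538f07e50 is an image id, which can be obtained with glance image-list
7ec650d5-82ca-4128-a3a9-d75625d7bec9 is a net id, which can be obtained with neutron net-list
Replace them with real ids.
2) Show the gnocchi resource information for the virtual machine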
gnocchi resource show 1dc3d157-e1d0-4f6a-8fc9-fcc7ab64bf8e
+-----------------------+-------------------------------------------------------------------+
| Field | Value |
+-----------------------+-------------------------------------------------------------------+
| created_by_project_id | f43f4dd82c2040ca99778fe730f2b933 |
| created_by_user_id | 3ff16e7409c94265b97c0594aca5d228 |
| creator | 3ff16e7409c94265b97c0594aca5d228:f43f4dd82c2040ca99778fe730f2b933 |
| ended_at | None |
| id | 1dc3d157-e1d0-4f6a-8fc9-fcc7ab64bf8e |
| metrics | cpu.delta: 42353e7b-ca36-427b-893e-aba26eb8292c |
| | cpu_util: 6399da04-75ec-42a2-bf91-40380da86bef |
| | disk.read.bytes.rate: 73c49751-01a2-4e9a-8ab9-7ec978b344e6 |
| | disk.read.requests.rate: 10417fe9-b230-45f6-8086-2ad098495ebb |
| | disk.write.bytes.rate: 3ddc7ed5-619b-4afe-8543-b218756ff878 |
| | disk.write.requests.rate: c79ff681-acaa-45af-90bf-d1f7a9a70d69 |
| | disks.total: 2295f8a8-771e-4c1a-9f08-28f38f3098e0 |
| | disks.used: 6f2b4305-1d03-455a-99b2-b899e2f351b0 |
| | disks.util: 63994336-39fe-48b7-bab8-f5159d84f0fd |
| | memory.usage: 3395574a-d03c-4983-97ed-e9e84982b36a |
| | memory.util: 50206a11-d857-4013-a9e6-633f30ae1e81 |
| original_resource_id | 1dc3d157-e1d0-4f6a-8fc9-fcc7ab64bf8e |
| project_id | d460d9ea2b2744c38e1077d32588650d |
| revision_end | None |
| revision_start | 2018-10-25T12:54:21.879170+00:00 |
| started_at | 2018-10-25T12:32:30.361941+00:00 |
| type | instance |
| user_id | 6b1f0a4d2e3541008bb511d9dd2018bd |
+-----------------------+-------------------------------------------------------------------+
Note: replace 1dc3d157-e1d0-4f6a-8fc9-fcc7ab64bf8e with the real virtual machine id
3) Show the measures of one metric of the gnocchi virtual machine resource
gnocchi measures show 6399da04-75ec-42a2-bf91-40380da86bef
+---------------------------+-------------+-------+
| timestamp | granularity | value |
+---------------------------+-------------+-------+
| 2018-10-25T00:00:00+00:00 | 86400.0 | 0.0 |
| 2018-10-25T12:00:00+00:00 | 7200.0 | 0.0 |
| 2018-10-25T12:45:00+00:00 | 900.0 | 0.0 |
| 2018-10-25T12:55:00+00:00 | 300.0 | 0.0 |
+---------------------------+-------------+-------+
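Note: replace 6399da04-75ec-42a2-bf91-40380da86bef with the id of the metric
Measures are returned, which shows that the connection to redis-ha works.
Next, perform a failover and then repeat the queries above to check whether new measures can still be obtained.
To initialize a Redis client directly, the sample code is: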
import redis
r = redis.StrictRedis(host='localhost', port=6379, db=0)
r.set('foo', 'bar')
r.get('foo')
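Sentinel support
redis-py can discover Redis nodes through Sentinel. At least one Sentinel daemon must be running. A Redis client connection to the master can be created from a Sentinel instance. Sample code: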
from redis.sentinel import Sentinel
sentinel = Sentinel([('localhost', 26379)], socket_timeout=0.1)
master = sentinel.master_for('mymaster', socket_timeout=0.1)
master.set('foo', 'bar')
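The master object is an ordinary StrictRedis instance whose connection pool is bound to the Sentinel instance. When a Sentinel-backed client tries to establish a connection, it first queries the Sentinel servers to determine the right host to connect to. If no server is found, a MasterNotFoundError is raised.
References: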
[1] https://github.com/helm/charts/tree/master/stable/redis-ha
[2] https://blog.csdn.net/csdn_ds/article/details/72550898
[3] kubernetes权威指南
[4] https://redis.io/topics/sentinel
[5] https://segmentfault.com/a/1190000002680804
[6] https://segmentfault.com/a/1190000002685515
[7] https://www.kubernetes.org.cn/3974.html
[8] http://www.cnblogs.com/S-tec-songjian/p/9354828.html
[9] https://www.cnblogs.com/S-tec-songjian/p/9365921.html
[10] https://pypi.org/project/redis/#description