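First, a disclaimer: I ran into these problems while working through the exercises in a Kubernetes guide book. Experienced readers can just skim this, because it is not necessarily the fix for your problem.
The issue cost me a whole afternoon, and I could not find a solution on Chinese sites. After getting past the firewall and reading approaches that other people had posted, I solved it. Here are my fixes and the reasoning behind them.
I will skip the setup process; there are plenty of guides for that online.
This is my topology.json: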
{
  "clusters": [
    {
      "nodes": [
        {
          "node": {
            "hostnames": {
              "manage": [
                "k8s-slave-0"
              ],
              "storage": [
                "192.168.96.129"
              ]
            },
            "zone": 1
          },
          "devices": [
            "/dev/sdb"
          ]
        },
        {
          "node": {
            "hostnames": {
              "manage": [
                "k8s-slave-1"
              ],
              "storage": [
                "192.168.96.130"
              ]
            },
            "zone": 1
          },
          "devices": [
            "/dev/sdb"
          ]
        },
        {
          "node": {
            "hostnames": {
              "manage": [
                "k8s-slave-2"
              ],
              "storage": [
                "192.168.96.131"
              ]
            },
            "zone": 1
          },
          "devices": [
            "/dev/sdb"
          ]
        }
      ]
    }
  ]
}
Loading it with heketi-cli topology load --json=topology.json fails on every node:
Creating cluster ... ID: 7675c678602c6907d4c6c259b74f732e
Allowing file volumes on cluster.
Allowing block volumes on cluster.
Creating node k8s-slave-0 ... Unable to create node: New Node doesn't have glusterd running
Creating node k8s-slave-1 ... Unable to create node: New Node doesn't have glusterd running
Creating node k8s-slave-2 ... Unable to create node: New Node doesn't have glusterd running
Check the heketi pod log:
kubectl logs deploy-heketi-68d4457cd-2wzfz -f
The log shows that heketi cannot list pods:
[kubeexec] ERROR 2019/01/20 13:24:12 heketi/pkg/remoteexec/kube/target.go:134:kube.TargetDaemonSet.GetTargetPod: pods is forbidden: User "system:serviceaccount:default:heketi-service-account" cannot list resource "pods" in API group "" in the namespace "default"
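After some research I found that heketi has to perform operations against the k8s cluster, and those operations must be authorized. Adding only the ServiceAccount, as the book does, is not enough; a matching role is needed as well.
Create a role and bind it to the ServiceAccount: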
kubectl create clusterrole foo --verb=get,list,watch --resource=pods,pods/status,pods/exec
Run the topology load again and watch the log:
[kubeexec] ERROR 2019/01/20 13:27:12 heketi/pkg/remoteexec/kube/exec.go:85:kube.ExecCommands: Failed to run command [systemctl status glusterd] on [pod:glusterfs-rfslk c:glusterfs ns:default (from host:k8s-slave-0 selector:glusterfs-node)]: Err[pods "glusterfs-rfslk" is forbidden: User "system:serviceaccount:default:heketi-service-account" cannot create resource "pods/exec" in API group "" in the namespace "default"]: Stdout []: Stderr []
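The log says cannot create resource "pods/exec", so I added the create verb. The clusterrole foo already exists at this point, so delete it before recreating it:
kubectl delete clusterrole foo
kubectl create clusterrole foo --verb=get,list,watch,create --resource=pods,pods/status,pods/exec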
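The commands in this section only create the ClusterRole; the step that binds it to the ServiceAccount is not shown. The following is a minimal sketch of the whole sequence, not necessarily what I ran at the time. The role and binding name heketi-pod-exec is hypothetical; the namespace and ServiceAccount name are taken from the error messages above.

```shell
# Sketch: grant the heketi ServiceAccount the pod permissions named in the logs.
# heketi-pod-exec is a hypothetical name; pick any name that is free in your cluster.
grant_heketi_pod_access() {
  ns="${1:-default}"
  sa="${2:-heketi-service-account}"
  role="${3:-heketi-pod-exec}"

  # Same verbs and resources as the final clusterrole command in the text.
  kubectl create clusterrole "$role" \
    --verb=get,list,watch,create \
    --resource=pods,pods/status,pods/exec || return 1

  # The binding is what attaches the role to the ServiceAccount.
  kubectl create clusterrolebinding "$role" \
    --clusterrole="$role" \
    --serviceaccount="$ns:$sa" || return 1

  # Ask the API server about the two operations that were forbidden in the logs.
  kubectl auth can-i list pods -n "$ns" \
    --as="system:serviceaccount:$ns:$sa"
  kubectl auth can-i create pods --subresource=exec -n "$ns" \
    --as="system:serviceaccount:$ns:$sa"
}
```

Calling grant_heketi_pod_access with no arguments uses the defaults. Both can-i checks should print yes before you retry the topology load.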
Run the topology load once more; this time the nodes are added successfully:
[root@deploy-heketi-68d4457cd-2wzfz heketi]# heketi-cli topology load --json=topology.json
Creating cluster ... ID: b78e39219263f7838f58a5652275ab34
Allowing file volumes on cluster.
Allowing block volumes on cluster.
Creating node k8s-slave-0 ... ID: c82ddde257c4a6fa61c01969ff82b77a
Adding device /dev/sdb ... OK
Creating node k8s-slave-1 ... ID: c1fc07eb7f25fe9e1f10f55fb15ae01d
Adding device /dev/sdb ... OK
Creating node k8s-slave-2 ... ID: a350e7a4055785789844fdb685ad8536
Adding device /dev/sdb ... OK
A pod that needs a PV never gets its volume. The events of the PVC bound to that pod show:
failed to create volume: failed to create volume: sed: can't read /var/lib/heketi/fstab: No such file or directory
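Fix: I did not find the root cause. My workaround was to create the file that the message reports as missing, /var/lib/heketi/fstab. Note that this file must be created on the machines where glusterfs is deployed, not inside the heketi container.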
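The workaround amounts to making sure the path exists on every GlusterFS node. A minimal sketch, assuming an empty file is enough for heketi's sed call to succeed; the helper name ensure_heketi_fstab and its optional path argument are hypothetical and exist only so the function can be tried on a scratch directory first.

```shell
# Sketch: create the file that heketi's sed call reports as missing.
# Run as root on each GlusterFS node, not inside the heketi container.
ensure_heketi_fstab() {
  fstab="${1:-/var/lib/heketi/fstab}"
  # Create the parent directory if this node has never had it.
  mkdir -p "$(dirname "$fstab")" || return 1
  # touch creates an empty file and leaves the contents of an existing one alone.
  touch "$fstab"
}
```

On a node, run ensure_heketi_fstab with no argument and then retry the volume request.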
# Enter the heketi container (replace heketi with the actual pod name)
kubectl exec -ti heketi bash
# Request a volume by hand and look at the error
[root@deploy-heketi-68d4457cd-8q742 heketi]# heketi-cli volume create --size 1
Error: /usr/sbin/modprobe failed: 1
thin: Required device-mapper target(s) not detected in your kernel.
Run `lvcreate --help' for more information.
The message says a required kernel module is missing, so load it manually on every node where glusterfs is deployed:
[root@k8s-slave-0 heketi-data]# modprobe dm_thin_pool
[root@k8s-slave-0 heketi-data]# lsmod | grep thin
dm_thin_pool 69632 0
dm_persistent_data 69632 1 dm_thin_pool
dm_bio_prison 20480 1 dm_thin_pool
dm_mod 126976 9 dm_thin_pool,dm_log,dm_mirror,dm_bufio
After that, volume creation works normally. (Reference: https://www.jianshu.com/p/8ede72534a69)
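A module loaded with modprobe is gone after a reboot, so the same error can come back later. A minimal sketch for making it persistent, assuming the nodes run systemd, which reads module names from /etc/modules-load.d/*.conf at boot. The helper name persist_module and its optional directory argument are hypothetical.

```shell
# Sketch: record a kernel module so that it is loaded again at boot.
# Assumes a systemd-based node; run as root on each GlusterFS node.
persist_module() {
  module="$1"
  conf_dir="${2:-/etc/modules-load.d}"
  [ -n "$module" ] || return 1
  mkdir -p "$conf_dir" || return 1
  # One module name per line; the file only needs to end in .conf.
  printf '%s\n' "$module" > "$conf_dir/$module.conf"
}
```

Typical use on a node would be modprobe dm_thin_pool followed by persist_module dm_thin_pool.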
The sidecar log shows the following error:
'failed to connect to server [10.44.0.3:27017] on first connect [MongoError: connect ECONNREFUSED 10.44.0.3:27017]' }
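Since MongoDB 3.6, mongod binds only to localhost by default, so connections to the pod IP are refused. Edit the mongodb StatefulSet manifest and add the option that makes mongod listen on all IPs to its start command: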
containers:
  - name: mongo
    image: mongo
    command:
      - mongod
      - "--replSet"
      - rs0
      - "--smallfiles"
      - "--noprealloc"
      - "--bind_ip_all"
Check the sidecar log again; the problem is solved.