1. Preface
(1) Background
The main goals are the following:
- Independence: each user's environment is independent of the others, users cannot access one another's environments, and everyone can work at the same time. (The only shared channel is a shared folder.)
- Isolation: users cannot access the host machine (again, the only channel is the shared folder).
- Freedom: each user can work as if on a Linux machine of their own, with convenient access and maximal privileges, free to install software, access the network, and so on. (In practice the experience is essentially the same as renting a GPU cloud server.)
- GPU: most importantly, every user can use GPU resources.
- Control: freedom notwithstanding, when necessary (too many concurrent users) it must be possible to limit each user's resources (CPU, memory, GPU); a sketch of how that could be done follows this list.
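To illustrate the last point, here is a minimal sketch of my own (not something deployed later in this article) of how Kubernetes can cap a user's namespace, GPU count included, with a ResourceQuota; the namespace and quota names are hypothetical:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ResourceQuota
metadata:
  name: user-quota        # hypothetical quota name
  namespace: user-alice   # hypothetical per-user namespace
spec:
  hard:
    requests.cpu: "2"             # at most 2 CPU cores requested in total
    requests.memory: 8Gi          # at most 8 GB of memory requested in total
    requests.nvidia.com/gpu: "1"  # at most 1 GPU requested in total
EOF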
(2) Environment
Everything I use is the latest version; if you are worried about stability, consider stepping back to slightly older releases.
- kfctl v0.6.2-rc.2
- ksonnet version: 0.13.1
- k8s v1.15.3
- docker 19.03.1
- NVIDIA Docker: 2.0.3
There are three machines: one serves as the master, and the other two have GPUs and serve as nodes.
IP Address | Role | CPU (cores) | Memory (GB) | System | GPU |
---|---|---|---|---|---|
192.168.1.112 | master | 2 | 8 | Ubuntu 16.04 | none |
192.168.1.113 | node1 | 2 | 8 | Ubuntu 16.04 | 1070 Ti |
192.168.1.114 | node2 | 2 | 8 | Ubuntu 16.04 | 1070 Ti |
2. Installing Kubernetes
(1) Install the base environment
Save the script below as setup.sh and run it on all three machines.
#!/bin/sh
# Docker repository
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
echo "deb [arch=amd64] https://download.docker.com/linux/ubuntu xenial edge" | sudo tee /etc/apt/sources.list.d/docker.list
# Kubernetes repository
curl -s "https://packages.cloud.google.com/apt/doc/apt-key.gpg" | sudo apt-key add -
echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list
# nvidia-docker2 repository
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
# CUDA repository
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-ubuntu1604.pin
sudo mv cuda-ubuntu1604.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub
sudo add-apt-repository "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/ /"
# Install the NVIDIA driver. This part can be removed on the master; leaving it in is also fine, since it simply fails to install there (no GPU).
wget http://cn.download.nvidia.com/XFree86/Linux-x86_64/430.40/NVIDIA-Linux-x86_64-430.40.run
chmod +x NVIDIA-Linux-x86_64-430.40.run
./NVIDIA-Linux-x86_64-430.40.run
apt-get update
# Install cuda, docker-ce, nvidia-docker2, and nfs-common
apt-get -y install cuda docker-ce nvidia-docker2 nfs-common
# Switch the cgroup driver to systemd and make nvidia the default Docker runtime
cat > /etc/docker/daemon.json <<'EOF'
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  },
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
systemctl restart docker
# Install the Kubernetes components
sudo swapoff -a
sudo apt-get install -y kubelet kubeadm kubectl
bash setup.sh
- The process takes quite a while and involves several interactive prompts along the way.
- Note that my machines are outside mainland China; if yours are inside, consider replacing the Google sources above with domestic mirrors (see the sketch after this list).
- A few downloaded files will be left in the directory where the script runs; the script does not clean them up, so just be aware of it.
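For readers in mainland China, one commonly used substitution is to point the Kubernetes apt source at the Aliyun mirror before running the script; this is my assumption of what such a swap would look like, not something I tested in this setup:
# replace the packages.cloud.google.com source with the Aliyun mirror
curl -s https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo apt-key add -
echo "deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update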
Finally, verify that Docker can reach the GPU by running docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi:
root@ubuntuNode1:/home/ubuntu# docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
Sat Aug 24 16:13:45 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.00 Driver Version: 418.87.00 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 107... Off | 00000000:0B:00.0 Off | N/A |
| 24% 48C P8 7W / 180W | 0MiB / 8119MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
(2) Join the three machines into a cluster
Run the following on the master to initialize it as the master node:
# Notes:
# - Leave --service-cidr and --pod-network-cidr as they are if possible.
# - Replace --apiserver-advertise-address with the master IP of your own environment.
# - --ignore-preflight-errors=SystemVerification skips the version check, so an
#   as-yet unvalidated version combination can still be installed.
kubeadm init --service-cidr 10.96.0.0/12 --pod-network-cidr 10.244.0.0/16 \
  --apiserver-advertise-address 192.168.1.112 \
  --ignore-preflight-errors=SystemVerification \
  --token f6ncoz.loxuwn6pp5187ev4
Once this finishes, it prints a kubeadm join command to run on the other nodes. Mine looked like this:
kubeadm join 192.168.1.112:6443 --token f6ncoz.loxuwn6pp5187ev4 \
--discovery-token-ca-cert-hash sha256:eb0c7962fd8b2328ea38aa0f003186a8ace1c5af8b15dc1fa1e34054745a5bba \
--ignore-preflight-errors=SystemVerification # skip the Kubernetes version check
Run it on both nodes.
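If the token has already expired by the time you get to the nodes (tokens expire after 24 hours by default), the master can print a fresh join command:
kubeadm token create --print-join-command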
(3) Deploy the Flannel network plugin
1. Run the following on the master and the plugin is deployed:
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
2. Create the kubectl config file:
mkdir -p ~/.kube && cp /etc/kubernetes/admin.conf ~/.kube/config
3. Verify; once every node shows Ready you are done.
kubectl get node
root@ubuntuCpu:/home/ubuntu# kubectl get node
NAME          STATUS   ROLES    AGE   VERSION
ubuntucpu     Ready    master   22h   v1.15.3
ubuntunode1   Ready    <none>   22h   v1.15.3
ubuntunode2   Ready    <none>   22h   v1.15.3
root@ubuntuCpu:/home/ubuntu#
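As an extra check of my own (not required by the guide), you can also confirm that the Flannel pods themselves are up on every node:
kubectl get pods --all-namespaces -o wide | grep flannel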
(4) Deploy the NVIDIA device plugin
This plugin is what lets Kubernetes hand GPUs to pods. Deployment is simple as well; run the following on the master:
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v1.12/nvidia-device-plugin.yml
Verify with kubectl get nodes "-o=custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu".
The output shows that each of my two nodes has one GPU.
root@ubuntuCpu:/home/ubuntu# kubectl get nodes "-o=custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu"
NAME          GPU
ubuntucpu     <none>
ubuntunode1   1
ubuntunode2   1
root@ubuntuCpu:/home/ubuntu#
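Beyond the column check, a quick end-to-end test I find useful (a sketch of my own; the pod name is arbitrary) is to schedule a pod that requests one GPU and runs nvidia-smi:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1   # ask the device plugin for one GPU
EOF
# once the pod has completed, its log should show the same nvidia-smi table as above
kubectl logs gpu-smoke-test
kubectl delete pod gpu-smoke-test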
3. Deploying Kubeflow
(1) Overview
In the month I spent working on Kubeflow, its tutorial was updated many times, which says a lot about how complex the deployment is and how many problems there are. Tool versions:
- kfctl v0.6.2-rc.2
- ksonnet version: 0.13.1
- k8s v1.15.3
- docker 19.03.1
- NVIDIA Docker: 2.0.3
I followed the official guide and the deployment succeeded on the first attempt. Guide: https://www.kubeflow.org/docs/started/k8s/kfctl-existing-arrikto
(2) Deploy MetalLB
This part is collapsed by default in the official docs, but it is actually important.
1. Deploy the plugin
kubectl apply -f https://raw.githubusercontent.com/google/metallb/v0.8.1/manifests/metallb.yaml
2. Configure MetalLB
Note that the IP addresses below form the pool MetalLB allocates from. I changed the range to addresses on the same subnet as the cluster; the pool needs at least one IP.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.1.200-192.168.1.210   # adjust to a free range on your own subnet
EOF
3. Verify the MetalLB installation. I will not walk through this step; see the official documentation.
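At a minimum, though, you can confirm that the MetalLB controller and speaker pods are running (an extra check of my own):
kubectl get pods -n metallb-system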
(3) Set a default StorageClass
1. Set up an NFS server (see the referenced article)
Install it (any of the machines will do; I chose the master):
sudo apt install nfs-kernel-server
Configure it:
sudo vi /etc/exports
Add the line /home/nfs4 *(rw,sync,no_root_squash). This sets the exported NFS path; make sure the path sits on storage with plenty of room, ideally 100 GB or more, because every PVC created later is 10 GB or larger. The final file looks like this:
# /etc/exports: the access control list for filesystems which may be exported
# to NFS clients. See exports(5).
#
# Example for NFSv2 and NFSv3:
# /srv/homes hostname1(rw,sync,no_subtree_check) hostname2(ro,sync,no_subtree_check)
#
# Example for NFSv4:
# /srv/nfs4 gss/krb5i(rw,sync,fsid=0,crossmnt,no_subtree_check)
# /srv/nfs4/homes gss/krb5i(rw,sync,no_subtree_check)
#
/home/nfs4 *(rw,sync,no_root_squash)
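The exported directory has to exist before NFS will serve it; if you have not created it yet, something like the following does it (the wide permissions are simply the easiest option for this setup):
sudo mkdir -p /home/nfs4
sudo chmod 777 /home/nfs4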
Restart the service:
sudo /etc/init.d/nfs-kernel-server restart
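To double-check that the export is visible, showmount (shipped with nfs-common) can list it from any of the machines; this is an extra check, not part of the original steps:
showmount -e 192.168.1.112   # replace with the IP of the machine running nfs-kernel-server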
2. Create a static PV and PVC for testing (see the referenced article)
First, use a static PV and PVC to check that the NFS setup works at all:
pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mypv1
spec:
  capacity:
    storage: 4Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  nfs:
    path: [path exported by the configured NFS server]
    server: [IP address of the configured NFS server]
pvc.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: mypvc1
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Mi
Run kubectl create -f pv.yaml and kubectl create -f pvc.yaml, then run kubectl get pvc; if the status shows Bound, the two are bound.
root@ubuntuCpu:/home/ubuntu# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
test-claim2 Bound pvc-db5c180d-8d9c-417e-a637-0de15d88f521 1Mi RWO nfs 10h
If the two bind automatically after creation, the NFS setup works correctly, and only then should you carry on with the steps below.
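Optionally (my suggestion, not a required step), delete the static test PV and PVC before moving on so they do not linger:
kubectl delete -f pvc.yaml -f pv.yaml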
3. Run nfs-client-provisioner
One prerequisite for this step: nfs-common must be installed on every node, because that is what the nodes use to mount the NFS share.
To provision PVs dynamically, you run an NFS provisioner service that is given the parameters of the already-configured NFS server and creates PVs for users on demand. The official project recommends running it as a Deployment with one replica, but a DaemonSet or other workload types work too; all of this is covered in the project's documentation.
Before creating the Deployment, be sure to configure everything described in Step 3 of the official docs. (That sentence is quoted from the referenced article; I did not actually perform this step.)
Write rbac.yaml as follows:
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: nfs-provisioner-runner
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "update", "patch"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: run-nfs-provisioner
subjects:
  - kind: ServiceAccount
    name: nfs-provisioner
    namespace: default
roleRef:
  kind: ClusterRole
  name: nfs-provisioner-runner
  apiGroup: rbac.authorization.k8s.io
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-provisioner
rules:
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-provisioner
subjects:
  - kind: ServiceAccount
    name: nfs-provisioner
    # replace with namespace where provisioner is deployed
    namespace: default
roleRef:
  kind: Role
  name: leader-locking-nfs-provisioner
  apiGroup: rbac.authorization.k8s.io
Write serviceaccount.yaml as follows:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nfs-provisioner
Note that the image used in the Deployment differs depending on whether you point it at an already-configured NFS server or build the NFS setup from scratch! I fell into a big pit here: after switching to an existing NFS server I did not change the image in my original yaml, it kept erroring out, and it took a long time of debugging before I saw the problem.
Write deployment.yaml as follows:
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
  name: nfs-provisioner
spec:
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: nfs-provisioner
    spec:
      serviceAccount: nfs-provisioner
      containers:
        - name: nfs-provisioner
          image: registry.cn-hangzhou.aliyuncs.com/open-ali/nfs-client-provisioner
          volumeMounts:
            - name: nfs-client-root
              mountPath: /persistentvolumes
          env:
            - name: PROVISIONER_NAME
              value: example.com/nfs
            - name: NFS_SERVER
              value: [IP address of the configured NFS server]
            - name: NFS_PATH
              value: [path exported by the configured NFS server]
      volumes:
        - name: nfs-client-root
          nfs:
            server: [IP address of the configured NFS server]
            path: [path exported by the configured NFS server]
Note: the image given in the official docs cannot be pulled normally from inside China, so I used an Alibaba Cloud image found online as a substitute (see the reference).
In this image the volume mountPath defaults to /persistentvolumes and must not be changed, otherwise it errors out at runtime.
After creating everything, check whether the Pod runs normally. If errors appear later on, use kubectl logs on this Pod to inspect them and debug.
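For reference, a plausible sequence for creating the three manifests and checking on the provisioner, assuming the file names used above:
kubectl apply -f rbac.yaml
kubectl apply -f serviceaccount.yaml
kubectl apply -f deployment.yaml
# the provisioner Pod should reach Running
kubectl get pods -l app=nfs-provisioner
# substitute the Pod name printed above if you need to inspect errors
kubectl logs <nfs-provisioner-pod-name>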
4. Create the StorageClass
Write and create storageclass.yaml as follows:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: nfs
provisioner: example.com/nfs
Mark this StorageClass as the default, otherwise Kubeflow will not use it:
kubectl patch storageclass nfs -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
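To confirm the annotation took effect (an extra check of my own), list the storage classes:
kubectl get storageclass   # the nfs entry should now show "(default)" next to its name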
5. Create a test claim
Next, create a test claim to check that the StorageClass actually works.
Write and create test-claim.yaml as follows; make sure storageClassName matches the name of the StorageClass created above.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-claim1
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi
  storageClassName: nfs
After creating it, run kubectl get pvc and check that the new PVC is automatically bound to a PV.
(4) Deploy Kubeflow
1. ksonnet
wget https://github.com/ksonnet/ksonnet/releases/download/v0.13.1/ks_0.13.1_linux_amd64.tar.gz
tar -xaf ks_0.13.1_linux_amd64.tar.gz
cp ks_0.13.1_linux_amd64/ks /usr/bin/ks
2. kfctl
wget https://github.com/kubeflow/kubeflow/releases/download/v0.6.2-rc.2/kfctl_v0.6.2-rc.2_linux.tar.gz
tar -xaf kfctl_v0.6.2-rc.2_linux.tar.gz
cp kfctl_v0.6.2-rc.2_linux/kfctl /usr/bin/kfctl
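A quick sanity check that both binaries landed on the PATH (my own habit, not part of the guide):
ks version
kfctl version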
3. Install Kubeflow
Since the two steps above already copied kfctl and ks into /usr/bin, I skipped the official guide's export PATH=$PATH:"..." step.
export KFAPP="/home/ubuntu/kfapp"
export CONFIG="https://raw.githubusercontent.com/kubeflow/kubeflow/v0.6.1/bootstrap/config/kfctl_existing_arrikto.0.6.yaml"
# Specify credentials for the default user.
export KUBEFLOW_USER_EMAIL="[email protected]"
export KUBEFLOW_PASSWORD="12341234"
kfctl init ${KFAPP} --config=${CONFIG} -V
cd ${KFAPP}
kfctl generate all -V
kfctl apply all -V
As I recall, near the end of kfctl apply all -V it complains that a certain namespace has not been created; when that happens, just open another terminal, run kubectl create ns with the missing namespace name, and the apply continues.
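A sketch of that workaround in a second terminal; the namespace name is whatever the kfctl error message reports, so it is only a placeholder here:
kubectl get ns                          # see which namespaces already exist
kubectl create ns <missing-namespace>   # use the exact name reported by kfctl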
4. Verify
If everything is Running, all is well. What follows is logging in to Kubeflow and creating a Jupyter notebook to verify GPU usage (see the sketch after the pod listing).
root@ubuntuCpu:/home/ubuntu# kubectl get pods -n kubeflow
NAME READY STATUS RESTARTS AGE
admission-webhook-bootstrap-stateful-set-0 1/1 Running 0 11h
admission-webhook-deployment-b77bd65c5-dfhhd 1/1 Running 0 11h
argo-ui-6db54c878-df5pz 1/1 Running 0 11h
centraldashboard-54c456fc46-9s7nl 1/1 Running 0 11h
dex-6847f88df6-wzcsj 1/1 Running 0 11h
jupyter-web-app-deployment-6f96544f6f-pck6t 1/1 Running 0 11h
katib-controller-55ccdcc6c8-2nf5v 1/1 Running 0 11h
katib-db-b48df7777-wbf5s 1/1 Running 0 11h
katib-manager-6944b56f96-4hm45 1/1 Running 1 11h
katib-manager-rest-6f6b8f4b54-mzbjv 1/1 Running 0 11h
katib-suggestion-bayesianoptimization-66c6764d5b-7mwls 1/1 Running 0 11h
katib-suggestion-grid-5c758dbf4b-rfvjh 1/1 Running 0 11h
katib-suggestion-hyperband-76cdd95f46-q5ddx 1/1 Running 0 11h
katib-suggestion-nasrl-6bc7855ddd-wgklg 1/1 Running 0 11h
katib-suggestion-random-65c489b584-4shxg 1/1 Running 0 11h
katib-ui-57bcbb9f56-d67bf 1/1 Running 0 11h
metacontroller-0 1/1 Running 0 11h
metadata-db-8d9b95598-6x2tc 1/1 Running 0 11h
metadata-deployment-545d79c747-64777 1/1 Running 3 11h
metadata-deployment-545d79c747-kzj5q 1/1 Running 3 11h
metadata-deployment-545d79c747-x4sf6 1/1 Running 3 11h
metadata-ui-76b5498765-766sg 1/1 Running 0 11h
minio-56dc668bd-z7hrp 1/1 Running 0 11h
ml-pipeline-567b7d6b44-qklkn 1/1 Running 0 11h
ml-pipeline-persistenceagent-69f558486c-7lsbh 1/1 Running 0 11h
ml-pipeline-scheduledworkflow-869954f57c-v86km 1/1 Running 0 11h
ml-pipeline-ui-c8d7b55cc-tdhmq 1/1 Running 0 11h
ml-pipeline-viewer-controller-deployment-566d875695-5p85f 1/1 Running 0 11h
mysql-75654987c5-hbpml 1/1 Running 0 11h
notebook-controller-deployment-58c6c5d8cc-58z6q 1/1 Running 0 11h
profiles-deployment-84f47f6c9b-9k69z 2/2 Running 0 11h
pytorch-operator-69d875b748-tw42m 1/1 Running 0 11h
spartakus-volunteer-6cfc55fd88-tkdx5 1/1 Running 0 11h
tensorboard-5f685f9d79-48rw7 1/1 Running 0 11h
tf-job-dashboard-5fc794cc7c-z7vld 1/1 Running 0 11h
tf-job-operator-6c9674bcd8-tsrbq 1/1 Running 0 11h
workflow-controller-5b4764bc47-lkcht 1/1 Running 0 11h
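As a final sketch of that verification (my own suggestion; it assumes the notebook server was created with one GPU requested), open a terminal in Jupyter and run nvidia-smi, or check the allocation from the cluster side:
# inside a terminal of the Jupyter notebook pod
nvidia-smi
# or, from the master, see what the node has handed out
kubectl describe node ubuntunode1 | grep -A 6 "Allocated resources"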