Preface
In a Kubernetes cluster, deploying a GPU node on top of an existing passthrough GPU VM requires three extra steps compared with a regular node:
- Install the NVIDIA driver
- Install NVIDIA-Docker2
- Deploy the NVIDIA device plugin
Install the NVIDIA driver
Download the NVIDIA driver
The drivers are free; choose the one matching your GPU model from the official download page (see References).
Disable the nouveau driver
Add a modprobe blacklist conf file:
vi /etc/modprobe.d/blacklist.conf
Append the following two lines at the end:
blacklist nouveau
options nouveau modeset=0
Regenerate the kernel initramfs:
sudo update-initramfs -u
Reboot the node VM:
reboot
Verify after the reboot: if the following command produces no output, nouveau is successfully disabled.
lsmod | grep nouveau
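The blacklist steps above can also be scripted so that re-running node setup does not duplicate entries. A minimal idempotent sketch — written against /tmp/blacklist.conf for illustration; on a real node the target file is /etc/modprobe.d/blacklist.conf:

```shell
# Idempotent sketch of the nouveau blacklist step.
# NOTE: /tmp/blacklist.conf is used here for illustration only;
# on the node the real path is /etc/modprobe.d/blacklist.conf.
CONF=/tmp/blacklist.conf

# Append the entries only if they are not already present.
if ! grep -q '^blacklist nouveau' "$CONF" 2>/dev/null; then
  echo 'blacklist nouveau' >> "$CONF"
  echo 'options nouveau modeset=0' >> "$CONF"
fi

# Show the resulting entries.
grep nouveau "$CONF"
```

After this, regenerate the initramfs and reboot as described above.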
Install the driver
In this example the VM runs Ubuntu 18.04 amd64, the GPU is a Tesla V100, and driver version 440.33.01 is selected.
Online installation (substitute the package series matching your chosen driver — e.g. nvidia-driver-440 for a 440.x driver; the command below installs the 430 series):
apt install nvidia-driver-430 nvidia-utils-430 nvidia-settings
Offline installation (the -s flag runs the installer silently; the downloaded .run file must be executable):
./NVIDIA-Linux-x86_64-{{ gpu_version }}.run -s
Verify the driver installation:
nvidia-smi
If the driver is installed correctly, nvidia-smi prints a table listing the driver version and each detected GPU.
This completes the NVIDIA driver installation.
Install NVIDIA-Docker2
Docker 18.06 does not support GPU containers by itself, so NVIDIA-Docker2 is installed to let containers use NVIDIA GPUs.
Note: docker and the NVIDIA driver must be installed first, but CUDA is not required.
Online installation:
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | tee /etc/apt/sources.list.d/nvidia-docker.list
apt-get update
apt-get install -y nvidia-docker2
systemctl restart docker
Offline installation:
On a machine with Internet access, run the following commands:
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
Download the five packages:
apt download libnvidia-container1
apt download libnvidia-container-tools
apt download nvidia-container-toolkit
apt download nvidia-container-runtime
apt download nvidia-docker2
Copy the downloaded packages to the target node VM, then install them in dependency order:
dpkg -i libnvidia-container1_1.0.7-1_amd64.deb
dpkg -i libnvidia-container-tools_1.0.7-1_amd64.deb
dpkg -i nvidia-container-toolkit_1.0.5-1_amd64.deb
dpkg -i nvidia-container-runtime_3.1.4-1_amd64.deb
dpkg -i nvidia-docker2_2.2.2-1_all.deb
Set the docker default runtime on the GPU node to nvidia-container-runtime:
vi /etc/docker/daemon.json
Add the following content to the configuration file:
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
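Before restarting docker it is worth validating the edited file, since a JSON syntax error in /etc/docker/daemon.json will stop dockerd from starting. A quick check, sketched here against a scratch copy in /tmp (on the node, point it at /etc/docker/daemon.json):

```shell
# Write the example config to a scratch copy for illustration;
# on a real node this content lives in /etc/docker/daemon.json.
cat > /tmp/daemon.json <<'EOF'
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
EOF

# Parse the file and print the configured default runtime;
# a syntax error would make this command fail instead.
python3 -c 'import json; print(json.load(open("/tmp/daemon.json"))["default-runtime"])'
```

Output of `nvidia` confirms the file parses and names the expected default runtime.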
Restart docker:
systemctl restart docker
Verify the installation (the output should list nvidia under Runtimes and show nvidia as the Default Runtime):
docker info
This completes the NVIDIA-Docker2 installation.
Install the nvidia-device-plugin-daemonset plugin
Note: this example uses version 1.0.0-beta6; see the NVIDIA k8s-device-plugin GitHub project for all available versions.
Online installation:
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/1.0.0-beta6/nvidia-device-plugin.yml
The official NVIDIA manifest is:
# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nvidia-device-plugin-daemonset
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: nvidia-device-plugin-ds
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      # This annotation is deprecated. Kept here for backward compatibility
      # See https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ""
      labels:
        name: nvidia-device-plugin-ds
    spec:
      tolerations:
      # This toleration is deprecated. Kept here for backward compatibility
      # See https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/
      - key: CriticalAddonsOnly
        operator: Exists
      - key: nvidia.com/gpu
        operator: Exists
        effect: NoSchedule
      # Mark this pod as a critical add-on; when enabled, the critical add-on
      # scheduler reserves resources for critical add-on pods so that they can
      # be rescheduled after a failure.
      # See https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/
      priorityClassName: "system-node-critical"
      containers:
      - image: nvidia/k8s-device-plugin:1.0.0-beta6
        name: nvidia-device-plugin-ctr
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop: ["ALL"]
        volumeMounts:
        - name: device-plugin
          mountPath: /var/lib/kubelet/device-plugins
      volumes:
      - name: device-plugin
        hostPath:
          path: /var/lib/kubelet/device-plugins
Note: you can label the GPU nodes and add a matching nodeSelector or nodeAffinity to nvidia-device-plugin-daemonset.yaml so the plugin runs only on GPU nodes.
After completing the steps above, verify the GPU node:
kubectl get no {nodeName} -oyaml
The node details now report a value for nvidia.com/gpu, which means the configuration took effect; GPU resources are exposed to the Kubernetes cluster as whole cards. Alternatively, GPUs can be exposed to the cluster by GPU memory to enable shared GPU scheduling.
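As a further smoke test, you can schedule a pod that requests one GPU and runs nvidia-smi. A sketch — the pod name and CUDA image tag below are illustrative choices, not from this guide:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test            # illustrative name
spec:
  restartPolicy: OnFailure
  containers:
  - name: cuda
    image: nvidia/cuda:10.2-base  # any CUDA base image with nvidia-smi works
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1         # request one whole GPU card
```

If the pod completes and kubectl logs gpu-smoke-test shows the nvidia-smi table, GPU scheduling works end to end.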
This completes the deployment and verification of a GPU node in a Kubernetes cluster.
References
Installing the NVIDIA driver: https://www.cnblogs.com/youpeng/p/10887346.html
Disabling nouveau: http://www.iewb.net/index.php/qg/3717.html
Official NVIDIA driver downloads: https://www.nvidia.cn/Download/index.aspx?lang=cn
Installing NVIDIA-Docker2: https://fanfuhan.github.io/2019/11/22/docker_based_use/
Fixing nvidia-docker2 installation failures on Ubuntu 18: https://blog.csdn.net/wuzhongli/article/details/86539433
Kubernetes documentation on scheduling GPUs: https://kubernetes.io/zh/docs/tasks/manage-gpus/scheduling-gpus/
NVIDIA device plugin documentation: https://github.com/NVIDIA/k8s-device-plugin