Setting Up a TensorFlow-GPU Environment with NVIDIA-Docker

Parts of this article are excerpted from: http://www.cnblogs.com/xuxinkun/p/5983633.html


Docker Installation

Installing Docker requires root privileges. There are two main installation methods:

1. Package manager (apt/yum) installation, which requires configuring the apt or yum repository

2. Script installation via curl (see the sketch below)
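
A minimal sketch of the script-based install, assuming Docker's official convenience script at get.docker.com is acceptable for your environment:

# Download and run Docker's official convenience script (requires root)
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh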

Alibaba Cloud Accelerator Setup

For Docker versions above 1.10, you can enable the accelerator by editing the daemon configuration file /etc/docker/daemon.json:

sudo mkdir -p /etc/docker

sudo tee /etc/docker/daemon.json <<-'EOF'

{

"registry-mirrors": ["https://fird1mfg.mirror.aliyuncs.com"]

}

EOF

sudo systemctl daemon-reload

sudo systemctl restart docker

Here ["https://fird1mfg.mirror.aliyuncs.com"] is the accelerator mirror URL assigned by Alibaba Cloud.
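
To confirm the mirror took effect after the restart, a quick check (the "Registry Mirrors" field in docker info output lists the active mirrors):

# Should print the aliyuncs.com mirror configured above
docker info | grep -A 1 "Registry Mirrors"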

NVIDIA-Docker Installation

Prerequisites:

    GNU/Linux x86_64 with kernel version > 3.10

    Docker >= 1.9 (official docker-engine, docker-ce, or docker-ee only)

    NVIDIA GPU with Architecture > Fermi (2.1)

    NVIDIA drivers >= 340.29 with binary nvidia-modprobe (the driver version depends on the GPU's CUDA compute capability)

CUDA and NVIDIA driver installation:

    This satisfies the NVIDIA-Docker prerequisite of NVIDIA drivers >= 340.29 with the nvidia-modprobe binary.

    Download and install the CUDA version that matches your graphics card.
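
After installation, a quick sanity check against the prerequisites above, assuming the driver and CUDA toolkit landed in their default locations:

    # Driver check: should list the GPU(s) and driver version
    nvidia-smi
    # nvidia-modprobe must be available for NVIDIA-Docker
    which nvidia-modprobe
    # CUDA toolkit check: prints the installed toolkit version
    /usr/local/cuda/bin/nvcc --version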

NVIDIA-Docker installation:

    # Install nvidia-docker and nvidia-docker-plugin
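
    One possible install path is the packages from the project's GitHub releases page; the version and filename below are assumptions, so check the releases page for the current ones:

    # Hypothetical example for an RPM-based distro (release filename assumed)
    wget https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.1/nvidia-docker-1.0.1-1.x86_64.rpm
    sudo rpm -i nvidia-docker-1.0.1-1.x86_64.rpm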

    # Test nvidia-smi

    First, start the plugin:

   nohup nvidia-docker-plugin &

    Then run nvidia-smi in a container:

    nvidia-docker run --rm nvidia/cuda nvidia-smi

    Note: due to SELinux, the above command may fail with a path-not-found error; in that case, use the following command instead:

    nvidia-docker run -ti --rm --privileged=true nvidia/cuda nvidia-smi

    or:

     docker run -ti `curl -s http://localhost:3476/v1.0/docker/cli` --rm --privileged=true nvidia/cuda nvidia-smi


Using the TensorFlow Image

TensorFlow GPU support

TensorFlow GPU in Docker

Docker can expose GPU devices to containers. NVIDIA officially provides nvidia-docker, which substitutes the nvidia-docker command line for the docker command line when using GPUs:

nvidia-docker run -it -p 8888:8888 gcr.io/tensorflow/tensorflow:latest-gpu
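
nvidia-docker 1.0 can also restrict which GPUs a container sees via the NV_GPU environment variable; a small example, assuming the host has at least two GPUs:

# Expose only GPUs 0 and 1 to the container
NV_GPU='0,1' nvidia-docker run -it -p 8888:8888 gcr.io/tensorflow/tensorflow:latest-gpu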

This approach is fairly intrusive to Docker itself, so NVIDIA also provides the nvidia-docker-plugin approach. Its usage flow is as follows:

First, start nvidia-docker-plugin on the host:

[root@A01-R06-I184-22 nvidia-docker]# ./nvidia-docker-plugin

./nvidia-docker-plugin | 2016/10/10 00:01:12 Loading NVIDIA unified memory

./nvidia-docker-plugin | 2016/10/10 00:01:12 Loading NVIDIA management library

./nvidia-docker-plugin | 2016/10/10 00:01:17 Discovering GPU devices

./nvidia-docker-plugin | 2016/10/10 00:01:18 Provisioning volumes at /var/lib/nvidia-docker/volumes

./nvidia-docker-plugin | 2016/10/10 00:01:18 Serving plugin API at /run/docker/plugins

./nvidia-docker-plugin | 2016/10/10 00:01:18 Serving remote API at localhost:3476

You can see that nvidia-docker-plugin is listening on port 3476. Then, on the host, run docker run -ti `curl -s http://localhost:3476/v1.0/docker/cli` -p 8890:8888 gcr.io/tensorflow/tensorflow:latest-gpu /bin/bash to create a TensorFlow GPU container, and verify inside the container that import tensorflow succeeds.

[root@A01-R06-I184-22 ~]# docker run -ti `curl -s http://localhost:3476/v1.0/docker/cli` -p 8890:8888 gcr.io/tensorflow/tensorflow:latest-gpu /bin/bash

root@7087e1f99062:/notebooks# python

Python 2.7.6 (default, Jun 22 2015, 17:58:13) [GCC 4.8.2] on linux2

Type "help", "copyright", "credits" or "license" for more information.

>>> import tensorflow

I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so locally

I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so locally

I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so locally

I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally

I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so locally

>>>

How nvidia-docker-plugin works

The key is that the plugin exposes an HTTP API that returns the docker command-line flags needed to expose the GPUs:

curl -s http://localhost:3476/v1.0/docker/cli

--volume-driver=nvidia-docker --volume=nvidia_driver_352.39:/usr/local/nvidia:ro --device=/dev/nvidiactl --device=/dev/nvidia-uvm --device=/dev/nvidia0 --device=/dev/nvidia1 --device=/dev/nvidia2 --device=/dev/nvidia3
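
These flags mount the driver volume and pass the GPU device nodes into the container. The plugin's remote API also appears to support selecting specific devices via a query parameter; the dev syntax below follows the nvidia-docker 1.0 API and should be verified against your plugin version:

# Ask the plugin for flags exposing only GPU 0, then run nvidia-smi in a container
docker run -ti `curl -s 'http://localhost:3476/v1.0/docker/cli?dev=0'` --rm nvidia/cuda nvidia-smi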
