[Big Data Platform] Docker + Conda3 + TensorFlow 1.15 + Google models + remote Jupyter

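Setting up a remote, Docker-based TensorFlow Jupyter environment

What this article solves

Most guides to a remote, Docker-based TensorFlow environment revolve around the tensorflow image. Its advantage is that installation is simple; the rough steps are "Nvidia/CUDA >> Nvidia-Docker2 >> Tensorflow-xx-xx-...". Its drawbacks are:

--- There is no Anaconda environment, so once you install one you have to work out how a new conda environment can use the TensorFlow that ships in the image

--- The default Jupyter Notebook setup is clumsy; try configuring a remote-access password and, well, good luck

--- If, like me, you need the Google models, same story, good luck

So consider this: why not do "Nvidia/CUDA >> Nvidia-Docker2 >> Anaconda" and then install everything else with pip?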

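Installing Docker and the GPU driver

Docker installation guides are everywhere online, so I won't repeat them here; pick one that matches your OS version.

A recommended tutorial

As for the GPU driver: first confirm that the card is an NVIDIA one, then that it supports GPU computing; after that, just download the driver from the official site.

Installing Nvidia-Docker

Official guide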

  • Ubuntu 16.04/18.04, Debian Jessie/Stretch/Buster

$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
$ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
$ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

$ sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
$ sudo systemctl restart docker
  • CentOS 7 (docker-ce), RHEL 7.4/7.5 (docker-ce), Amazon Linux 1/2

$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
$ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | sudo tee /etc/yum.repos.d/nvidia-docker.repo

$ sudo yum install -y nvidia-container-toolkit
$ sudo systemctl restart docker
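
The first command in both variants builds a distribution string by sourcing /etc/os-release and concatenating ID and VERSION_ID. As a minimal sketch of what that produces (the function names are my own, and the sample os-release text is made up for illustration), the same logic in Python looks roughly like this:

```python
import shlex


def parse_os_release(text):
    """Parse the KEY=value lines of an os-release file into a dict."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        # shlex.split removes the optional shell quoting around the value.
        parts = shlex.split(raw)
        values[key] = parts[0] if parts else ""
    return values


def distribution_id(text):
    """Equivalent of `echo $ID$VERSION_ID` after sourcing os-release."""
    info = parse_os_release(text)
    return info.get("ID", "") + info.get("VERSION_ID", "")


def repo_list_url(distribution):
    """URL pattern used by the apt variant of the commands above."""
    return f"https://nvidia.github.io/nvidia-docker/{distribution}/nvidia-docker.list"


# Hypothetical sample content, not read from a real machine.
sample = 'NAME="Ubuntu"\nID=ubuntu\nVERSION_ID="18.04"\n'
dist = distribution_id(sample)
print(dist)
print(repo_list_url(dist))
```

On a real host you would pass in the contents of /etc/os-release instead of the sample string.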

 

Installing nvidia/cuda

1. First, configure a registry mirror inside China

$ curl -sSL https://get.daocloud.io/daotools/set_mirror.sh | sh -s http://f1361db2.m.daocloud.io
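
I have not inspected set_mirror.sh itself, so treat the following as an assumption: my understanding is that the net effect of such a script is a "registry-mirrors" entry in /etc/docker/daemon.json followed by a Docker restart. A small sketch of that merge (the helper name is hypothetical, and it only works on an in-memory dict rather than touching the real file):

```python
import json


def add_registry_mirror(config, mirror_url):
    """Return a copy of a daemon.json dict with mirror_url in registry-mirrors."""
    updated = dict(config)
    mirrors = list(updated.get("registry-mirrors", []))
    if mirror_url not in mirrors:
        mirrors.append(mirror_url)
    updated["registry-mirrors"] = mirrors
    return updated


# Hypothetical existing daemon.json content.
existing = {"log-driver": "json-file"}
merged = add_registry_mirror(existing, "http://f1361db2.m.daocloud.io")
print(json.dumps(merged, indent=2))
```

If you prefer to edit daemon.json by hand, this is the shape to aim for; restart Docker afterwards.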

2. Check which version to install

2.2  Check the GPU model (1080)

$ lspci | grep -i nvidia

01:00.0 VGA compatible controller: NVIDIA Corporation Device 1b80 (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 10f0 (rev a1)
02:00.0 VGA compatible controller: NVIDIA Corporation Device 1b80 (rev a1)
02:00.1 Audio device: NVIDIA Corporation Device 10f0 (rev a1)
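
If you want to count the GPUs from a script rather than by eye, the lspci output above is easy to parse. This is only a sketch (the function name is mine); it keeps the display controllers and skips the HDMI audio functions that each card also exposes:

```python
def nvidia_display_devices(lspci_output):
    """Return (slot, description) pairs for NVIDIA display controllers."""
    # Some compute cards are listed as "3D controller" instead of VGA.
    display_classes = ("VGA compatible controller", "3D controller")
    devices = []
    for line in lspci_output.splitlines():
        if "nvidia" not in line.lower():
            continue
        slot, _, rest = line.partition(" ")
        device_class, _, description = rest.partition(": ")
        if device_class in display_classes:
            devices.append((slot, description))
    return devices


# The four lines printed by lspci in this article.
sample = """01:00.0 VGA compatible controller: NVIDIA Corporation Device 1b80 (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 10f0 (rev a1)
02:00.0 VGA compatible controller: NVIDIA Corporation Device 1b80 (rev a1)
02:00.1 Audio device: NVIDIA Corporation Device 10f0 (rev a1)"""

gpus = nvidia_display_devices(sample)
print(len(gpus))
```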

2.3  Check that the driver works, and its version (OK)

$ nvidia-smi

Tue Jan 14 17:43:43 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.104      Driver Version: 410.104      CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 00000000:01:00.0 Off |                  N/A |
| 29%   24C    P0    38W / 200W |      0MiB /  8117MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 1080    Off  | 00000000:02:00.0 Off |                  N/A |
| 30%   23C    P0    36W / 200W |      0MiB /  8119MiB |      5%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
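
The two numbers that matter in this output are on the header line: the driver version and the CUDA version it supports. A minimal sketch for pulling them out with a regular expression (the pattern is written against the header format shown above; other nvidia-smi versions may lay the header out differently):

```python
import re

HEADER_RE = re.compile(
    r"Driver Version:\s*(?P<driver>[\d.]+)\s+CUDA Version:\s*(?P<cuda>[\d.]+)"
)


def parse_smi_header(output):
    """Return (driver_version, cuda_version), or None if no header is found."""
    match = HEADER_RE.search(output)
    if match is None:
        return None
    return match.group("driver"), match.group("cuda")


# Header line copied from the nvidia-smi output in this article.
header = "| NVIDIA-SMI 410.104      Driver Version: 410.104      CUDA Version: 10.0     |"
versions = parse_smi_header(header)
print(versions)
```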

2.4  Check the cuDNN version (7)

$ cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 1
#define CUDNN_PATCHLEVEL 3
--
#define CUDNN_VERSION    (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)

#include "driver_types.h"
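
The CUDNN_VERSION macro in that header is plain arithmetic on the three components, so 7.1.3 becomes 7 * 1000 + 1 * 100 + 3 = 7103. A short sketch that reads the three defines from the header text and applies the same formula (function names are my own):

```python
import re


def cudnn_version(header_text):
    """Extract (major, minor, patch) from the text of cudnn.h."""
    parts = []
    for name in ("MAJOR", "MINOR", "PATCHLEVEL"):
        pattern = rf"^#define\s+CUDNN_{name}\s+(\d+)\s*$"
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            raise ValueError(f"CUDNN_{name} not found")
        parts.append(int(match.group(1)))
    return tuple(parts)


def cudnn_version_number(major, minor, patch):
    """Same formula as the CUDNN_VERSION macro in the header."""
    return major * 1000 + minor * 100 + patch


# The grep output shown in this article.
sample = """#define CUDNN_MAJOR 7
#define CUDNN_MINOR 1
#define CUDNN_PATCHLEVEL 3
--
#define CUDNN_VERSION    (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
"""

version = cudnn_version(sample)
number = cudnn_version_number(*version)
print(version, number)
```

Only the major version (7) is needed to pick the image tag in the next step.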

3. Install

Putting the above together, the image I need is nvidia/cuda:10.0-cudnn7-devel-centos7 (the OS variant is up to you, but it affects the Dockerfile below).
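
The tag is just the values collected in step 2 glued together. The sketch below infers the pattern from this one tag, which is an assumption on my part; not every combination it can produce necessarily exists on Docker Hub, so check the tag list before pulling.

```python
def cuda_image_tag(cuda_version, cudnn_major, os_name, flavor="devel"):
    """Compose an nvidia/cuda tag following the pattern of the tag used here."""
    return f"nvidia/cuda:{cuda_version}-cudnn{cudnn_major}-{flavor}-{os_name}"


# CUDA 10.0 from nvidia-smi, cuDNN major 7 from cudnn.h, CentOS 7 by choice.
tag = cuda_image_tag("10.0", 7, "centos7")
print(tag)
```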

 

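Building our Dockerfile

The next step is to build an image on top of nvidia/cuda:10.0-cudnn7-devel-centos7. I have already gathered the installer files here:

https://pan.baidu.com/s/1UDBINARdtqvS3BMUhnsrdQ     extraction code: 2auu

On the host, create the directories /.../install and /.../DockerDir

Put the four installer files and the Dockerfile into /.../install, then change into that directory.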

$ docker build -t tf-gpu . 
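
The build fails at the first COPY if an installer is missing from the build context, so a quick pre-flight check can save a wasted run. This is a sketch only: the four installer names are copied from the ARG defaults in the Dockerfile, the helper name is hypothetical, and the demo runs against a temporary directory instead of your real /.../install.

```python
import os
import tempfile

REQUIRED_FILES = [
    "Dockerfile",
    "Anaconda3-2019.10-Linux-x86_64.sh",
    "tensorflow_gpu-1.15.0-cp37-cp37m-manylinux2010_x86_64.whl",
    "models-master.zip",
    "protobuf-python-3.11.2.tar.gz",
]


def missing_files(build_dir, names=REQUIRED_FILES):
    """Return the required file names that are not present in build_dir."""
    return [
        name for name in names
        if not os.path.isfile(os.path.join(build_dir, name))
    ]


# Demo: a temporary build context that contains only the Dockerfile.
with tempfile.TemporaryDirectory() as build_dir:
    with open(os.path.join(build_dir, "Dockerfile"), "w") as handle:
        handle.write("FROM nvidia/cuda:10.0-cudnn7-devel-centos7\n")
    missing = missing_files(build_dir)

print(missing)
```

An empty list means all five files are in place and docker build can proceed.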

Here is the finished Dockerfile; the only thing you need to change is the Jupyter access password.

### Base image
FROM nvidia/cuda:10.0-cudnn7-devel-centos7


### Declare variables
# Anaconda installer file
ARG conda_install=Anaconda3-2019.10-Linux-x86_64.sh

# TensorFlow installer file
ARG tensorflow_install=tensorflow_gpu-1.15.0-cp37-cp37m-manylinux2010_x86_64.whl

# Google models archive
ARG models_insatll=models-master.zip

# Protoc installer file
ARG protoc_insatll=protobuf-python-3.11.2.tar.gz
ARG protoc_dir=protobuf-3.11.2

# Jupyter settings
ARG jupyter_password=6789@jkl


### Update/install basic tools (a single RUN keeps them in one image layer)
RUN yum install -y unzip zip vim wget gcc automake autoconf libtool make


### Install Anaconda3
COPY ${conda_install} /root/

RUN /usr/bin/bash /root/${conda_install} -b -p /usr/local/anaconda3

ENV CONDA_HOME=/usr/local/anaconda3
ENV PATH=${CONDA_HOME}/bin:$PATH


### Install tensorflow-gpu-1.15.0
COPY ${tensorflow_install} /root/

RUN ${CONDA_HOME}/bin/pip install /root/${tensorflow_install}


### Install the Google models
COPY ${models_insatll} /root/

RUN /usr/bin/unzip /root/${models_insatll} -d /usr/local/
RUN mv /usr/local/models-master /usr/local/models
RUN cd ${CONDA_HOME}/lib/python3.7/site-packages; \
	echo "/usr/local/models/research" >> tensorflow_model.pth; \
	echo "/usr/local/models/research/slim" >> tensorflow_model.pth


### Install Protoc (steps are chained with && so a failed step aborts the build)
COPY ${protoc_insatll} /root/

RUN cd /root/ && \
	/usr/bin/tar -zxvf ${protoc_insatll}
RUN cd /root/${protoc_dir} && \
	./configure --prefix=/usr/local/protobuf && \
	make && \
	make install

ENV PROTOC_HOME=/usr/local/protobuf
ENV PATH=${PROTOC_HOME}/bin:$PATH
ENV PKG_CONFIG_PATH=/usr/local/protobuf/lib/pkgconfig/

RUN echo "/usr/local/protobuf/lib" >> /etc/ld.so.conf; \
	ldconfig;


### Configure Jupyter
RUN ${CONDA_HOME}/bin/jupyter notebook --generate-config
RUN cd /root/; \
	echo "from notebook.auth import passwd" >> get_pw.py; \
	echo "pwd = passwd('${jupyter_password}')" >> get_pw.py; \
	echo "print(pwd)" >> get_pw.py
RUN cd /root/.jupyter/; \
	echo "c.NotebookApp.allow_remote_access = True" >> jupyter_notebook_config.py; \
	echo "c.NotebookApp.allow_root = True" >> jupyter_notebook_config.py; \
	echo "c.NotebookApp.ip = '*'" >> jupyter_notebook_config.py; \
	echo "c.NotebookApp.open_browser = False" >> jupyter_notebook_config.py; \
	echo "c.NotebookApp.password = u'`${CONDA_HOME}/bin/python /root/get_pw.py`'" >> jupyter_notebook_config.py; \
	echo "c.NotebookApp.password_required = True" >> jupyter_notebook_config.py; \
	echo "c.NotebookApp.port = 5555" >> jupyter_notebook_config.py; \
	echo "c.NotebookApp.quit_button = False" >> jupyter_notebook_config.py 
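
One step in the Dockerfile that may look odd is writing tensorflow_model.pth into site-packages. At startup, Python's site module reads every .pth file in site-packages and appends each listed directory that exists to sys.path, which is what makes the packages under models/research and models/research/slim importable without touching PYTHONPATH. A self-contained sketch of the mechanism, using temporary directories in place of the real Anaconda and models paths (the helper name is mine):

```python
import os
import site
import sys
import tempfile


def register_paths(site_dir, pth_name, paths):
    """Append paths to a .pth file, one per line, like the echo >> lines."""
    with open(os.path.join(site_dir, pth_name), "a") as handle:
        for path in paths:
            handle.write(path + "\n")


with tempfile.TemporaryDirectory() as root:
    # Stand-ins for ${CONDA_HOME}/lib/python3.7/site-packages and /usr/local/models.
    fake_site = os.path.join(root, "site-packages")
    research = os.path.join(root, "models", "research")
    slim = os.path.join(research, "slim")
    os.makedirs(fake_site)
    os.makedirs(slim)

    register_paths(fake_site, "tensorflow_model.pth", [research, slim])

    # addsitedir processes .pth files the same way interpreter startup does.
    site.addsitedir(fake_site)
    added = [path for path in (research, slim) if path in sys.path]

print(len(added))
```

Directories listed in a .pth file that do not exist are skipped, so the models archive has to be unpacked to the listed location for the imports to work.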

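Launch

With everything in place, the launch command is as follows (the bind-mounted directory and the port mapping can both be changed):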

nvidia-docker run -i \
        -p 5555:5555 \
        -v /.../DockerDir:/LocalDir \
        tf-gpu:latest \
        /bin/bash -c "jupyter notebook" \
        > tf-jupyter.log 2>&1 &

Note: the install steps above only add nvidia-container-toolkit, while the nvidia-docker wrapper command comes from the separate nvidia-docker2 package. If the wrapper is missing on your host, the Docker 19.03+ equivalent is to replace "nvidia-docker run" with "docker run --gpus all".

Open the host's IP on port 5555 in a browser and you're done!
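
If the page does not load, it helps to first check whether the port is reachable at all before digging into the Jupyter log. A minimal sketch using only the standard library (the function name is mine; the demo listens on a throwaway local port so it runs anywhere, and on a real setup you would pass the host's IP and 5555):

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Demo: open a local listening socket on a free port, then probe it.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
demo_port = server.getsockname()[1]

reachable = port_is_open("127.0.0.1", demo_port)
server.close()

print(reachable)
```

A False result for the real host usually points at the port mapping or a firewall rather than at Jupyter itself.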
