Ubuntu AI Environment Setup

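Switch the apt sources to the Aliyun mirror (or another mirror in China); downloads become much faster.
vi /etc/apt/sources.list
Replace the entire contents with: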

deb http://mirrors.aliyun.com/ubuntu/ bionic main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-security main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-updates main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-proposed main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-backports main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-security main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-updates main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-proposed main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-backports main restricted universe multiverse

After the switch, packages download and install in seconds.
sudo apt-get update && sudo apt-get upgrade
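
The source list above is one line pattern repeated over five suites and two package types. As a rough sketch, the same list could be generated for a different release codename. The function name build_sources and the idea of passing in the codename (for example the output of lsb_release -cs) are illustrative assumptions, not part of the original setup.

```python
# Sketch: regenerate the mirror list shown above for a given Ubuntu codename.
# build_sources is a hypothetical helper, not a tool shipped with apt.
MIRROR = "http://mirrors.aliyun.com/ubuntu/"
SUITES = ["", "-security", "-updates", "-proposed", "-backports"]
COMPONENTS = "main restricted universe multiverse"


def build_sources(codename, mirror=MIRROR):
    """Return the deb and deb-src lines for every suite of one release."""
    lines = []
    for kind in ("deb", "deb-src"):
        for suffix in SUITES:
            lines.append(f"{kind} {mirror} {codename}{suffix} {COMPONENTS}")
    return lines


if __name__ == "__main__":
    # "bionic" is the codename of Ubuntu 18.04, as used in this article.
    print("\n".join(build_sources("bionic")))
```

Redirecting the output to a file and reviewing it before copying it over /etc/apt/sources.list keeps a typo from breaking apt.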

1. Installing CUDA 10.0

The NVIDIA driver was installed beforehand; the GPU is a GTX 1660.

https://developer.nvidia.com/cuda-10.0-download-archive?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1804&target_type=runfilelocal

1.1 Download the installer

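Find the matching version and download it. (Downloading with Xunlei is considerably faster; transfer the file to the Ubuntu machine afterward.)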

TensorFlow 1.13.1 supports CUDA only up to 10.0, so download CUDA 10.0 rather than 10.1.

1.2 Install

List the downloaded files:

root@doyen-ai:/home/software/ubuntu18.04_cuda10.0# ls
cuda_10.0.130_410.48_linux.run                  libcudnn7-dev_7.5.1.10-1+cuda10.0_amd64.deb
cudnn-10.0-linux-x64-v7.5.1.10.solitairetheme8  libcudnn7-doc_7.5.1.10-1+cuda10.0_amd64.deb
libcudnn7_7.5.1.10-1+cuda10.0_amd64.deb

Start the installation:
sudo sh cuda_10.0.130_410.48_linux.run

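The license agreement is very long; you have to keep pressing Enter for a while to page through it. (The CUDA 10.1 installer handles this much better.)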

Do you accept the previously read EULA?
accept/decline/quit: accept

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 410.48?
(y)es/(n)o/(q)uit: n

Install the CUDA 10.0 Toolkit?
(y)es/(n)o/(q)uit: y

Enter Toolkit Location
 [ default is /usr/local/cuda-10.0 ]:  

Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y

Install the CUDA 10.0 Samples?
(y)es/(n)o/(q)uit: y

Enter CUDA Samples Location
 [ default is /root ]: 

Installing the CUDA Toolkit in /usr/local/cuda-10.0 ...
Missing recommended library: libGLU.so
Missing recommended library: libX11.so
Missing recommended library: libXi.so
Missing recommended library: libXmu.so

Installing the CUDA Samples in /root ...
Copying samples to /root/NVIDIA_CUDA-10.0_Samples now...
Finished copying samples.

===========
= Summary =
===========

Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-10.0
Samples:  Installed in /root, but missing recommended libraries

Please make sure that
 -   PATH includes /usr/local/cuda-10.0/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-10.0/lib64, or, add /usr/local/cuda-10.0/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-10.0/bin

Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.0/doc/pdf for detailed information on setting up CUDA.

***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required for CUDA 10.0 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run -silent -driver

Logfile is /tmp/cuda_install_16565.log

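The installer reports that the toolkit and samples were installed. The "Incomplete installation" warning is expected here, because the driver option was declined (a driver was already installed).

1.3 Verify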
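
Since the driver was skipped, it is worth confirming that the driver already on the machine meets the minimum quoted in the installer warning (384.00). The following is a minimal sketch, assuming nvidia-smi is on the PATH; the helper names are illustrative.

```python
# Sketch: compare the installed NVIDIA driver with the minimum from the
# installer warning. Helper names are hypothetical.
import subprocess

MIN_DRIVER = (384, 0)  # "at least 384.00" in the installer message above


def parse_version(text):
    """Turn a dotted version string such as '410.48' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_ok(version_text, minimum=MIN_DRIVER):
    """True when the given driver version is at least the minimum."""
    return parse_version(version_text) >= minimum


def installed_driver():
    """Return the driver version reported by nvidia-smi, or None if unavailable."""
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            stdout=subprocess.PIPE, stderr=subprocess.DEVNULL,
            universal_newlines=True, check=True,
        ).stdout
    except (OSError, subprocess.CalledProcessError):
        return None
    lines = out.strip().splitlines()
    return lines[0].strip() if lines else None


if __name__ == "__main__":
    version = installed_driver()
    if version is None:
        print("nvidia-smi not available")
    else:
        print(version, "OK" if driver_ok(version) else "too old")
```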

vi ~/.bashrc
Append the following to the end of the file:

export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
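
The `${PATH:+:${PATH}}` form in the two export lines appends `:` plus the old value only when the variable is already set and non-empty, which avoids a stray trailing colon. A small Python imitation of that rule (the function name prepend is only for illustration):

```python
# Sketch: what NEW${VAR:+:${VAR}} evaluates to in the shell.
def prepend(new_dir, current):
    """Return new_dir, followed by ':current' only if current is non-empty."""
    return new_dir + (":" + current if current else "")


if __name__ == "__main__":
    # Variable unset or empty: no trailing colon is produced.
    print(prepend("/usr/local/cuda-10.0/lib64", ""))
    # Variable already set: the new directory is placed in front.
    print(prepend("/usr/local/cuda-10.0/bin", "/usr/bin:/bin"))
```

The second call prints /usr/local/cuda-10.0/bin:/usr/bin:/bin, so the CUDA directory is searched first.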

Run source ~/.bashrc to apply the change, then check the compiler version with nvcc -V:

root@doyen-ai:/home/software/ubuntu18.04_cuda10.0# source ~/.bashrc
root@doyen-ai:/home/software/ubuntu18.04_cuda10.0# nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130

CUDA 10.0 is installed correctly.

2. Installing cuDNN 7.5.1

2.1 Download the packages
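
For scripted setups, the release number can be pulled out of the nvcc -V output instead of being read by eye. A minimal sketch, using the output shown above as sample text; cuda_release is a hypothetical helper name.

```python
# Sketch: extract the CUDA release (major, minor) from `nvcc -V` output.
import re

SAMPLE = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130"""


def cuda_release(nvcc_output):
    """Return (major, minor) from the 'release X.Y' field, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    print(cuda_release(SAMPLE))
```

On a live system the text would come from running nvcc -V through subprocess rather than from the SAMPLE constant.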

https://developer.nvidia.com/rdp/cudnn-download
Downloading cuDNN requires a registered NVIDIA developer account and login.

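Download the items for CUDA 10.0 that appear in the file listing in section 1.2: the cuDNN library archive and the three Ubuntu 18.04 .deb packages (libcudnn7, libcudnn7-dev, libcudnn7-doc).

2.2 Install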

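# If the archive was saved with a .solitairetheme8 extension (as in the file listing in 1.2), rename it to .tgz first:
# mv cudnn-10.0-linux-x64-v7.5.1.10.solitairetheme8 cudnn-10.0-linux-x64-v7.5.1.10.tgz
tar -zxvf cudnn-10.0-linux-x64-v7.5.1.10.tgz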
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*

sudo dpkg -i libcudnn7_7.5.1.10-1+cuda10.0_amd64.deb
sudo dpkg -i libcudnn7-dev_7.5.1.10-1+cuda10.0_amd64.deb
sudo dpkg -i libcudnn7-doc_7.5.1.10-1+cuda10.0_amd64.deb

Output:

root@doyen-ai:/home/software/ubuntu18.04_cuda10.0# tar -zxvf cudnn-10.0-linux-x64-v7.5.1.10.tgz
cuda/include/cudnn.h
cuda/NVIDIA_SLA_cuDNN_Support.txt
cuda/lib64/libcudnn.so
cuda/lib64/libcudnn.so.7
cuda/lib64/libcudnn.so.7.5.1
cuda/lib64/libcudnn_static.a
root@doyen-ai:/home/software/ubuntu18.04_cuda10.0# sudo cp cuda/include/cudnn.h /usr/local/cuda/include
root@doyen-ai:/home/software/ubuntu18.04_cuda10.0# sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
root@doyen-ai:/home/software/ubuntu18.04_cuda10.0# sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libc
root@doyen-ai:/home/software/ubuntu18.04_cuda10.0# 
root@doyen-ai:/home/software/ubuntu18.04_cuda10.0# sudo dpkg -i libcudnn7_7.5.1.10-1+cuda10.0_amd64.deb
Selecting previously unselected package libcudnn7.
(Reading database ... 168675 files and directories currently installed.)
Preparing to unpack libcudnn7_7.5.1.10-1+cuda10.0_amd64.deb ...
Unpacking libcudnn7 (7.5.1.10-1+cuda10.0) ...
Setting up libcudnn7 (7.5.1.10-1+cuda10.0) ...
Processing triggers for libc-bin (2.27-3ubuntu1) ...
root@doyen-ai:/home/software/ubuntu18.04_cuda10.0# sudo dpkg -i libcudnn7-dev_7.5.1.10-1+cuda10.0_amd64.deb
Selecting previously unselected package libcudnn7-dev.
(Reading database ... 168681 files and directories currently installed.)
Preparing to unpack libcudnn7-dev_7.5.1.10-1+cuda10.0_amd64.deb ...
Unpacking libcudnn7-dev (7.5.1.10-1+cuda10.0) ...
Setting up libcudnn7-dev (7.5.1.10-1+cuda10.0) ...
update-alternatives: using /usr/include/x86_64-linux-gnu/cudnn_v7.h to provide /usr/include/cudnn.h (libcudnn) in auto mode
root@doyen-ai:/home/software/ubuntu18.04_cuda10.0# sudo dpkg -i libcudnn7-doc_7.5.1.10-1+cuda10.0_amd64.deb
Selecting previously unselected package libcudnn7-doc.
(Reading database ... 168687 files and directories currently installed.)
Preparing to unpack libcudnn7-doc_7.5.1.10-1+cuda10.0_amd64.deb ...
Unpacking libcudnn7-doc (7.5.1.10-1+cuda10.0) ...
Setting up libcudnn7-doc (7.5.1.10-1+cuda10.0) ...

2.3 Verify

Command to check the cuDNN version:
cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2

root@doyen-ai:/home/software/ubuntu18.04_cuda10.0# cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 5
#define CUDNN_PATCHLEVEL 1
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)

#include "driver_types.h"

The header reports cuDNN 7.5.1, so the installation works.

3. Installing tensorflow-gpu

3.1 Check the Python environment
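
The grep output above shows three defines plus the formula that combines them into CUDNN_VERSION. A minimal sketch that reads the same defines and applies that formula; the helper names are illustrative, and the header path in the usage part is the one used in this article.

```python
# Sketch: read the cuDNN version defines and compute CUDNN_VERSION.
import os
import re

HEADER = "/usr/local/cuda/include/cudnn.h"  # path used in this article


def cudnn_version(header_text):
    """Return (major, minor, patch) from cudnn.h text, or None if a define is missing."""
    parts = []
    for name in ("MAJOR", "MINOR", "PATCHLEVEL"):
        match = re.search(r"#define\s+CUDNN_%s\s+(\d+)" % name, header_text)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


def cudnn_version_number(version):
    """Apply the header's formula: MAJOR * 1000 + MINOR * 100 + PATCHLEVEL."""
    major, minor, patch = version
    return major * 1000 + minor * 100 + patch


if __name__ == "__main__":
    if os.path.exists(HEADER):
        with open(HEADER) as handle:
            print(cudnn_version(handle.read()))
    else:
        print("cudnn.h not found at", HEADER)
```

For the header shown above this gives (7, 5, 1), and the combined number is 7501.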

root@doyen-ai:/home# python

Command 'python' not found, but can be installed with:

apt install python3       
apt install python        
apt install python-minimal

You also have python3 installed, you can run 'python3' instead.

Ubuntu 18.04 ships with Python 3.6.8 by default:
root@doyen-ai:/home# python3
Python 3.6.8 (default, Jan 14 2019, 11:02:34)
[GCC 8.0.1 20180414 (experimental) [trunk revision 259383]] on linux
Type "help", "copyright", "credits" or "license" for more information.


3.2 Install pip

apt-get install python3-pip python3-dev

root@doyen-ai:/home# pip3 -V
pip 9.0.1 from /usr/lib/python3/dist-packages (python 3.6)

Then upgrade setuptools:
pip3 install setuptools --upgrade

3.3 Install tensorflow-gpu

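pip3 install tensorflow-gpu
(At the time of writing this resolved to 1.13.1, as the log below shows. To reproduce this setup later, pin the version: pip3 install tensorflow-gpu==1.13.1)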

root@doyen-ai:/home# pip3 install tensorflow-gpu
Collecting tensorflow-gpu
  Downloading https://files.pythonhosted.org/packages/7b/b1/0ad4ae02e17ddd62109cd54c291e311c4b5fd09b4d0678d3d6ce4159b0f0/tensorflow_gpu-1.13.1-cp36-cp36m-manylinux1_x86_64.whl (345.2MB)

Successfully installed absl-py-0.7.1 astor-0.7.1 gast-0.2.2 grpcio-1.20.1 h5py-2.9.0 keras-applications-1.0.7 keras-preprocessing-1.0.9 markdown-3.1 mock-3.0.5 numpy-1.16.3 protobuf-3.7.1 tensorboard-1.13.1 tensorflow-estimator-1.13.0 tensorflow-gpu-1.13.1 termcolor-1.1.0 werkzeug-0.15.4

pip reports that the installation completed.

3.4 Verify the installation

root@doyen-ai:/home# python3
Python 3.6.8 (default, Jan 14 2019, 11:02:34) 
[GCC 8.0.1 20180414 (experimental) [trunk revision 259383]] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> a = tf.random_normal((100, 100))
>>> b = tf.random_normal((100, 500))
>>> c = tf.matmul(a, b)
>>> sess = tf.InteractiveSession()
2019-05-16 15:57:26.741765: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-05-16 15:57:27.372247: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-05-16 15:57:27.373652: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x1619150 executing computations on platform CUDA. Devices:
2019-05-16 15:57:27.373734: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): GeForce GTX 1660, Compute Capability 7.5
2019-05-16 15:57:27.400388: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2904000000 Hz
2019-05-16 15:57:27.401583: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x1cdd240 executing computations on platform Host. Devices:
2019-05-16 15:57:27.401651: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
2019-05-16 15:57:27.402011: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: 
name: GeForce GTX 1660 major: 7 minor: 5 memoryClockRate(GHz): 1.83
pciBusID: 0000:01:00.0
totalMemory: 5.80GiB freeMemory: 5.73GiB
2019-05-16 15:57:27.402066: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-05-16 15:57:27.405302: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-05-16 15:57:27.405366: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 
2019-05-16 15:57:27.405390: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N 
2019-05-16 15:57:27.405581: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5567 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1660, pci bus id: 0000:01:00.0, compute capability: 7.5)
>>> sess.run(c)
2019-05-16 15:57:45.835122: I tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library libcublas.so.10.0 locally
array([[  6.262812 ,  -1.9345528,  10.1873865, ...,   9.533573 ,
         -7.4053297,  -4.2541947],
       [ 10.201033 ,   3.6828916,  -2.0874305, ...,  11.704482 ,
          2.2292233, -12.751171 ],
       [ -4.9506807,  -7.9405203,  11.641254 , ...,  10.210195 ,
         -3.6261683,  -1.245208 ],
       ...,
       [  6.1733346, -11.296464 ,  -6.5138006, ...,  -8.0698185,
         -4.31228  ,   6.034325 ],
       [  8.435815 ,  -6.479247 ,  -1.6091456, ...,   5.5824223,
          5.4707727,  11.140205 ],
       [ -8.973054 , -10.001549 , -15.808032 , ...,  20.240196 ,
          7.126047 ,   9.673972 ]], dtype=float32)
>>>

Subjectively, the GTX 1660 on Ubuntu 18.04 feels much faster than a GTX 1060 on Windows 10.

4. Installing OpenCV 4.1 with CUDA support

4.1 Installation script
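
The "Created TensorFlow device" line in the log above carries the GPU name and its compute capability, which is the same number CUDA_ARCH_BIN expects in the OpenCV build in section 4. A minimal sketch that extracts both from such a log line; the regular expression assumes the TensorFlow 1.13 log format shown above, and gpu_from_log is a hypothetical name.

```python
# Sketch: pull the GPU name and compute capability out of a TensorFlow 1.x log line.
import re

LOG_LINE = (
    "Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 "
    "with 5567 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1660, "
    "pci bus id: 0000:01:00.0, compute capability: 7.5)"
)

PATTERN = re.compile(
    r"name: ([^,]+), pci bus id: [^,]+, compute capability: (\d+\.\d+)"
)


def gpu_from_log(line):
    """Return (name, compute_capability) or None when the line does not match."""
    match = PATTERN.search(line)
    return (match.group(1), match.group(2)) if match else None


if __name__ == "__main__":
    print(gpu_from_log(LOG_LINE))
```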

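Tutorials abound; here a single script runs the whole build automatically. It targets Ubuntu 18.04.
Look up your GPU's CUDA compute capability and set CUDA_ARCH_BIN in the script to match. The script below uses 6.1, which corresponds to Pascal cards such as the GTX 1060; the GTX 1660 used in this article is 7.5, as the TensorFlow log in section 3.4 reports. The compute capability list is at:
https://developer.nvidia.com/cuda-gpus
By default the original script downloads the dev version of the OpenCV source.
For the stable release, use the following commands for the download step instead: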

curl -L https://github.com/opencv/opencv/archive/4.1.0.zip -o opencv.zip
curl -L https://github.com/opencv/opencv_contrib/archive/4.1.0.zip -o opencv_contrib.zip
unzip opencv.zip 
unzip opencv_contrib.zip 
cd opencv-4.1.0/

The script, installOpenCV-4-on-Ubuntu-18-04.sh:

#!/bin/bash
#
# Takes one argument: the folder in which to download and build OpenCV.
if [ "$#" -ne 1 ]; then
    echo "Usage: $0 <install_folder>"
    exit 1
fi
folder="$1"

echo "** Install requirement"
sudo apt-get update
sudo apt-get install -y build-essential cmake git libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev pkg-config
sudo apt-get install -y libgstreamer1.0-dev libgstreamer-plugins-base1.0-dev
sudo apt-get install -y python2.7-dev python3.6-dev python-dev python-numpy python3-numpy
# libjasper-dev is not in the Ubuntu 18.04 (bionic) repositories, so it is left out here
sudo apt-get install -y libtbb2 libtbb-dev libjpeg-dev libpng-dev libtiff-dev libdc1394-22-dev
sudo apt-get install -y libv4l-dev v4l-utils qv4l2 v4l2ucp 
sudo apt-get install -y curl
sudo apt-get update

echo "** Download opencv-4.1.0"
cd "$folder" || exit 1
curl -L https://github.com/opencv/opencv/archive/4.1.0.zip -o opencv-4.1.0.zip
curl -L https://github.com/opencv/opencv_contrib/archive/4.1.0.zip -o opencv_contrib-4.1.0.zip
unzip opencv-4.1.0.zip 
unzip opencv_contrib-4.1.0.zip 
cd opencv-4.1.0/

echo "** Building..."
mkdir release
cd release/
cmake \
  -D CMAKE_BUILD_TYPE=RELEASE \
  -D OPENCV_GENERATE_PKGCONFIG=YES \
  -D CMAKE_INSTALL_PREFIX=/usr/local \
  -D OPENCV_EXTRA_MODULES_PATH=../../opencv_contrib-4.1.0/modules  \
  -D CUDA_CUDA_LIBRARY=/usr/local/cuda/lib64/stubs/libcuda.so \
  -D CUDA_ARCH_BIN=6.1 \
  -D CUDA_ARCH_PTX="" \
  -D WITH_CUDA=ON \
  -D WITH_TBB=ON \
  -D BUILD_opencv_python3=ON \
  -D BUILD_TESTS=OFF \
  -D BUILD_PERF_TESTS=OFF \
  -D WITH_V4L=ON \
  -D INSTALL_C_EXAMPLES=ON \
  -D INSTALL_PYTHON_EXAMPLES=ON \
  -D BUILD_EXAMPLES=ON \
  -D WITH_OPENGL=ON \
  -D ENABLE_FAST_MATH=1 \
  -D CUDA_FAST_MATH=1 \
  -D WITH_CUBLAS=1 \
  -D WITH_NVCUVID=ON \
  -D WITH_GSTREAMER=ON \
  -D WITH_OPENCL=YES \
  -D WITH_QT=ON \
  -D BUILD_opencv_cudacodec=OFF ..

make -j8
sudo make install
echo "** Install opencv-4.1.0 successfully"
echo "** Bye :)"

If some files fail to download during the build, download them manually first, adjust the corresponding paths, and rerun cmake.

4.2 Test OpenCV 4 from Python

root@doyen-ai:/home/software/opencv/build# python3
Python 3.6.8 (default, Jan 14 2019, 11:02:34) 
[GCC 8.0.1 20180414 (experimental) [trunk revision 259383]] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> print(cv2.__version__)
4.1.0-dev
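
The version string alone does not show whether the build picked up CUDA. One way to check is to scan the text returned by cv2.getBuildInformation(). The sketch below assumes the report contains a line starting with "NVIDIA CUDA:" followed by YES or NO, which is how OpenCV 4 builds usually print it; treat that format as an assumption. cuda_enabled is a hypothetical helper name.

```python
# Sketch: check the OpenCV build report for CUDA support.
import re


def cuda_enabled(build_info):
    """Return True/False from the 'NVIDIA CUDA:' line, or None if the line is absent."""
    match = re.search(r"^\s*NVIDIA CUDA:\s*(\S+)", build_info, flags=re.MULTILINE)
    if match is None:
        return None
    return match.group(1).upper() == "YES"


if __name__ == "__main__":
    try:
        import cv2
    except ImportError:
        print("cv2 is not installed")
    else:
        print(cv2.__version__, cuda_enabled(cv2.getBuildInformation()))
```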

5. Installing PyTorch 1.1 with CUDA 10.0

https://pytorch.org/get-started/locally/

pip3 install https://download.pytorch.org/whl/cu100/torch-1.1.0-cp36-cp36m-linux_x86_64.whl
pip3 install torchvision

root@doyen-ai:/home/software# python3
Python 3.6.8 (default, Jan 14 2019, 11:02:34) 
[GCC 8.0.1 20180414 (experimental) [trunk revision 259383]] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>>

The module imports without errors.

6. Installing MXNet
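
A bare import only shows that the wheel is installed, not that it can reach the GPU. A minimal sketch that goes one step further, using torch.cuda.is_available(), torch.version.cuda, and torch.cuda.get_device_name(); the wrapper function and its dictionary keys are illustrative.

```python
# Sketch: summarize whether PyTorch can see the GPU.
def torch_cuda_summary():
    """Return a small dict describing the PyTorch/CUDA setup, or None if torch is missing."""
    try:
        import torch
    except ImportError:
        return None
    available = torch.cuda.is_available()
    return {
        "torch": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # CUDA version the wheel was built against
        "cuda_available": available,
        "device": torch.cuda.get_device_name(0) if available else None,
    }


if __name__ == "__main__":
    print(torch_cuda_summary())
```

With the cu100 wheel installed above, built_for_cuda would be expected to read 10.0, and cuda_available should be True if the driver and toolkit are working.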

http://mxnet.incubator.apache.org

For CUDA 10.0, install the mxnet-cu100 package:
pip3 install mxnet-cu100

root@doyen-ai:/home/software# python3
Python 3.6.8 (default, Jan 14 2019, 11:02:34) 
[GCC 8.0.1 20180414 (experimental) [trunk revision 259383]] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mxnet as mx
>>> mx.__version__
'1.4.1'
>>> a = mx.nd.ones((2, 3), mx.gpu())
>>> b  = a*2+1
>>> b
[[3. 3. 3.]
 [3. 3. 3.]]
<NDArray 2x3 @gpu(0)>
>>>
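
The session above hard-codes mx.gpu(), which raises an error on a machine without a working GPU. A small sketch of a fallback, assuming mx.context.num_gpus() is available (it is in the MXNet 1.4.1 used here); the function names are illustrative.

```python
# Sketch: choose a GPU context when one is visible, otherwise fall back to CPU.
def mxnet_gpu_count():
    """Return the number of GPUs MXNet can see, or None if mxnet is not installed."""
    try:
        import mxnet as mx
    except ImportError:
        return None
    return mx.context.num_gpus()


def pick_context():
    """Return mx.gpu() if a GPU is visible, else mx.cpu(); None without mxnet."""
    count = mxnet_gpu_count()
    if count is None:
        return None
    import mxnet as mx
    return mx.gpu() if count > 0 else mx.cpu()


if __name__ == "__main__":
    print(mxnet_gpu_count(), pick_context())
```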

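End of article.
(The whole process took about 10 hours: TensorFlow 1.13 does not work with CUDA 10.1, so the system had to be reinstalled, which took a long time, and compiling OpenCV 4 also took a long time.)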
