Setting up torch on Ubuntu 16.04 (CUDA 9.0 + cuDNN 7.0)

Wishing my grandmother better and better health.

References:
http://blog.csdn.net/chenhaifeng2016/article/details/68957732
http://www.52nlp.cn/%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0%E4%B8%BB%E6%9C%BA%E7%8E%AF%E5%A2%83%E9%85%8D%E7%BD%AE-ubuntu-16-04-nvidia-gtx-1080-cuda-8

Contents

  • Install the NVIDIA driver
  • Install CUDA
  • Install cuDNN
  • Install torch

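Install the NVIDIA driver

Ubuntu 16.04 ships with the third-party open-source driver nouveau by default. It must be disabled before installing the NVIDIA driver; otherwise the two conflict and the NVIDIA driver installation fails.
Edit blacklist.conf: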

sudo vim /etc/modprobe.d/blacklist.conf

Append the following two lines at the end of the file:

blacklist nouveau
options nouveau modeset=0
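
The same edit can be scripted. This is a minimal sketch, assuming a POSIX shell; `append_once` is a hypothetical helper name, not part of any tool. It adds a line only when the file does not already contain it, so running it twice is harmless.

```shell
# append_once FILE LINE: add LINE to FILE unless an identical line already exists.
# (Hypothetical helper, for illustration only.)
append_once() {
    file=$1
    line=$2
    # -x: match the whole line, -F: fixed string (no regex), -q: no output
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Intended use, run as root:
#   append_once /etc/modprobe.d/blacklist.conf 'blacklist nouveau'
#   append_once /etc/modprobe.d/blacklist.conf 'options nouveau modeset=0'
```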

Regenerate the initramfs so the change takes effect:

sudo update-initramfs -u

Reboot the system (this step is required).

Verify that nouveau is disabled; no output means it was disabled successfully.

lsmod | grep nouveau
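
For use in a script, the check can be wrapped so that it matches the module name exactly instead of any line containing the string. This is a sketch, assuming a POSIX shell with awk; `nouveau_loaded` is a hypothetical helper name.

```shell
# nouveau_loaded: read `lsmod` output on stdin and succeed (exit 0) only if a
# module named exactly "nouveau" is listed. (Hypothetical helper.)
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit !found }'
}

# Intended use:
#   if lsmod | nouveau_loaded; then
#       echo "nouveau is still loaded"
#   else
#       echo "nouveau is disabled"
#   fi
```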

Stop the graphical session:

sudo service lightdm stop

Reboot into the BIOS setup and disable Secure Boot.
Install the driver downloaded from the NVIDIA website:

sudo sh ./NVIDIA-Linux-x86_64-384.90.run
sudo reboot
sudo service lightdm restart

Install CUDA (download from the official site)

1) Download CUDA from the official site and run the installer

sudo sh cuda_9.0._linux.run 

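The installer asks you to confirm a series of prompts. The critical one is whether to install the older 361 driver bundled with the installer:

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 361.62?

The answer must be n; otherwise the GTX 1080 driver installed earlier is wasted and many problems follow.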
2) Install the missing libraries

sudo apt-get install freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev

5) Verify the installation

nvidia-smi
nvcc -V

Output (this sample comes from a CUDA 8.0 toolkit; a CUDA 9.0 install should report release 9.0 instead):

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Wed_May__4_21:01:56_CDT_2016
Cuda compilation tools, release 8.0, V8.0.26
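
To confirm which toolkit `nvcc` actually belongs to, the release number can be pulled out of that output. This is a sketch, assuming the "release X.Y" wording shown above; `cuda_release` is a hypothetical helper name.

```shell
# cuda_release: read `nvcc -V` output on stdin and print the toolkit release
# (for example "9.0"). Prints nothing when no release line is found.
# (Hypothetical helper.)
cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Intended use:
#   [ "$(nvcc -V | cuda_release)" = "9.0" ] && echo "CUDA 9.0 toolkit found"
```

If the printed release is not the one just installed, an older toolkit is probably earlier on PATH.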

6) Build the CUDA samples as a test

cd '/home/zhou/NVIDIA_CUDA-8.0_Samples'
make

Install cuDNN 7 (download from the official site)


tar -zxvf cudnn-9.0-linux-x64-v7.0.tar.gz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/
sudo chmod a+r /usr/local/cuda/include/cudnn.h
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
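
A quick way to confirm the header was copied is to read the version macros back from it. This is a sketch, assuming the header defines `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL` as cuDNN 7 headers do; `cudnn_version` is a hypothetical helper name.

```shell
# cudnn_version HEADER: print MAJOR.MINOR.PATCHLEVEL as defined in a cudnn.h
# file. Prints nothing if the macros are missing. (Hypothetical helper.)
cudnn_version() {
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { if (major != "") print major "." minor "." patch }
    ' "$1"
}

# Intended use:
#   cudnn_version /usr/local/cuda/include/cudnn.h
```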

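Verification
The CUDA samples include a deviceQuery program. Running it prints information about the device; if the last line shows PASS, the installation succeeded.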
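
The PASS check can also be scripted. This is a sketch, assuming deviceQuery ends its output with the line "Result = PASS" and that the sample lives under 1_Utilities/deviceQuery in the samples tree; `device_query_passed` is a hypothetical helper name.

```shell
# device_query_passed: read deviceQuery output on stdin and succeed (exit 0)
# only if it reports "Result = PASS". (Hypothetical helper.)
device_query_passed() {
    grep -q 'Result = PASS'
}

# Intended use (the samples path is the one used in the build step above;
# the 1_Utilities/deviceQuery subdirectory is an assumption about the sample layout):
#   cd /home/zhou/NVIDIA_CUDA-8.0_Samples/1_Utilities/deviceQuery
#   ./deviceQuery | device_query_passed && echo "CUDA device OK"
```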

Install torch (official guide: http://torch.ch/docs/getting-started.html)

Install the cuDNN bindings for torch

git clone https://github.com/soumith/cudnn.torch.git -b R7 && cd cudnn.torch && luarocks make cudnn-scm-1.rockspec

The build failed with the following error:

lib/THC/CMakeFiles/THC.dir/build.make:4243: recipe for target 'lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorMathPairwise.cu.o' failed
make[2]: *** [lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorMathPairwise.cu.o] Error 1
/pkgbuild/torch/torch/extra/cutorch/lib/THC/generic/THCTensorMath.cu(393): error: more than one operator "==" matches these operands:
            function "operator==(const __half &, const __half &)"
            function "operator==(half, half)"
            operand types are: half == half

/pkgbuild/torch/torch/extra/cutorch/lib/THC/generic/THCTensorMath.cu(414): error: more than one operator "==" matches these operands:
            function "operator==(const __half &, const __half &)"
            function "operator==(half, half)"
            operand types are: half == half

Fix:

export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__"
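
If `TORCH_NVCC_FLAGS` already holds other options, a plain assignment would discard them. This sketch, assuming a POSIX shell, appends the flag only when it is missing; `add_nvcc_flag` is a hypothetical helper name.

```shell
# add_nvcc_flag FLAG: append FLAG to TORCH_NVCC_FLAGS unless it is already
# present, then export the variable. (Hypothetical helper.)
add_nvcc_flag() {
    case " ${TORCH_NVCC_FLAGS:-} " in
        *" $1 "*) ;;   # already present, nothing to do
        *) TORCH_NVCC_FLAGS="${TORCH_NVCC_FLAGS:+$TORCH_NVCC_FLAGS }$1" ;;
    esac
    export TORCH_NVCC_FLAGS
}

add_nvcc_flag "-D__CUDA_NO_HALF_OPERATORS__"
printf '%s\n' "$TORCH_NVCC_FLAGS"
```

The variable only affects builds started from the same shell, so rerun the failed build there after setting it.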

Another error:

 no CUDA-capable device is detected at /tmp/luarocks_cutorch-scm-1-5456/cutorch/lib/THC/THCGeneral.c:70

Fix: reinstall the driver.

Test (start the th interpreter, then enter the Lua lines):

th
require "cudnn"
cudnn.benchmark = true
cudnn.fastest = true
cudnn.verbose = true -- set to false by default

Pretrained torch ResNet models:
https://github.com/facebook/fb.resnet.torch/tree/master/pretrained#fine-tuning-on-a-custom-dataset
Guide to training on your own dataset:
http://coldmooon.github.io/2017/03/20/train_on_custom_datasets_in_fb.resnet.torch_/
Source code for the main ResNet layers:
https://github.com/gcr/torch-residual-networks/blob/master/residual-layers.lua
