References:
http://blog.csdn.net/chenhaifeng2016/article/details/68957732
http://www.52nlp.cn/%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0%E4%B8%BB%E6%9C%BA%E7%8E%AF%E5%A2%83%E9%85%8D%E7%BD%AE-ubuntu-16-04-nvidia-gtx-1080-cuda-8
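Ubuntu 16.04 ships with the third-party open-source driver nouveau enabled by default. Before installing the NVIDIA graphics driver you must disable nouveau; otherwise the two conflict and the NVIDIA driver cannot be installed.
Edit the file blacklist.conf: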
sudo vim /etc/modprobe.d/blacklist.conf
Append the following two lines to the end of the file:
blacklist nouveau
options nouveau modeset=0
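The manual edit above can also be scripted. The following is a minimal sketch, not part of the original notes: the function name `append_nouveau_blacklist` is hypothetical, and it only appends each line when it is not already present, so running it twice is harmless.

```shell
# Hypothetical helper: append the two nouveau blacklist entries to a
# modprobe configuration file, skipping any entry that already exists.
append_nouveau_blacklist() {
    conf_file="$1"
    for line in "blacklist nouveau" "options nouveau modeset=0"; do
        # -x matches the whole line, -F treats the pattern as a literal string
        grep -qxF "$line" "$conf_file" 2>/dev/null \
            || printf '%s\n' "$line" >> "$conf_file"
    done
}

# Example (run as root against the real file):
#   append_nouveau_blacklist /etc/modprobe.d/blacklist.conf
```

Trying it on a scratch file first is a cheap way to confirm it does what you expect before touching /etc/modprobe.d.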
Regenerate the initramfs so the change takes effect at boot:
sudo update-initramfs -u
Reboot the system (this step is mandatory).
Verify that nouveau is disabled; no output means it was disabled successfully:
lsmod | grep nouveau
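For use in a script, the same check can be turned into an explicit yes/no answer. This is a sketch only; the function name `nouveau_disabled` is an assumption of this example. It reads a module listing on standard input and succeeds when no line starts with the module name nouveau.

```shell
# Hypothetical helper: succeed (exit status 0) when the lsmod-style listing
# on stdin contains no line whose module name is "nouveau".
nouveau_disabled() {
    ! grep -q '^nouveau[[:space:]]'
}

# Run against the live module list when lsmod is available.
if command -v lsmod >/dev/null 2>&1; then
    if lsmod | nouveau_disabled; then
        echo "nouveau is disabled"
    else
        echo "nouveau is still loaded"
    fi
fi
```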
Stop the graphical interface:
sudo service lightdm stop
Reboot into the BIOS setup and disable Secure Boot.
Install the driver you downloaded from the official NVIDIA site:
sudo sh ./NVIDIA-Linux-x86_64-384.90.run
sudo reboot
sudo service lightdm restart
1) Download CUDA from the official site and run the installer:
sudo sh cuda_9.0._linux.run
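**The installer walks you through a series of prompts. The critical one asks whether to install the older driver bundled with the toolkit (361 in this example; the version shown depends on the CUDA release):
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 361.62?
Answer n here. Otherwise the GTX 1080 driver installed earlier is overwritten and you will run into a host of problems.**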
2) Install the missing libraries:
sudo apt-get install freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev
5) Verify that the installation completed:
nvidia-smi
nvcc -V
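Example output (captured from a CUDA 8.0 install; the release line reflects the toolkit version you installed):
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Wed_May__4_21:01:56_CDT_2016
Cuda compilation tools, release 8.0, V8.0.26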
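If nvcc is reported as not found, the toolkit's bin directory is probably not on PATH. The sketch below assumes the toolkit lives under /usr/local/cuda (the same location the cuDNN copy commands in these notes use); the variable name `CUDA_HOME` is just a convenience chosen for this example. Add the lines to ~/.bashrc to make them permanent.

```shell
# Assumed install prefix; adjust if the toolkit was installed elsewhere.
CUDA_HOME=/usr/local/cuda

# Prepend the CUDA directories, keeping any existing value after a colon.
export PATH="$CUDA_HOME/bin${PATH:+:$PATH}"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

The `${VAR:+:$VAR}` form avoids leaving a stray trailing colon when the variable was previously empty.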
6) Build the CUDA samples as a test:
cd '/home/zhou/NVIDIA_CUDA-8.0_Samples'
make
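Install cuDNN: extract the downloaded archive, then copy the header and libraries into the CUDA directory and make them readable:
tar -zxvf cudnn-9.0-linux-x64-v7.0-tar.gz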
sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/
sudo chmod a+r /usr/local/cuda/include/cudnn.h
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
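To confirm which cuDNN version was copied, you can read the version macros from the header. This is a sketch under the assumption that cudnn.h defines `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL` on separate `#define` lines, as the cuDNN 7 header does (later releases moved them to cudnn_version.h). The function name `cudnn_version` is hypothetical.

```shell
# Hypothetical helper: print "major.minor.patch" from a cuDNN header file.
cudnn_version() {
    header="$1"
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { print major "." minor "." patch }
    ' "$header"
}

# Report the installed version if the header is where the copy step put it.
if [ -r /usr/local/cuda/include/cudnn.h ]; then
    cudnn_version /usr/local/cuda/include/cudnn.h
fi
```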
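Verify:
The CUDA samples include a deviceQuery program. Running it prints information about the GPU; if the last line reads Result = PASS, the installation succeeded.
Installing the cuDNN bindings for Torch: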
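The PASS check can be automated by scanning the program's output. In this sketch the function name `device_query_passed` is hypothetical, and the binary path in the usage comment is an assumption about where the samples build places deviceQuery; adjust it to your tree.

```shell
# Hypothetical helper: succeed when the deviceQuery output on stdin
# contains the final "Result = PASS" line.
device_query_passed() {
    grep -q 'Result = PASS'
}

# Example (assumed path to the built sample):
#   ~/NVIDIA_CUDA-8.0_Samples/bin/x86_64/linux/release/deviceQuery \
#       | device_query_passed && echo "CUDA device check passed"
```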
git clone https://github.com/soumith/cudnn.torch.git -b R7 && cd cudnn.torch && luarocks make cudnn-scm-1.rockspec
The build may fail with the following error:
lib/THC/CMakeFiles/THC.dir/build.make:4243: recipe for target 'lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorMathPairwise.cu.o' failed
make[2]: *** [lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorMathPairwise.cu.o] Error 1
/pkgbuild/torch/torch/extra/cutorch/lib/THC/generic/THCTensorMath.cu(393): error: more than one operator "==" matches these operands:
function "operator==(const __half &, const __half &)"
function "operator==(half, half)"
operand types are: half == half
/pkgbuild/torch/torch/extra/cutorch/lib/THC/generic/THCTensorMath.cu(414): error: more than one operator "==" matches these operands:
function "operator==(const __half &, const __half &)"
function "operator==(half, half)"
operand types are: half == half
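Fix: the half-precision operators defined by the newer CUDA headers clash with the ones THC defines itself. Set the following flag to disable the CUDA-provided operators, then rerun the build: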
export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__"
Another error:
no CUDA-capable device is detected at /tmp/luarocks_cutorch-scm-1-5456/cutorch/lib/THC/THCGeneral.c:70
Fix: reinstall the driver.
Test:
th
require "cudnn"
cudnn.benchmark = true
cudnn.fastest = true
cudnn.verbose = true -- set to false by default
Torch ResNet download link:
https://github.com/facebook/fb.resnet.torch/tree/master/pretrained#fine-tuning-on-a-custom-dataset
To train on your own dataset, see:
http://coldmooon.github.io/2017/03/20/train_on_custom_datasets_in_fb.resnet.torch_/
Source code of the main ResNet layers:
https://github.com/gcr/torch-residual-networks/blob/master/residual-layers.lua