Purpose: fix a version mismatch between the CUDA driver and the CUDA runtime.
Reference: https://blog.csdn.net/weixin_36474809/article/details/87804903
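Contents
1. Checking the current versions
1.1 Check the CUDA driver version
1.2 Check the CUDA runtime version in the base environment
1.3 Check the CUDA version in the current virtual environment
2. Installing the matching version
2.1 Installation
2.2 Verification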
name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.835
pciBusID: 0000:83:00.0
totalMemory: 7.92GiB freeMemory: 1.96GiB
2019-02-20 20:17:15.278289: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0, 1, 2, 3
Traceback (most recent call last):
  File "config.py", line 214, in <module>
    args.func(args)
  File "config.py", line 147, in train
    dnnlib.submission.submit.submit_run(submit_config, **train_config)
  File "/home/xxr2019/NVlabs_noise2noise/dnnlib/submission/submit.py", line 296, in submit_run
    run_wrapper(submit_config)
  File "/home/xxr2019/NVlabs_noise2noise/dnnlib/submission/submit.py", line 249, in run_wrapper
    util.call_func_by_name(func_name=submit_config.run_func_name, submit_config=submit_config, **submit_config.run_func_kwargs)
  File "/home/xxr2019/NVlabs_noise2noise/dnnlib/util.py", line 232, in call_func_by_name
    return func_obj(*args, **kwargs)
  File "/home/xxr2019/NVlabs_noise2noise/train.py", line 76, in train
    tfutil.init_tf(config.tf_config)
  File "/home/xxr2019/NVlabs_noise2noise/dnnlib/tflib/tfutil.py", line 77, in init_tf
    create_session(config_dict, force_as_default=True)
  File "/home/xxr2019/NVlabs_noise2noise/dnnlib/tflib/tfutil.py", line 100, in create_session
    session = tf.Session(config=config)
  File "/home/jcx/.conda/envs/n2n/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1551, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/home/jcx/.conda/envs/n2n/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 676, in __init__
    self._session = tf_session.TF_NewSessionRef(self._graph._c_graph, opts)
tensorflow.python.framework.errors_impl.InternalError: cudaGetDevice() failed. Status: CUDA driver version is insufficient for CUDA runtime version
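1. Checking the current versions
1.1 Check the CUDA driver version
The driver version is the "CUDA driver version" that the error message refers to.
Run nvidia-smi; on our server it reports:
NVIDIA-SMI 375.26 Driver Version: 375.26
Alternatively, run cat /proc/driver/nvidia/version:
NVRM version: NVIDIA UNIX x86_64 Kernel Module 375.26 Thu Dec 8 18:36:43 PST 2016
GCC version: gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.3)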
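The two outputs above can also be read programmatically. The following is a minimal sketch, not part of the original workflow: the helper name parse_driver_version is hypothetical, and the regular expression only assumes the two text layouts shown above (other driver releases may print the version differently).

```python
import re


def parse_driver_version(text):
    """Return the driver version as a tuple of ints, e.g. (375, 26), or None."""
    # Accepts both "Driver Version: 375.26" (nvidia-smi header) and
    # "Kernel Module 375.26" (/proc/driver/nvidia/version).
    match = re.search(r"(?:Driver Version:|Kernel Module)\s+(\d+(?:\.\d+)+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


# Sample lines copied from the outputs shown above.
smi_header = "NVIDIA-SMI 375.26 Driver Version: 375.26"
proc_line = ("NVRM version: NVIDIA UNIX x86_64 Kernel Module 375.26 "
             "Thu Dec 8 18:36:43 PST 2016")

print(parse_driver_version(smi_header))
print(parse_driver_version(proc_line))
```

Returning a tuple of integers rather than a string makes later comparisons against a required minimum version straightforward.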
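1.2 Check the CUDA runtime version in the base environment
The runtime version is the "CUDA runtime version" in the error message. Inside a Python environment it is the version of the installed cudatoolkit and cudnn packages; for the base environment it can be read from the system-wide install under /usr/local/cuda: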
(n2n) jcx@smart-dsp:~/Desktop/xxr2019/NVlabs_noise2noise$ cat /usr/local/cuda/version.txt
CUDA Version 8.0.61
(n2n) jcx@smart-dsp:~/Desktop/xxr2019/NVlabs_noise2noise$ cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 6
#define CUDNN_MINOR 0
#define CUDNN_PATCHLEVEL 21
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)

#include "driver_types.h"
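Judging from this output, the versions in the base environment match: CUDA 8.0.61 and cuDNN 6.0.21 are compatible with driver 375.26.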
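The same two files can be parsed in a script. This is a rough sketch under the assumption that version.txt and cudnn.h have the layout shown above; the function names parse_cuda_version and parse_cudnn_version are made up for illustration, and installs that lay these files out differently would need different parsing.

```python
import re


def parse_cuda_version(version_txt):
    """Parse text like 'CUDA Version 8.0.61' into (8, 0, 61); None if absent."""
    match = re.search(r"CUDA Version\s+(\d+)\.(\d+)(?:\.(\d+))?", version_txt)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups() if group is not None)


def parse_cudnn_version(header_text):
    """Read CUDNN_MAJOR/MINOR/PATCHLEVEL from cudnn.h text; None if any is missing."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"#define\s+" + name + r"\s+(\d+)", header_text)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


# Sample text copied from the terminal output above.
version_txt = "CUDA Version 8.0.61"
cudnn_header = """\
#define CUDNN_MAJOR 6
#define CUDNN_MINOR 0
#define CUDNN_PATCHLEVEL 21
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
"""

print(parse_cuda_version(version_txt))
print(parse_cudnn_version(cudnn_header))
```

On a real machine the strings would come from reading /usr/local/cuda/version.txt and /usr/local/cuda/include/cudnn.h instead of the inline samples.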
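1.3 Check the CUDA version in the current virtual environment
Running pip list might be expected to show the CUDA packages, but here it lists none. That does not mean CUDA is missing: cudatoolkit and cudnn are conda packages, not Python packages, so pip does not report them.
Running conda list instead shows the versions below. The environment has CUDA 9.2, the newest release at the time, which requires a newer driver than the 375.26 installed on the server: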
cudatoolkit 9.2 0
cudnn 7.3.1 cuda9.2_0
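To pick these two rows out of a long conda list output, something like the following could be used. It is only a sketch: parse_conda_list is a hypothetical helper, and it assumes the whitespace-separated "name version build" layout shown above.

```python
def parse_conda_list(output, wanted=("cudatoolkit", "cudnn")):
    """Return {package: version} for the wanted packages found in conda list text."""
    versions = {}
    for line in output.splitlines():
        fields = line.split()
        # Skip blank lines and the '#' comment header that conda prints.
        if not fields or fields[0].startswith("#"):
            continue
        if fields[0] in wanted and len(fields) >= 2:
            versions[fields[0]] = fields[1]
    return versions


# Sample rows copied from the conda list output above.
sample = """\
cudatoolkit 9.2 0
cudnn 7.3.1 cuda9.2_0
"""

print(parse_conda_list(sample))
```

In practice the text could be captured with subprocess.run(["conda", "list"], capture_output=True, text=True).stdout, assuming conda is on the PATH and Python 3.7 or later.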
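2. Installing the matching version
2.1 Installation
The CUDA runtime is too new for the installed driver. Rather than upgrade the driver, we install an older CUDA runtime in the environment.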
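The reasoning can be written down as a small check. This is an illustrative sketch only: the table MIN_LINUX_DRIVER and the function driver_is_sufficient are hypothetical names, and the minimum driver versions are transcribed from NVIDIA's CUDA release notes for Linux x86_64 as an assumption of this example, so they should be verified against the official table before being relied on.

```python
# Assumed minimum Linux x86_64 driver version per CUDA toolkit release.
# The 8.0 entry is the value for CUDA 8.0.61; verify all entries against
# NVIDIA's CUDA release notes.
MIN_LINUX_DRIVER = {
    (8, 0): (375, 26),
    (9, 0): (384, 81),
    (9, 1): (390, 46),
    (9, 2): (396, 26),
}


def driver_is_sufficient(driver, cuda):
    """Return True if the driver tuple meets the assumed minimum for the CUDA release."""
    required = MIN_LINUX_DRIVER.get(cuda)
    if required is None:
        raise KeyError("no minimum driver recorded for CUDA %d.%d" % cuda)
    # Tuples compare element by element, so (375, 26) < (396, 26).
    return driver >= required


installed_driver = (375, 26)  # from nvidia-smi on the server above

print(driver_is_sufficient(installed_driver, (9, 2)))  # runtime found in the environment
print(driver_is_sufficient(installed_driver, (8, 0)))  # runtime we are about to install
```

Under these assumed values the check fails for CUDA 9.2 and passes for CUDA 8.0, which is consistent with the error message and with the decision to downgrade the runtime.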
Install CUDA: conda install cudatoolkit=8.0 -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/linux-64/
Downloading and Extracting Packages
cudatoolkit-8.0 | 322.4 MB | ####################################################################### | 100%
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
Install cuDNN: conda install cudnn=7.0.5 -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/linux-64/
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
2.2 Verification
Run conda list again; the versions are now back to ones that match the driver:
cudatoolkit 8.0 3 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
cudnn 7.0.5 cuda8.0_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/ma
In general, TensorFlow should also be reinstalled after a version change like this, to avoid runtime errors:
conda install tensorflow-gpu
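After reinstalling, a quick check from Python can confirm whether TensorFlow now sees the GPU. The sketch below is an addition, not part of the original steps: check_tensorflow_gpu is a hypothetical helper, and it assumes a TensorFlow release that still provides tf.test.is_gpu_available (the TF 1.x API family used by the code in the traceback; the call is deprecated in later versions).

```python
def check_tensorflow_gpu():
    """Return True/False for GPU visibility, or None if TensorFlow cannot be imported."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    try:
        # Can raise the same kind of error as tf.Session if the driver/runtime
        # mismatch is still present, so any failure is reported as "no GPU".
        return bool(tf.test.is_gpu_available())
    except Exception as exc:
        print("GPU check failed:", exc)
        return False


result = check_tensorflow_gpu()
print("TensorFlow GPU available:", result)
```

A result of True suggests the driver and runtime versions now agree; False or an error message means the mismatch, or some other GPU problem, is still present.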