Environment: a laptop with a GeForce RTX 2060 Max-Q GPU and conda already installed.
The conda channels configured in .condarc are as follows:
channels:
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch/
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/r/
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/
  - defaults
show_channel_urls: true
default_channels:
  - https://mirrors.bfsu.edu.cn/anaconda/pkgs/main
  - https://mirrors.bfsu.edu.cn/anaconda/pkgs/r
  - https://mirrors.bfsu.edu.cn/anaconda/pkgs/msys2
custom_channels:
  conda-forge: https://mirrors.bfsu.edu.cn/anaconda/cloud
  msys2: https://mirrors.bfsu.edu.cn/anaconda/cloud
  bioconda: https://mirrors.bfsu.edu.cn/anaconda/cloud
  menpo: https://mirrors.bfsu.edu.cn/anaconda/cloud
  pytorch: https://mirrors.bfsu.edu.cn/anaconda/cloud
  pytorch-lts: https://mirrors.bfsu.edu.cn/anaconda/cloud
  simpleitk: https://mirrors.bfsu.edu.cn/anaconda/cloud
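As a rough illustration of how the custom_channels entries are read: each value is a base location, and the channel name is appended to it to form the channel URL. The sketch below mimics that joining in plain Python. resolve_channel is a hypothetical helper written for this post, not part of conda, and it ignores channel_alias and platform subdirectories.

```python
# Sketch only: mimics how a custom_channels entry maps to a full channel URL.
# resolve_channel is a hypothetical helper, not a conda API.

CUSTOM_CHANNELS = {
    "conda-forge": "https://mirrors.bfsu.edu.cn/anaconda/cloud",
    "pytorch": "https://mirrors.bfsu.edu.cn/anaconda/cloud",
}


def resolve_channel(name, custom_channels=CUSTOM_CHANNELS):
    """Return base location + '/' + channel name, or None if the name is not listed."""
    base = custom_channels.get(name)
    if base is None:
        return None
    return base.rstrip("/") + "/" + name


if __name__ == "__main__":
    print(resolve_channel("pytorch"))
```

With this configuration, a request such as conda install -c pytorch would therefore be served from the bfsu mirror rather than from the upstream host.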
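I. Introduction
There are generally two ways to install tensorflow-gpu.
(1) Download CUDA and cuDNN from the official website, then download and install a matching version of tensorflow. Note that the CUDA, cuDNN, and tensorflow versions must be compatible with one another and with the Python version.
(2) Install through conda, which automatically resolves matching versions of CUDA and cuDNN.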
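To make the version pairing in option (1) concrete, here is a minimal sketch of a lookup table and a check against it. The only entry is the one this post ends up using: tensorflow-gpu 2.0.0 with CUDA 10.0 (the cudart64_100.dll that appears later). The Python range 3.5 to 3.7 is an assumption taken from my reading of TensorFlow's tested build configurations, so verify it there. KNOWN_PAIRINGS and check_pairing are hypothetical names, and the table is deliberately incomplete.

```python
# Sketch only: a tiny lookup of version pairings relevant to this post.
# The table is intentionally incomplete; consult TensorFlow's tested build
# configurations for anything not listed here.

KNOWN_PAIRINGS = {
    # tensorflow-gpu version -> CUDA toolkit and Python versions it is paired with
    "2.0.0": {"cuda": "10.0", "python": ("3.5", "3.6", "3.7")},  # Python range is an assumption
}


def check_pairing(tf_version, cuda_version, python_version):
    """Return True/False for a known tf_version, or None if it is not in the table."""
    entry = KNOWN_PAIRINGS.get(tf_version)
    if entry is None:
        return None
    return cuda_version == entry["cuda"] and python_version in entry["python"]


if __name__ == "__main__":
    print(check_pairing("2.0.0", "10.0", "3.6"))
    print(check_pairing("2.0.0", "8.0", "3.6"))
```

Returning None for unknown versions keeps "not in the table" distinct from "known to be incompatible".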
II. Method
This post uses the conda approach.
1. Open the conda prompt and install with conda install tensorflow-gpu keras. Pay attention to the Python version:
(1) Environment with Python 3.6:
python -V
Python 3.6.2 :: Continuum Analytics, Inc.
conda install tensorflow-gpu keras
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done
## Package Plan ##
environment location: D:\miniconda3\envs\o3
added / updated specs:
- keras
- tensorflow-gpu
The following packages will be downloaded:
package | build
---------------------------|-----------------
cudatoolkit-8.0 | 3 319.9 MB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
cudnn-6.0 | 0 95.1 MB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
keras-2.6.0 | pyhd3eb1b0_0 721 KB defaults
libprotobuf-3.2.0 | vc14_0 9.1 MB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
mkl-2017.0.3 | 0 126.3 MB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
protobuf-3.2.0 | py36_0 459 KB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
werkzeug-0.12.2 | py36_0 435 KB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
zlib-1.2.11 | vc14_0 119 KB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
------------------------------------------------------------
Total: 552.1 MB
The following NEW packages will be INSTALLED:
Here conda resolves tensorflow-gpu to a 1.* version by default. We want a 2.* version, so we pin the version explicitly:
source activate o3
conda install tensorflow-gpu=2.0.0 keras
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... failed with initial frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... done
## Package Plan ##
environment location: D:\miniconda3\envs\o3
added / updated specs:
- keras
- tensorflow-gpu=2.0.0
The following packages will be downloaded:
package | build
---------------------------|-----------------
astor-0.8.1 | py36haa95532_0 47 KB defaults
certifi-2021.5.30 | py36haa95532_0 140 KB defaults
gast-0.2.2 | py36_0 155 KB defaults
grpcio-1.14.1 | py36h5c4b210_0 835 KB defaults
h5py-2.10.0 | py36h5e291fa_0 807 KB defaults
hdf5-1.10.4 | h7ebc959_0 7.9 MB defaults
importlib-metadata-4.8.1 | py36haa95532_0 39 KB defaults
keras-base-2.3.1 | py36_0 486 KB defaults
libprotobuf-3.17.2 | h23ce68f_1 1.9 MB defaults
markdown-3.3.4 | py36haa95532_0 146 KB defaults
mkl-service-2.3.0 | py36h196d8e1_0 45 KB defaults
mkl_fft-1.0.14 | py36h6288b17_0 155 KB defaults
mkl_random-1.0.4 | py36h343c172_0 289 KB defaults
numpy-1.17.0 | py36h19fb1c0_0 25 KB defaults
numpy-base-1.17.0 | py36hc3f5095_0 4.8 MB defaults
protobuf-3.17.2 | py36hd77b12b_0 253 KB defaults
pyreadline-2.1 | py36_0 139 KB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
python-3.6.13 | h3758d61_0 14.6 MB defaults
pyyaml-3.12 | py36_0 122 KB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
scipy-1.5.2 | py36h9439919_0 11.9 MB defaults
setuptools-58.0.4 | py36haa95532_0 776 KB defaults
tensorflow-2.0.0 |gpu_py36hfdd5754_0 4 KB defaults
tensorflow-base-2.0.0 |gpu_py36h390e234_0 96.5 MB defaults
termcolor-1.1.0 | py36_0 8 KB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
typing_extensions-4.1.1 | pyh06a4308_0 28 KB defaults
wrapt-1.12.1 | py36he774522_1 49 KB defaults
zipp-3.6.0 | pyhd3eb1b0_0 17 KB defaults
------------------------------------------------------------
Total: 142.1 MB
The following NEW packages will be INSTALLED:
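This time conda resolves tensorflow-gpu=2.0.0 together with a matching keras package. Installing tensorflow-gpu and keras in the same command lets the solver pick compatible versions, which avoids unnecessary bugs.
After installation, run import tensorflow as tf: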
>>> import tensorflow as tf
Traceback (most recent call last):
File "D:\miniconda3\envs\o3\lib\site-packages\numpy\core\__init__.py", line 17, in <module>
from . import multiarray
File "D:\miniconda3\envs\o3\lib\site-packages\numpy\core\multiarray.py", line 14, in <module>
from . import overrides
File "D:\miniconda3\envs\o3\lib\site-packages\numpy\core\overrides.py", line 7, in <module>
from numpy.core._multiarray_umath import (
ImportError: DLL load failed: 找不到指定的模块。
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "D:\miniconda3\envs\o3\lib\site-packages\tensorflow\__init__.py", line 98, in <module>
from tensorflow_core import *
File "D:\miniconda3\envs\o3\lib\site-packages\tensorflow_core\__init__.py", line 40, in <module>
from tensorflow.python.tools import module_util as _module_util
File "D:\miniconda3\envs\o3\lib\site-packages\tensorflow\__init__.py", line 50, in __getattr__
module = self._load()
File "D:\miniconda3\envs\o3\lib\site-packages\tensorflow\__init__.py", line 44, in _load
module = _importlib.import_module(self.__name__)
File "D:\miniconda3\envs\o3\lib\importlib\__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "D:\miniconda3\envs\o3\lib\site-packages\tensorflow_core\python\__init__.py", line 47, in <module>
import numpy as np
File "D:\miniconda3\envs\o3\lib\site-packages\numpy\__init__.py", line 142, in <module>
from . import core
File "D:\miniconda3\envs\o3\lib\site-packages\numpy\core\__init__.py", line 47, in <module>
raise ImportError(msg)
ImportError:
IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!
Importing the numpy c-extensions failed.
- Try uninstalling and reinstalling numpy.
- If you have already done that, then:
1. Check that you expected to use Python3.6 from "D:\miniconda3\envs\o3\python.exe",
and that you have no directories in your PATH or PYTHONPATH that can
interfere with the Python and numpy version "1.17.0" you're trying to use.
2. If (1) looks fine, you can open a new issue at
https://github.com/numpy/numpy/issues. Please include details on:
- how you installed Python
- how you installed numpy
- your operating system
- whether or not you have multiple versions of Python installed
- if you built from source, your compiler versions and ideally a build log
- If you're working with a numpy git repository, try `git clean -xdf`
(removes all files not under version control) and rebuild numpy.
Note: this error has many possible causes, so please don't comment on
an existing issue about this - open a new one instead.
Original error was: DLL load failed: 找不到指定的模块。
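The import fails with: Importing the numpy c-extensions failed. The underlying message, "DLL load failed: 找不到指定的模块。", is the Windows error "The specified module could not be found."
Check the installed versions of numpy and setuptools: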
numpy 1.17.0 py36h19fb1c0_0 defaults
numpy-base 1.17.0 py36hc3f5095_0 defaults
setuptools 58.0.4 py36haa95532_0 defaults
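The same version check can be done from inside Python without importing numpy itself, which matters here because that import is what currently fails. The sketch below reads the installed package metadata instead. installed_version is a hypothetical helper; it uses importlib.metadata where available (Python 3.8+) and falls back to pkg_resources, which ships with setuptools, on older interpreters such as the Python 3.6 used in this post.

```python
# Sketch only: look up an installed distribution's version from its metadata,
# without importing the package (useful when the import itself is broken).

def installed_version(dist_name):
    """Return the installed version string for dist_name, or None if not installed."""
    try:
        from importlib import metadata  # available on Python 3.8+
    except ImportError:
        metadata = None

    if metadata is not None:
        try:
            return metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            return None

    # Fallback for older interpreters; requires setuptools to be installed.
    import pkg_resources
    try:
        return pkg_resources.get_distribution(dist_name).version
    except pkg_resources.DistributionNotFound:
        return None


if __name__ == "__main__":
    for name in ("numpy", "setuptools", "tensorflow"):
        print(name, installed_version(name))
```

Note that the names here are Python distribution names, which can differ from conda package names (the conda tensorflow-gpu package, for instance, provides the tensorflow distribution).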
Uninstall numpy and setuptools with pip, then reinstall them:
pip uninstall -y numpy
pip uninstall -y setuptools
pip install setuptools
pip install numpy
Try the import again with import tensorflow as tf:
python
Python 3.6.13 |Anaconda, Inc.| (default, Mar 16 2021, 11:37:27) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2022-09-28 12:50:47.213206: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_100.dll'; dlerror: cudart64_100.dll not found
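The import now produces a warning that cudart64_100.dll could not be found, so the next step is to track that file down.
Searching for cudart64_100.dll in Everything shows that the file exists locally, but TensorFlow does not find it.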
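For readers without Everything, the same search can be sketched in Python: walk a directory tree looking for the DLL, then check whether the folder that holds it is on PATH. All three function names are hypothetical helpers. The cudart64_<major><minor>.dll naming is inferred from the cudart64_100.dll / cudatoolkit-10.0 pairing in this post and should be treated as an assumption for other CUDA versions.

```python
# Sketch only: locate a DLL under a directory tree and test whether its folder is on PATH.
import os


def cudart_dll_name(cuda_version):
    """Map '10.0' to 'cudart64_100.dll' (naming assumed from the pairing seen in this post)."""
    major, minor = cuda_version.split(".")[:2]
    return "cudart64_{}{}.dll".format(major, minor)


def find_file(root, filename):
    """Return the sorted list of directories under root that contain filename (case-insensitive)."""
    wanted = filename.lower()
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if any(f.lower() == wanted for f in filenames):
            hits.append(dirpath)
    return sorted(hits)


def on_search_path(directory, path_value=None):
    """True if directory is listed in PATH (or in the PATH-style string passed in)."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    target = os.path.normcase(os.path.normpath(directory))
    entries = [p for p in path_value.split(os.pathsep) if p]
    return any(os.path.normcase(os.path.normpath(p)) == target for p in entries)


if __name__ == "__main__":
    # D:\miniconda3 is the install root seen in the logs; adjust as needed.
    for folder in find_file(r"D:\miniconda3", cudart_dll_name("10.0")):
        print(folder, "on PATH:", on_search_path(folder))
```

A folder that is found but reported as not on PATH matches the situation described here: the file exists, but the loader never looks in that folder.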
Solution 1:
Load the DLL explicitly before importing tensorflow:
python
Python 3.6.13 |Anaconda, Inc.| (default, Mar 16 2021, 11:37:27) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import ctypes
>>>
>>> hllDll = ctypes.WinDLL("D:\\miniconda3\\pkgs\\cudatoolkit-10.0.130-0\\Library\\bin\\cudart64_100.dll")
>>> import tensorflow as tf
2022-09-28 13:06:42.887916: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
>>> import keras
Using TensorFlow backend.
>>>
>>> print(tf.test.is_gpu_available())
2022-09-28 13:07:16.274674: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2022-09-28 13:07:16.277811: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2022-09-28 13:07:16.309960: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce RTX 2060 with Max-Q Design major: 7 minor: 5 memoryClockRate(GHz): 1.185
pciBusID: 0000:01:00.0
2022-09-28 13:07:16.310041: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2022-09-28 13:07:16.310273: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2022-09-28 13:07:17.079220: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-09-28 13:07:17.079331: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2022-09-28 13:07:17.079568: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2022-09-28 13:07:17.080022: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/device:GPU:0 with 4737 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2060 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 7.5)
True
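For use in a script rather than an interactive session, the preload step of Solution 1 can be wrapped so that it does nothing when the file is missing or the platform is not Windows. This is a minimal sketch: preload_dll is a hypothetical helper, and the path is the one from the session above, so it needs adjusting for other installs.

```python
# Sketch only: preload a DLL with ctypes before importing a library that depends on it.
import ctypes
import os
import sys


def preload_dll(dll_path):
    """Load dll_path into the current process; return the handle, or None if skipped."""
    if not sys.platform.startswith("win"):
        return None  # ctypes.WinDLL only exists on Windows
    if not os.path.isfile(dll_path):
        return None  # nothing to load at this path
    return ctypes.WinDLL(dll_path)


if __name__ == "__main__":
    # Path copied from the session above; adjust to the local install.
    handle = preload_dll(
        r"D:\miniconda3\pkgs\cudatoolkit-10.0.130-0\Library\bin\cudart64_100.dll"
    )
    print("preloaded" if handle is not None else "skipped")
    # import tensorflow as tf  # run the import only after the preload
```

Keep the returned handle referenced for as long as the library is needed, and place the call before the first import tensorflow in the program.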
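Solution 2:
Add the folder containing this file to the PATH environment variable (in my attempt I added it to both the user variables and the system variables), then restart: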
python
Python 3.6.13 |Anaconda, Inc.| (default, Mar 16 2021, 11:37:27) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2022-09-28 13:24:26.777140: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
>>> print(tf.test.is_gpu_available())
2022-09-28 13:25:47.229141: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2022-09-28 13:25:47.236312: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2022-09-28 13:25:47.292808: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce RTX 2060 with Max-Q Design major: 7 minor: 5 memoryClockRate(GHz): 1.185
pciBusID: 0000:01:00.0
2022-09-28 13:25:47.292901: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2022-09-28 13:25:47.293171: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2022-09-28 13:25:48.879379: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-09-28 13:25:48.879463: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2022-09-28 13:25:48.879500: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2022-09-28 13:25:48.881238: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/device:GPU:0 with 4737 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2060 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 7.5)
True
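A per-process variant of Solution 2 is to prepend the folder to PATH from inside the script, which avoids editing system settings and restarting. This is a sketch under the assumption that the DLL loader consults PATH in the same way as in the sessions above; on Python 3.8+ for Windows, os.add_dll_directory may also be relevant. prepend_to_path is a hypothetical helper, and the change affects only the current process.

```python
# Sketch only: add a folder to the front of PATH for the current process.
import os


def prepend_to_path(directory, env=None):
    """Put directory first in env['PATH'] unless already listed; return the new PATH string."""
    if env is None:
        env = os.environ
    entries = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    if directory not in entries:
        entries.insert(0, directory)
    env["PATH"] = os.pathsep.join(entries)
    return env["PATH"]


if __name__ == "__main__":
    # Folder taken from the DLL path used in Solution 1; adjust to the local install.
    prepend_to_path(r"D:\miniconda3\pkgs\cudatoolkit-10.0.130-0\Library\bin")
    # import tensorflow as tf  # run the import only after PATH has been updated
```

Passing a plain dict as env makes the helper easy to try out without touching the real environment.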