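Configuring and Testing TensorFlow 2.1 (GPU) on Windows 10

This article covers installing and configuring the GPU build of TensorFlow on Windows 10. Installing the CPU build is much simpler, so it is described first:

Installing the CPU build of TensorFlow

Installing TensorFlow for CPU use is straightforward. Enter the following command in a terminal:
pip install tensorflow
or install a specific version:
pip install tensorflow==2.1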
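Note: starting with TensorFlow 2.1, the tensorflow pip package on Windows and Linux includes GPU support (the same as tensorflow-gpu) and also runs on machines without an NVIDIA GPU. A CPU-only package is available as tensorflow-cpu.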
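
To confirm which TensorFlow packages pip actually installed, a short check along the following lines can help. This is a minimal sketch rather than part of the original setup: the helper name installed_version is made up for illustration, and importlib.metadata requires Python 3.8 or newer (on Python 3.7, pip show tensorflow gives the same information).

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(package):
    """Return the installed version of a pip package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Package names taken from the pip commands used in this article.
    for name in ("tensorflow", "tensorflow-gpu"):
        print(name, "->", installed_version(name))
```

A result of None for a package simply means pip has no record of it in the current environment.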


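If you have a GPU, read on.

Installing and configuring the GPU build of TensorFlow is more involved. The details follow.

Installing and configuring the GPU build of TensorFlow

Required components, in installation order


- Python 3.7.6 (Anaconda)
- tensorflow-gpu==2.1
- CUDA 10.1 (update 2) (CUDA 10.2 is not compatible with TensorFlow 2.1)
- cuDNN 7.6 (for CUDA 10.1)

Everything is already installed on the author's machine, so let's first run a script to see the result.

Here is the code: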
#!/usr/bin/env python
# -*- encoding: utf-8 -*-
'''
@File         :   pt2.py
@Time         :   2020/05/21 23:58:57
@Author       :   艾强云
@Contact      :   [email protected]
@Department   :   SCAU
@Desc         :   None
'''

# Machine learning: a simple neural network
# Imports
import tensorflow as tf 
from tensorflow import keras

import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt
mnist = keras.datasets.fashion_mnist
(X_train, y_train),(X_test,y_test) = mnist.load_data()

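print("训练数据形状," , X_train.shape)  # label text: "training data shape"
print("数据最大值 " , np.max(X_train))  # label text: "maximum pixel value"
print("查看标签数值 " , y_train)  # label text: "label values"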

class_names =['top','trouser','pullover','dress','coat','sandal','shirt','sneaker','bag','ankle boot']  # names of the 10 classes

plt.figure()  # visualize one sample
plt.imshow(X_train[1])  # change the index in [] to display a different image
plt.colorbar()  # add a color bar
plt.show()

# Normalize the dataset: scale the pixel values from 0-255 down to 0-1
X_train = X_train/255.0
X_test = X_test/255.0
plt.figure()  # visualize the same sample after scaling
plt.imshow(X_train[1])
plt.colorbar()
plt.show()

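# The values are now scaled to the range 0 to 1
from tensorflow.keras.models import Sequential  # sequential model container
from tensorflow.keras.layers import Flatten, Dense  # layer types used below


model = Sequential()
model.add(Flatten(input_shape = (28,28)))  # flatten each 28x28 image into a 1-D vector of 784 values
model.add(Dense(128,activation = 'relu'))  # hidden layer with 128 units and ReLU activation; other activations such as sigmoid also work
# If this is unclear, see lesson 3.6 of Andrew Ng's deep learning course
model.add(Dense(10,activation = 'softmax'))  # output layer: 10 classes, so 10 units, with softmax activation

# model.summary() prints the table itself and returns None, so this line also prints "None"
print("查看自己写的代码的总体参数 " , model.summary())  # label text: "overall model parameters"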


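# Compile the model: choose the optimizer, the loss function, and the metric
model.compile(optimizer='adam',loss='sparse_categorical_crossentropy',metrics=['accuracy'])

# The optimizer is Adam and the loss is sparse categorical cross-entropy
model.fit(X_train,y_train,epochs = 10)  # train; epochs is the number of full passes over the training set

test_loss, test_acc = model.evaluate(X_test,y_test)  # evaluate the trained model on the test set
print(test_acc)

# Compute the accuracy again from the predicted labels
from sklearn.metrics import accuracy_score
y_pred = model.predict_classes(X_test)  # available in TF 2.1; removed from later releases

print(accuracy_score(y_test, y_pred))

print(tf.test.is_gpu_available())  # deprecated; prints True when a GPU is usable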
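
The layer sizes in this model determine the number of trainable parameters: a Dense layer with n inputs and m units has n*m weights plus m biases. The following sketch (plain Python; the helper name dense_params is illustrative) works through that arithmetic, and the totals can be compared with the table that model.summary() prints.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units


flat = 28 * 28                      # Flatten: 28x28 image -> 784 values, no parameters
hidden = dense_params(flat, 128)    # Dense(128)
output = dense_params(128, 10)      # Dense(10)

print(flat, hidden, output, hidden + output)
# prints: 784 100480 1290 101770
```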

The script ran successfully on the GPU. The full output:

PS F:\vscode-python-kiton> & D:/ruanjian/anaconda202002/python.exe f:/vscode-python-kiton/数学/TensorFlow/pt2.py
2020-05-29 00:33:44.308803: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
训练数据形状, (60000, 28, 28)
数据最大值  255
查看标签数值  [9 0 0 ... 3 0 5]
2020-05-29 00:33:50.532487: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-05-29 00:33:50.557648: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 2060 computeCapability: 7.5
coreClock: 1.755GHz coreCount: 30 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 312.97GiB/s
2020-05-29 00:33:50.561284: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-05-29 00:33:50.568473: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-05-29 00:33:50.573147: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-05-29 00:33:50.575965: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-05-29 00:33:50.581990: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-05-29 00:33:50.585757: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-05-29 00:33:50.593746: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-05-29 00:33:50.596544: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-05-29 00:33:50.598143: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-05-29 00:33:50.601011: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 2060 computeCapability: 7.5
coreClock: 1.755GHz coreCount: 30 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 312.97GiB/s
2020-05-29 00:33:50.605562: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-05-29 00:33:50.608013: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-05-29 00:33:50.609933: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-05-29 00:33:50.612253: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-05-29 00:33:50.614185: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-05-29 00:33:50.616119: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-05-29 00:33:50.617990: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-05-29 00:33:50.619995: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-05-29 00:33:51.071264: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-05-29 00:33:51.073211: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      0
2020-05-29 00:33:51.074427: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0:   N
2020-05-29 00:33:51.075876: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4604 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5)
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
flatten (Flatten)            (None, 784)               0
_________________________________________________________________
dense (Dense)                (None, 128)               100480
_________________________________________________________________
dense_1 (Dense)              (None, 10)                1290
=================================================================
Total params: 101,770
Trainable params: 101,770
Non-trainable params: 0
_________________________________________________________________
查看自己写的代码的总体参数  None
Train on 60000 samples
Epoch 1/10
2020-05-29 00:33:51.467932: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
60000/60000 [==============================] - 2s 41us/sample - loss: 0.5001 - accuracy: 0.8269
Epoch 2/10
60000/60000 [==============================] - 2s 33us/sample - loss: 0.3769 - accuracy: 0.8647
Epoch 3/10
60000/60000 [==============================] - 2s 34us/sample - loss: 0.3376 - accuracy: 0.8768
Epoch 4/10
60000/60000 [==============================] - 2s 34us/sample - loss: 0.3126 - accuracy: 0.8848
Epoch 5/10
60000/60000 [==============================] - 2s 33us/sample - loss: 0.2953 - accuracy: 0.8902
Epoch 6/10
60000/60000 [==============================] - 2s 33us/sample - loss: 0.2818 - accuracy: 0.8956
Epoch 7/10
60000/60000 [==============================] - 2s 33us/sample - loss: 0.2693 - accuracy: 0.9008
Epoch 8/10
60000/60000 [==============================] - 2s 33us/sample - loss: 0.2591 - accuracy: 0.9031
Epoch 9/10
60000/60000 [==============================] - 2s 33us/sample - loss: 0.2496 - accuracy: 0.9071
Epoch 10/10
60000/60000 [==============================] - 2s 33us/sample - loss: 0.2408 - accuracy: 0.9107
10000/10000 [==============================] - 0s 33us/sample - loss: 0.3349 - accuracy: 0.8823
0.8823
0.8823
WARNING:tensorflow:From f:/vscode-python-kiton/数学/TensorFlow/pt2.py:70: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2020-05-29 00:34:12.313519: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 2060 computeCapability: 7.5
coreClock: 1.755GHz coreCount: 30 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 312.97GiB/s
2020-05-29 00:34:12.318078: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-05-29 00:34:12.319828: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-05-29 00:34:12.322225: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-05-29 00:34:12.324215: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-05-29 00:34:12.325950: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-05-29 00:34:12.327717: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-05-29 00:34:12.329492: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-05-29 00:34:12.332276: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-05-29 00:34:12.333684: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-05-29 00:34:12.335506: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      0
2020-05-29 00:34:12.336623: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0:   N
2020-05-29 00:34:12.337972: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/device:GPU:0 with 4604 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5)
True
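
The deprecation warning in the log suggests tf.config.list_physical_devices('GPU') as the replacement for tf.test.is_gpu_available(). A minimal sketch of that check follows; the wrapper name list_gpus and the ImportError fallback are additions for illustration only.

```python
def list_gpus():
    """Return the names of the GPUs visible to TensorFlow, or [] if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return []
    # Replacement suggested by the deprecation warning (available in TF 2.1 and later).
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = list_gpus()
    print("GPU available:", bool(gpus), gpus)
```

An empty list means TensorFlow found no usable GPU, which usually points back to the CUDA, cuDNN, or PATH steps.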


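The computation ran on the GPU. The installation steps are described next.

Installation and configuration steps

1. Download and install Anaconda (an open-source Python distribution; the latest release at the time of writing ships Python 3.7 and is about 466 MB)

Once the installer has added Anaconda to the PATH environment variable, you can type python in cmd or PowerShell to check that it is installed. For the installation procedure, see the Anaconda installation tutorial listed in the references.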

C:\Users\Administrator>python
Python 3.7.6 (default, Jan  8 2020, 20:23:39) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32

2. Upgrade pip to the latest version (it should be newer than 20.0)

In cmd or PowerShell, run:
python -m pip install --upgrade pip
Then check the pip version:
pip --version

C:\Users\Administrator>pip --version
pip 20.2b1 from D:\ruanjian\anaconda202002\lib\site-packages\pip-20.2b1-py3.7.egg\pip (python 3.7)
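
If you want to check the version requirement in a script rather than by eye, the first line of the pip --version output can be parsed. This is only a sketch: the function name pip_version is hypothetical, and the 20.0 threshold is the one stated in this step.

```python
import re


def pip_version(output):
    """Extract (major, minor) from the line printed by `pip --version`."""
    match = re.match(r"pip (\d+)\.(\d+)", output)
    if match is None:
        raise ValueError("unrecognized pip --version output: %r" % output)
    return int(match.group(1)), int(match.group(2))


# Sample line copied from the output shown in this step.
sample = r"pip 20.2b1 from D:\ruanjian\anaconda202002\lib\site-packages\pip-20.2b1-py3.7.egg\pip (python 3.7)"
print(pip_version(sample))             # prints: (20, 2)
print(pip_version(sample) > (20, 0))   # prints: True
```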

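3. Install the GPU build of TensorFlow; version 2.1 is used here (the author ran into compatibility problems with the 2.2 GPU build)

pip install tensorflow-gpu==2.1
Wait for the download and installation to finish (about 300+ MB). Then start Python from the terminal and check that TensorFlow is installed, along with its version and install path (the output below was captured after everything had been configured). Enter the following commands one at a time:

python
import tensorflow as tf
tf.__version__
tf.__path__

The output shows that the installation is complete: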


>>> import tensorflow as tf
2020-05-29 13:49:55.894087: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
>>> tf.__version__
'2.1.0'
>>> tf.__path__
['C:\\Users\\Administrator\\AppData\\Roaming\\Python\\Python37\\site-packages\\tensorflow']
>>>

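4. Download and install the matching CUDA version; CUDA 10.1 is used here

On the CUDA 10.1 download page, select Windows, x86_64, version 10, and the local installer, then click Download (about 2.5 GB). When the download finishes, run the installer.
To check the installation, run the following command in a terminal (deviceQuery.exe ships in the toolkit's extras\demo_suite folder; if the command is not found, run it from there):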

deviceQuery

C:\Users\Administrator>deviceQuery
deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce RTX 2060"
  CUDA Driver Version / Runtime Version          10.2 / 10.1
  CUDA Capability Major/Minor version number:    7.5
  Total amount of global memory:                 6144 MBytes (6442450944 bytes)
  (30) Multiprocessors, ( 64) CUDA Cores/MP:     1920 CUDA Cores
  GPU Max Clock rate:                            1755 MHz (1.75 GHz)
  Memory Clock rate:                             7001 Mhz
  Memory Bus Width:                              192-bit
  L2 Cache Size:                                 3145728 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               zu bytes
  Total amount of shared memory per block:       zu bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1024
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          zu bytes
  Texture alignment:                             zu bytes
  Concurrent copy and kernel execution:          Yes with 3 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  CUDA Device Driver Mode (TCC or WDDM):         WDDM (Windows Display Driver Model)
  Device supports Unified Addressing (UVA):      Yes
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            No
  Supports MultiDevice Co-op Kernel Launch:      No
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.2, CUDA Runtime Version = 10.1, NumDevs = 1, Device0 = GeForce RTX 2060
Result = PASS

You can also check with the nvcc -V command:

nvcc -V

C:\Users\Administrator>nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:12:52_Pacific_Daylight_Time_2019
Cuda compilation tools, release 10.1, V10.1.243
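
The release number on the last line of this output is what has to match the CUDA version TensorFlow expects (10.1 for TensorFlow 2.1, per the requirements list). As a small sketch, with cuda_release being a made-up helper name, the release can be pulled out of the text like this:

```python
import re


def cuda_release(nvcc_output):
    """Return the toolkit release (e.g. "10.1") from `nvcc -V` output, or None."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


# Sample line copied from the nvcc -V output shown in this step.
sample = "Cuda compilation tools, release 10.1, V10.1.243"
print(cuda_release(sample))            # prints: 10.1
print(cuda_release(sample) == "10.1")  # prints: True
```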

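5. Download cuDNN; cuDNN 7.6 is used here

cuDNN is an add-on library for CUDA and is much simpler to install: extract the downloaded archive and copy its contents into the CUDA directory. The steps are:
On the cuDNN download page, choose cuDNN 7.6.4 for CUDA 10.1.
Then choose the Windows 10 package, cuDNN Library for Windows 10.
After downloading, extract the archive and copy the bin, include, and lib folders into C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1. Windows merges them with the existing folders of the same name.
(Figure: the extracted cuDNN archive)

(Figure: copying the cuDNN files into the CUDA 10.1 directory)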
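
After copying, it can be worth checking that the libraries TensorFlow tries to load are really in place. The sketch below looks for the DLL names that appear in the TensorFlow 2.1 log earlier in this article. It assumes all of them live in the toolkit's bin folder, and the helper name missing_dlls is illustrative.

```python
from pathlib import Path

# DLL names as they appear in the TensorFlow 2.1 log in this article.
EXPECTED_DLLS = [
    "cudart64_101.dll", "cublas64_10.dll", "cufft64_10.dll", "curand64_10.dll",
    "cusolver64_10.dll", "cusparse64_10.dll", "cudnn64_7.dll",
]


def missing_dlls(bin_dir, names=EXPECTED_DLLS):
    """Return the DLL names that are not present as files in bin_dir."""
    bin_dir = Path(bin_dir)
    return [name for name in names if not (bin_dir / name).is_file()]


if __name__ == "__main__":
    # Install directory taken from the copy step above.
    cuda_bin = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin"
    print("missing:", missing_dlls(cuda_bin))
```

If cudnn64_7.dll shows up as missing, the cuDNN bin folder was most likely not merged into the CUDA directory.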

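6. Environment variables: add the CUDA directories to PATH

Make sure the CUDA directories are on the system PATH environment variable; otherwise TensorFlow may fail to load the CUDA libraries.
(Figure: the paths to add to the system PATH environment variable)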
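
One way to check the result is to compare the directories you expect against the current PATH value. The following is a sketch under stated assumptions: missing_from_path is a hypothetical helper, entries are split on ";" as on Windows, and only the toolkit's bin folder is checked because that is the one directory named explicitly in the cuDNN step.

```python
import os


def missing_from_path(required_dirs, path_value=None):
    """Return the directories from required_dirs that are not entries of PATH.

    The comparison ignores case and trailing backslashes.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    entries = {p.strip().rstrip("\\").lower() for p in path_value.split(";") if p.strip()}
    return [d for d in required_dirs if d.rstrip("\\").lower() not in entries]


if __name__ == "__main__":
    # The bin folder under the CUDA 10.1 install directory used in this article.
    cuda_bin = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin"
    print("not on PATH:", missing_from_path([cuda_bin]))
```

Open a new terminal after editing the environment variables, since terminals that are already running keep the old PATH.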

This completes the configuration of the GPU build of TensorFlow on Windows 10.


Testing

1. Check the GPU status

Use the nvidia-smi command (NVSMI) to view the driver version, CUDA version, and other information:

nvidia-smi

C:\Users\Administrator>nvidia-smi
Fri May 29 15:04:54 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 441.22       Driver Version: 441.22       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 2060   WDDM  | 00000000:01:00.0  On |                  N/A |
|  0%   43C    P8     7W / 175W |    880MiB /  6144MiB |      2%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1164    C+G   Insufficient Permissions                   N/A      |
|    0      4296    C+G   C:\Windows\explorer.exe                    N/A      |
|    0      4504    C+G   ...al\Google\Chrome\Application\chrome.exe N/A      |
|    0      5052    C+G   ...t_cw5n1h2txyewy\ShellExperienceHost.exe N/A      |
|    0      5204    C+G   ...dows.Cortana_cw5n1h2txyewy\SearchUI.exe N/A      |
|    0      6440    C+G   ...hell.Experiences.TextInput.InputApp.exe N/A      |
|    0     12332    C+G   ...rogram Files\Microsoft VS Code\Code.exe N/A      |
|    0     14236    C+G   ...oftEdge_8wekyb3d8bbwe\MicrosoftEdge.exe N/A      |
|    0     15860    C+G   ...rosoft Office\root\Office16\WINWORD.EXE N/A      |
+-----------------------------------------------------------------------------+
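
Note that the "CUDA Version" in the nvidia-smi header is the highest CUDA version the installed driver supports, not the toolkit version. That is why it reads 10.2 here while nvcc and deviceQuery report a 10.1 runtime. The sketch below (parse_smi_header is an illustrative name) extracts both numbers from the header line and checks that the driver covers the 10.1 toolkit.

```python
import re


def parse_smi_header(line):
    """Extract the driver version and driver-supported CUDA version from the nvidia-smi header."""
    match = re.search(r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)", line)
    if match is None:
        return None
    return {"driver": match.group(1), "cuda": match.group(2)}


# Header line copied from the nvidia-smi output in this step.
header = "| NVIDIA-SMI 441.22       Driver Version: 441.22       CUDA Version: 10.2     |"
info = parse_smi_header(header)
print(info)  # prints: {'driver': '441.22', 'cuda': '10.2'}

# The driver must support at least the toolkit release reported by nvcc (10.1 here).
driver_cuda = tuple(int(x) for x in info["cuda"].split("."))
print(driver_cuda >= (10, 1))  # prints: True
```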

2. Compare GPU and CPU speed in TensorFlow

The benchmark code is omitted here; the results are as follows:
******************************************************
Comparison at size 1500
******************************************************
----------------------
GPU
-----------------------
Shape: (1500, 1500) Device: /gpu:0
Time taken: 0:00:00.958767
------------------------------
CPU
---------------------------
Shape: (1500, 1500) Device: /cpu:0
Time taken: 0:00:00.601363

******************************************************
Comparison at size 15000
******************************************************
----------------------
GPU
-----------------------
Shape: (15000, 15000) Device: /gpu:0
Time taken: 0:00:02.584088
------------------------------
CPU
---------------------------
Shape: (15000, 15000) Device: /cpu:0
Time taken: 0:00:13.458996



******************************************************
Comparison at size 20000
******************************************************
------------------------------
GPU
---------------------------
1999980200000.0

Shape: (20000, 20000) Device: /gpu:0
Time taken: 0:00:05.113321
----------------------
CPU
-----------------------

2000095700000.0

Shape: (20000, 20000) Device: /cpu:0
Time taken: 0:00:32.852118
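
The benchmark script itself is not included in this article. The following is a minimal sketch of how such a comparison could be written for TensorFlow 2.x; it is not the author's code. It assumes the test multiplies two random n x n matrices and sums the result, which is consistent with the printed shapes and sums. The names timed and matmul_sum are made up for illustration.

```python
import time


def timed(fn, *args):
    """Call fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def matmul_sum(device, n):
    """Multiply two random n x n matrices on the given device and return the sum."""
    import tensorflow as tf  # imported lazily so the timing helper works without it
    with tf.device(device):
        a = tf.random.uniform((n, n))
        b = tf.random.uniform((n, n))
        total = tf.reduce_sum(tf.matmul(a, b))
    return float(total.numpy())  # .numpy() waits for the result before the timer stops


# The timing helper works with any callable:
result, seconds = timed(sum, range(1000))
print(result)  # prints: 499500
```

With TensorFlow installed, timed(matmul_sum, "/gpu:0", 1500) and timed(matmul_sum, "/cpu:0", 1500) give one pair of measurements. The first GPU call also pays one-time startup costs such as loading cuBLAS, which may account for part of the GPU time at small sizes.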
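Judging by the run times, the CPU can be faster at small problem sizes, while the GPU has a clear advantage at large sizes. For a small training set you can therefore skip the GPU and run on the CPU only, by adding the following Python code before importing TensorFlow:
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'  # hide all GPUs so that TensorFlow runs on the CPU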

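Thank you for reading, and good luck getting started with deep learning!

References:

Anaconda installation tutorial

Installing and configuring CUDA and cuDNN on Windows

CUDA and cuDNN

Getting into TensorFlow, step 12: measuring the speed gap between CPU and GPU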
