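My machine runs Windows 10. I installed CUDA 10.0 some time ago but did not install cuDNN. I have created several Anaconda environments (TensorFlow 1.12, PyTorch) that use different CUDA and cuDNN versions. When I installed cuDNN under C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0, programs in some of those environments failed, so I had left cuDNN uninstalled there. My understanding was that when cudatoolkit and cudnn are installed inside a virtual environment, a program does not call the CUDA under C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0. That understanding may be incomplete. I am self-taught, so please leave a comment if you spot a mistake.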
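However, after installing the GPU builds of TensorFlow 2.0 and TensorFlow 2.1, I ran into all kinds of problems. This post deals with the error "Loaded runtime CuDNN library: 7.5.0 but source was compiled with: 7.6.0". To isolate the error I created three virtual environments in Anaconda: TensorFlow 2.1 installed with conda (link), TensorFlow 2.1 installed with pip, and TensorFlow 2.0 installed with conda.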
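The difference is that a conda install also installs cudatoolkit and cudnn into the environment; see my TensorFlow 2.1 installation notes (link). Yet the same error appeared with both the conda install and the pip install, that is, whether or not cudatoolkit and cudnn were installed in the environment.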
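A Baidu search turned up a post that blamed cuDNN and linked to instructions for installing cuDNN correctly, but those instructions were for a different operating system. I used the Windows 10 method instead: download cuDNN, unzip it, and copy the bin, include, and lib folders from the extracted cuda folder into **/Library, overwriting the existing files. In other words, the cuDNN downloaded from the official site replaces the cuDNN in the virtual environment.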
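The copy step described above can be scripted. The following is a minimal sketch, assuming the extracted archive has a top-level cuda folder containing bin, include, and lib as described. The function name copy_cudnn and the paths in the usage comment are hypothetical placeholders, not part of any official installer.

```python
import shutil
from pathlib import Path


def copy_cudnn(extracted_cuda_dir, target_dir, subdirs=("bin", "include", "lib")):
    """Copy the files under bin/include/lib into target_dir, overwriting same-named files.

    Returns the list of destination paths that were written.
    """
    src_root = Path(extracted_cuda_dir)
    dst_root = Path(target_dir)
    copied = []
    for sub in subdirs:
        sub_dir = src_root / sub
        if not sub_dir.is_dir():
            continue  # skip folders that are missing from the archive
        for src in sorted(sub_dir.rglob("*")):
            if src.is_file():
                dst = dst_root / src.relative_to(src_root)
                dst.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(src, dst)  # replaces an existing file of the same name
                copied.append(dst)
    return copied


# Hypothetical usage (both paths are placeholders, adjust before running):
# copy_cudnn(r"D:\Downloads\cudnn\cuda", r"C:\Users\me\Anaconda3\envs\myenv\Library")
```

The function returns the written paths so the result can be checked before running a training script. Copying into a folder under C:\Program Files normally requires an elevated (administrator) prompt.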
Package versions installed in the tf2.0 environment:
cudatoolkit 10.0.130 0
cudnn 7.6.5 cuda10.0_0
Error:
Loaded runtime CuDNN library: 7.5.0 but source was compiled with: 7.6.0.
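The full log line for this error spells out the rule: the major version must match, and from cuDNN 7.0 on the loaded minor version may be equal or higher. The sketch below encodes that rule as I read it from the message. It is not TensorFlow's actual source, and the function names are my own.

```python
import re

# Matches the version pair in the TensorFlow error message quoted above.
_MISMATCH = re.compile(
    r"Loaded runtime CuDNN library: (\d+)\.(\d+)\.(\d+) "
    r"but source was compiled with: (\d+)\.(\d+)\.(\d+)"
)


def parse_cudnn_mismatch(line):
    """Return ((loaded), (compiled)) version tuples, or None if the line does not match."""
    match = _MISMATCH.search(line)
    if match is None:
        return None
    numbers = tuple(int(group) for group in match.groups())
    return numbers[:3], numbers[3:]


def is_compatible(loaded, compiled):
    """Rule as worded in the message: same major; for cuDNN >= 7 the minor may be higher."""
    if loaded[0] != compiled[0]:
        return False
    if loaded[1] == compiled[1]:
        return True
    return compiled[0] >= 7 and loaded[1] > compiled[1]


line = "Loaded runtime CuDNN library: 7.5.0 but source was compiled with: 7.6.0."
loaded, compiled = parse_cudnn_mismatch(line)
print(is_compatible(loaded, compiled))        # False: minor 5 is lower than minor 6
print(is_compatible((7, 6, 5), (7, 6, 0)))    # True: same major and minor
```

Under this reading, any 7.6.x library would satisfy a build compiled against 7.6.0. If the message keeps reporting 7.5.0 after the files are replaced, the replaced files are probably not the ones being loaded.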
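The failed attempts were as follows.
Attempt 1
Downloaded cuDNN 7.5.0 from the official site (NVIDIA cuDNN download page) and replaced the files.
Error: Loaded runtime CuDNN library: 7.5.0 but source was compiled with: 7.6.0.
Attempt 2
Downloaded cuDNN 7.6.0 and replaced the files.
Error: Loaded runtime CuDNN library: 7.5.0 but source was compiled with: 7.6.0.
Attempt 3
Downloaded cuDNN 7.6.4 and replaced the files.
Error: Loaded runtime CuDNN library: 7.5.0 but source was compiled with: 7.6.0.
Attempt 4
Downloaded cuDNN 7.6.5 and replaced the files.
Error: Loaded runtime CuDNN library: 7.5.0 but source was compiled with: 7.6.0.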
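Every attempt failed. I went back to the search result that blamed cuDNN and linked to the correct installation steps, and wondered whether cuDNN still has to be installed under C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0. I copied the downloaded cuDNN files into C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0.
That solved the problem, and the program now runs normally.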
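To confirm which cuDNN version actually sits in a given folder, one option is to read the version macros from cudnn.h. This is a minimal sketch, assuming a cuDNN 7.x header that defines CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL, and assuming the header was copied to the include folder of the CUDA directory mentioned above. The function name is my own.

```python
import re
from pathlib import Path


def cudnn_version_from_header(header_text):
    """Return (major, minor, patch) parsed from cudnn.h text, or None if a macro is missing."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(
            r"^\s*#\s*define\s+" + name + r"\s+(\d+)", header_text, re.MULTILINE
        )
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    # Assumed location: the include folder copied into the CUDA v10.0 directory.
    header = Path(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\include\cudnn.h")
    if header.is_file():
        print(cudnn_version_from_header(header.read_text(errors="ignore")))
    else:
        print("cudnn.h not found at", header)
```

This only reports the header's version. The DLL loaded at run time can still come from a different folder, so the header is a sanity check rather than proof.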
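One thing still puzzles me: none of the cuDNN versions I had installed before was 7.5.0, so why does the error mention 7.5.0 under both TensorFlow 2.1 and TensorFlow 2.0?
Under TensorFlow 2.0 the error is: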
Loaded runtime CuDNN library: 7.5.0 but source was compiled with: 7.6.0
Under TensorFlow 2.1 the error is:
Loaded runtime CuDNN library: 7.5.0 but source was compiled with: 7.6.5
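One possible explanation, which nothing in this post confirms, is that another copy of cudnn64_7.dll (version 7.5.0) sits in a folder that Windows searches before the environment's own copy. The sketch below lists every folder on PATH that contains a given file, in search order. PATH is only one part of the Windows DLL search order, so treat the result as a hint. The function name is my own.

```python
import os


def find_on_path(filename, path_value=None):
    """Return full paths of `filename` in each PATH directory, in PATH order."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for directory in path_value.split(os.pathsep):
        directory = directory.strip().strip('"')
        if not directory:
            continue
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate) and candidate not in hits:
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    # cudnn64_7.dll is the library name that appears in the TensorFlow logs.
    for hit in find_on_path("cudnn64_7.dll"):
        print(hit)
```

If the script prints more than one path, the first one is the most likely candidate for the library that gets loaded.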
Package versions installed in the TensorFlow 2.1 environment:
cudatoolkit 10.1.243 h74a9793_0
cudnn 7.6.5 cuda10.1_0
tensorboard 2.1.0 py3_0
tensorflow 2.1.0 gpu_py37h7db9008_0
tensorflow-base 2.1.0 gpu_py37h55f5790_0
tensorflow-estimator 2.1.0 pyhd54b08b_0
tensorflow-gpu 2.1.0 h0d30ee6_0
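The listing above has the name, version, and build columns printed by conda list. A small sketch for turning such a listing into a dictionary makes it easy to compare the environment's cuDNN version with the version named in the error. The function name is my own and the sample text is taken from the listing above.

```python
def parse_conda_list(text):
    """Map package name -> (version, build) from `conda list` style lines."""
    packages = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the '#' header lines that conda prints.
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        build = fields[2] if len(fields) > 2 else ""
        packages[fields[0]] = (fields[1], build)
    return packages


sample = """\
cudatoolkit 10.1.243 h74a9793_0
cudnn 7.6.5 cuda10.1_0
tensorflow-gpu 2.1.0 h0d30ee6_0
"""
packages = parse_conda_list(sample)
print(packages["cudnn"][0])        # 7.6.5
print(packages["cudatoolkit"][0])  # 10.1.243
```

Here the environment's cudnn package is 7.6.5 while the error reports a loaded runtime of 7.5.0, which suggests that the library being loaded is not the one this environment installed.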
'''
2-5
Suppose we are building a system that ranks custom issue tickets by priority and routes them to the right department.
The model has three inputs and two outputs.
'''
from __future__ import absolute_import, division, print_function
import tensorflow as tf
import tensorflow.keras as keras
import tensorflow.keras.layers as layers
import numpy as np
tf.keras.backend.clear_session()
# Build a network that predicts a ticket's priority and handling department from its title, body, and tags
# Hyperparameters
num_words = 2000
num_tags = 12
num_departments = 4
# Inputs
body_input = keras.Input(shape=(None,), name='body')  # text body of the ticket (text input)
title_input = keras.Input(shape=(None,), name='title')  # ticket title (text input)
tag_input = keras.Input(shape=(num_tags,), name='tag')  # all tags added by the user (categorical input)
# Embedding layers
body_feat = layers.Embedding(num_words, 64)(body_input)
title_feat = layers.Embedding(num_words, 64)(title_input)
# Feature extraction layers
body_feat = layers.LSTM(32)(body_feat)
title_feat = layers.LSTM(128)(title_feat)
features = layers.concatenate([title_feat, body_feat, tag_input])
# Classification layers
priority_pred = layers.Dense(1, activation='sigmoid', name='priority')(features)
department_pred = layers.Dense(num_departments, activation='softmax', name='department')(features)
# Build the model
model = keras.Model(inputs=[body_input, title_input, tag_input],
                    outputs=[priority_pred, department_pred])
model.summary()
# A forward slash avoids the unrecognized '\m' escape sequence in the original path
keras.utils.plot_model(model, 'PNG/multi_model.png', show_shapes=True)
# Generate data and train
# Input data (random)
title_data = np.random.randint(num_words, size=(1280, 10))
body_data = np.random.randint(num_words, size=(1280, 100))
tag_data = np.random.randint(2, size=(1280, num_tags)).astype('float32')
# Labels (random)
priority_label = np.random.random(size=(1280, 1))
department_label = np.random.randint(2, size=(1280, num_departments))
model.compile(optimizer=keras.optimizers.RMSprop(1e-3),
              loss={'priority': 'binary_crossentropy',
                    'department': 'categorical_crossentropy'},
              loss_weights=[1., 0.2])
# Train
history = model.fit(
    {'title': title_data, 'body': body_data, 'tag': tag_data},
    {'priority': priority_label, 'department': department_label},
    batch_size=32,
    epochs=5,
    verbose=2
)
The error output, from the tf21 environment, is as follows:
2020-04-05 22:06:55.035546: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8685 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:65:00.0, compute capability: 7.5)
Model: "model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
title (InputLayer) [(None, None)] 0
__________________________________________________________________________________________________
body (InputLayer) [(None, None)] 0
__________________________________________________________________________________________________
embedding_1 (Embedding) (None, None, 64) 128000 title[0][0]
__________________________________________________________________________________________________
embedding (Embedding) (None, None, 64) 128000 body[0][0]
__________________________________________________________________________________________________
lstm_1 (LSTM) (None, 128) 98816 embedding_1[0][0]
__________________________________________________________________________________________________
lstm (LSTM) (None, 32) 12416 embedding[0][0]
__________________________________________________________________________________________________
tag (InputLayer) [(None, 12)] 0
__________________________________________________________________________________________________
concatenate (Concatenate) (None, 172) 0 lstm_1[0][0]
lstm[0][0]
tag[0][0]
__________________________________________________________________________________________________
priority (Dense) (None, 1) 173 concatenate[0][0]
__________________________________________________________________________________________________
department (Dense) (None, 4) 692 concatenate[0][0]
==================================================================================================
Total params: 368,097
Trainable params: 368,097
Non-trainable params: 0
__________________________________________________________________________________________________
Train on 1280 samples
Epoch 1/5
2020-04-05 22:06:58.449991: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-04-05 22:06:58.790516: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
32/1280 [..............................] - ETA: 2:38
2020-04-05 22:07:00.145234: E tensorflow/stream_executor/cuda/cuda_dnn.cc:319] Loaded runtime CuDNN library: 7.5.0 but source was compiled with: 7.6.5. CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
2020-04-05 22:07:00.146402: W tensorflow/core/framework/op_kernel.cc:1655] OP_REQUIRES failed at cudnn_rnn_ops.cc:1510 : Unknown: Fail to find the dnn implementation.
2020-04-05 22:07:00.146594: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Unknown: Fail to find the dnn implementation.
[[{{node CudnnRNN}}]]
2020-04-05 22:07:00.146865: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Unknown: {{function_node __forward_cudnn_lstm_with_fallback_3921_specialized_for_model_lstm_1_StatefulPartitionedCall_at___inference_distributed_function_5545_specialized_for_model_lstm_1_StatefulPartitionedCall_at___inference_distributed_function_5545}} {{function_node __forward_cudnn_lstm_with_fallback_3921_specialized_for_model_lstm_1_StatefulPartitionedCall_at___inference_distributed_function_5545_specialized_for_model_lstm_1_StatefulPartitionedCall_at___inference_distributed_function_5545}} Fail to find the dnn implementation.
[[{{node CudnnRNN}}]]
[[model/lstm_1/StatefulPartitionedCall]]
[[Reshape_13/_38]]
2020-04-05 22:07:00.147591: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Unknown: {{function_node __forward_cudnn_lstm_with_fallback_3921_specialized_for_model_lstm_1_StatefulPartitionedCall_at___inference_distributed_function_5545_specialized_for_model_lstm_1_StatefulPartitionedCall_at___inference_distributed_function_5545}} {{function_node __forward_cudnn_lstm_with_fallback_3921_specialized_for_model_lstm_1_StatefulPartitionedCall_at___inference_distributed_function_5545_specialized_for_model_lstm_1_StatefulPartitionedCall_at___inference_distributed_function_5545}} Fail to find the dnn implementation.
[[{{node CudnnRNN}}]]
[[model/lstm_1/StatefulPartitionedCall]]
2020-04-05 22:07:00.147594: E tensorflow/stream_executor/cuda/cuda_dnn.cc:319] Loaded runtime CuDNN library: 7.5.0 but source was compiled with: 7.6.5. CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
2020-04-05 22:07:00.149645: W tensorflow/core/framework/op_kernel.cc:1655] OP_REQUIRES failed at cudnn_rnn_ops.cc:1510 : Unknown: Fail to find the dnn implementation.
2020-04-05 22:07:00.149836: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Unknown: Fail to find the dnn implementation.
[[{{node CudnnRNN}}]]
Traceback (most recent call last):
File "D:/ProjectWork/Tensorflow2.0Work/Test01/test2-5.py", line 63, in <module>
epochs=5
File "C:\Users\OFC\Anaconda3\envs\tf21\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 819, in fit
use_multiprocessing=use_multiprocessing)
File "C:\Users\OFC\Anaconda3\envs\tf21\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 342, in fit
total_epochs=epochs)
File "C:\Users\OFC\Anaconda3\envs\tf21\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 128, in run_one_epoch
batch_outs = execution_function(iterator)
File "C:\Users\OFC\Anaconda3\envs\tf21\lib\site-packages\tensorflow_core\python\keras\engine\training_v2_utils.py", line 98, in execution_function
distributed_function(input_fn))
File "C:\Users\OFC\Anaconda3\envs\tf21\lib\site-packages\tensorflow_core\python\eager\def_function.py", line 568, in __call__
result = self._call(*args, **kwds)
File "C:\Users\OFC\Anaconda3\envs\tf21\lib\site-packages\tensorflow_core\python\eager\def_function.py", line 632, in _call
return self._stateless_fn(*args, **kwds)
File "C:\Users\OFC\Anaconda3\envs\tf21\lib\site-packages\tensorflow_core\python\eager\function.py", line 2363, in __call__
return graph_function._filtered_call(args, kwargs) # pylint: disable=protected-access
File "C:\Users\OFC\Anaconda3\envs\tf21\lib\site-packages\tensorflow_core\python\eager\function.py", line 1611, in _filtered_call
self.captured_inputs)
File "C:\Users\OFC\Anaconda3\envs\tf21\lib\site-packages\tensorflow_core\python\eager\function.py", line 1692, in _call_flat
ctx, args, cancellation_manager=cancellation_manager))
File "C:\Users\OFC\Anaconda3\envs\tf21\lib\site-packages\tensorflow_core\python\eager\function.py", line 545, in call
ctx=ctx)
File "C:\Users\OFC\Anaconda3\envs\tf21\lib\site-packages\tensorflow_core\python\eager\execute.py", line 67, in quick_execute
six.raise_from(core._status_to_exception(e.code, message), None)
File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.UnknownError: [_Derived_] Fail to find the dnn implementation.
[[{{node CudnnRNN}}]]
[[model/lstm_1/StatefulPartitionedCall]]
[[Reshape_13/_38]] [Op:__inference_distributed_function_5545]
Function call stack:
distributed_function -> distributed_function -> distributed_function
Process finished with exit code 1
For comparison, the output of a successful run of the same script in the tf2.0 environment:
C:\Users\OFC\Anaconda3\envs\tf2.0\python.exe D:/ProjectWork/Tensorflow2.0Work/Test01/test2-5.py
2020-04-07 16:35:28.734070: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-04-07 16:35:30.476916: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-04-07 16:35:30.574087: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.56
pciBusID: 0000:65:00.0
2020-04-07 16:35:30.574296: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2020-04-07 16:35:30.574989: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-04-07 16:35:30.576608: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2020-04-07 16:35:30.579386: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.56
pciBusID: 0000:65:00.0
2020-04-07 16:35:30.579582: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2020-04-07 16:35:30.580049: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-04-07 16:35:35.441919: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-04-07 16:35:35.442098: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2020-04-07 16:35:35.442202: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2020-04-07 16:35:35.449190: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8686 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:65:00.0, compute capability: 7.5)
Model: "model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
title (InputLayer) [(None, None)] 0
__________________________________________________________________________________________________
body (InputLayer) [(None, None)] 0
__________________________________________________________________________________________________
embedding_1 (Embedding) (None, None, 64) 128000 title[0][0]
__________________________________________________________________________________________________
embedding (Embedding) (None, None, 64) 128000 body[0][0]
__________________________________________________________________________________________________
lstm_1 (LSTM) (None, 128) 98816 embedding_1[0][0]
__________________________________________________________________________________________________
lstm (LSTM) (None, 32) 12416 embedding[0][0]
__________________________________________________________________________________________________
tag (InputLayer) [(None, 12)] 0
__________________________________________________________________________________________________
concatenate (Concatenate) (None, 172) 0 lstm_1[0][0]
lstm[0][0]
tag[0][0]
__________________________________________________________________________________________________
priority (Dense) (None, 1) 173 concatenate[0][0]
__________________________________________________________________________________________________
department (Dense) (None, 4) 692 concatenate[0][0]
==================================================================================================
Total params: 368,097
Trainable params: 368,097
Non-trainable params: 0
__________________________________________________________________________________________________
Train on 1280 samples
Epoch 1/5
2020-04-07 16:35:40.258810: W tensorflow/core/grappler/optimizers/implementation_selector.cc:310] Skipping optimization due to error while loading function libraries: Invalid argument: Functions '__inference___backward_standard_lstm_5208_5679' and '__inference___backward_cudnn_lstm_with_fallback_4305_4485_specialized_for_StatefulPartitionedCall_at___inference_distributed_function_6478' both implement 'lstm_5df00234-27a5-40dc-91b1-c1e623edbf3b' but their signatures do not match.
2020-04-07 16:35:40.483931: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2020-04-07 16:35:40.861842: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
1280/1280 - 6s - loss: 1.2906 - priority_loss: 0.7043 - department_loss: 2.9314
Epoch 2/5
1280/1280 - 1s - loss: 1.2722 - priority_loss: 0.7030 - department_loss: 2.8464
Epoch 3/5
1280/1280 - 1s - loss: 1.2627 - priority_loss: 0.7017 - department_loss: 2.8048
Epoch 4/5
1280/1280 - 1s - loss: 1.2570 - priority_loss: 0.7005 - department_loss: 2.7825
Epoch 5/5
1280/1280 - 1s - loss: 1.2536 - priority_loss: 0.6997 - department_loss: 2.7694
Process finished with exit code 0
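The two logs open different CUDA libraries: the tf21 run loads cublas64_10.dll, while the tf2.0 run loads cudart64_100.dll and cublas64_100.dll. That appears consistent with the cudatoolkit 10.1.243 and 10.0.130 packages listed for those environments. The sketch below pulls the "Successfully opened dynamic library" entries out of a saved log so two runs can be compared quickly. The function name is my own and the sample lines are shortened from the logs above.

```python
import re

# Library names reported by TensorFlow's dso_loader log lines.
_DSO = re.compile(r"Successfully opened dynamic library (\S+)")


def loaded_libraries(log_text):
    """Return the dynamic libraries named in the log, in first-seen order, without repeats."""
    seen = []
    for name in _DSO.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


sample_log = """\
2020-04-07 16:35:28.734070: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-04-07 16:35:40.483931: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2020-04-07 16:35:40.861842: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
"""
print(loaded_libraries(sample_log))
# ['cudart64_100.dll', 'cublas64_100.dll', 'cudnn64_7.dll']
```

The log records only the library name, not the folder it was loaded from, so this comparison shows which CUDA generation a run used but not which copy of cudnn64_7.dll was picked up.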