[Solved] TensorFlow 2.0 Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR

Here is the error output:

2020-04-27 21:47:49.479312: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-04-27 21:47:50.732238: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
2020-04-27 21:47:50.733065: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
Traceback (most recent call last):
  File "", line 1, in
  File "D:\Tool\Software\JetBrain\Pycharm\PyCharm 2019.2\plugins\python\helpers\pydev_pydev_bundle\pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars) # execute the script
  File "D:\Tool\Software\JetBrain\Pycharm\PyCharm 2019.2\plugins\python\helpers\pydev_pydev_imps_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "D:/WorkSpace/Python/Taobao-live-product-identification/Code/train.py", line 330, in
    training()
  File "D:/WorkSpace/Python/Taobao-live-product-identification/Code/train.py", line 231, in training
    [validation_image_batch, validation_labels_batch, validation_bbox_batch])
  File "D:/WorkSpace/Python/Taobao-live-product-identification/Code/train.py", line 149, in network_learn_validation
    logits, bbox, global_pool = self.basic_resnet_model(input_image)
  File "D:\Tool\BasicSoftware\Anaconda3\envs\tf-2\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py", line 822, in call
    outputs = self.call(cast_inputs, *args, **kwargs)
  File "D:\Tool\BasicSoftware\Anaconda3\envs\tf-2\lib\site-packages\tensorflow_core\python\keras\engine\network.py", line 717, in call
    convert_kwargs_to_constants=base_layer_utils.call_context().saving)
  File "D:\Tool\BasicSoftware\Anaconda3\envs\tf-2\lib\site-packages\tensorflow_core\python\keras\engine\network.py", line 891, in _run_internal_graph
    output_tensors = layer(computed_tensors, **kwargs)
  File "D:\Tool\BasicSoftware\Anaconda3\envs\tf-2\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py", line 822, in call
    outputs = self.call(cast_inputs, *args, **kwargs)
  File "D:\Tool\BasicSoftware\Anaconda3\envs\tf-2\lib\site-packages\tensorflow_core\python\keras\layers\convolutional.py", line 209, in call
    outputs = self._convolution_op(inputs, self.kernel)
  File "D:\Tool\BasicSoftware\Anaconda3\envs\tf-2\lib\site-packages\tensorflow_core\python\ops\nn_ops.py", line 1135, in _call_
    return self.conv_op(inp, filter)
  File "D:\Tool\BasicSoftware\Anaconda3\envs\tf-2\lib\site-packages\tensorflow_core\python\ops\nn_ops.py", line 640, in _call_
    return self.call(inp, filter)
  File "D:\Tool\BasicSoftware\Anaconda3\envs\tf-2\lib\site-packages\tensorflow_core\python\ops\nn_ops.py", line 239, in _call_
    name=self.name)
  File "D:\Tool\BasicSoftware\Anaconda3\envs\tf-2\lib\site-packages\tensorflow_core\python\ops\nn_ops.py", line 2011, in conv2d
    name=name)
  File "D:\Tool\BasicSoftware\Anaconda3\envs\tf-2\lib\site-packages\tensorflow_core\python\ops\gen_nn_ops.py", line 933, in conv2d
    data_format=data_format, dilations=dilations, name=name, ctx=_ctx)
  File "D:\Tool\BasicSoftware\Anaconda3\envs\tf-2\lib\site-packages\tensorflow_core\python\ops\gen_nn_ops.py", line 1022, in conv2d_eager_fallback
    ctx=ctx, name=name)
  File "D:\Tool\BasicSoftware\Anaconda3\envs\tf-2\lib\site-packages\tensorflow_core\python\eager\execute.py", line 67, in quick_execute
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "", line 3, in raise_from
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above. [Op:Conv2D]

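I searched for a long time without finding a precise answer; most posts just blamed mismatched versions. Then I found a TensorFlow issue
in which someone suggested config.gpu_options.allow_growth = True, but that approach is for TensorFlow 1.x.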
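
For reference, this is roughly what the TensorFlow 1.x style setting from that issue looks like. It is only a sketch: the helper name make_growth_config is made up for illustration, and it goes through tf.compat.v1 on the assumption that you want it to import under TensorFlow 2.x as well.

```python
import tensorflow as tf


def make_growth_config():
    """Build a TF 1.x style session config that allocates GPU memory on demand.

    make_growth_config is a hypothetical helper name, not part of TensorFlow.
    """
    config = tf.compat.v1.ConfigProto()
    # Grow the GPU allocation as needed instead of reserving it all up front.
    config.gpu_options.allow_growth = True
    return config


config = make_growth_config()
# In TF 1.x style code the config is then handed to the session, e.g.:
#   sess = tf.compat.v1.Session(config=config)
```

In TensorFlow 2.x there is normally no session to pass this config to, which is why a different call is needed.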

So I went to the official site to look for code with a similar effect.

Solution:

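import tensorflow as tf

# Enable memory growth on every visible GPU (does nothing if no GPU is found).
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)
# Your code goes below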

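Your own code must come after the call to tf.config.experimental.set_memory_growth(...). Memory growth has to be set before the GPU is initialized, so calling it after your code has already used the GPU raises an error.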
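
The ordering requirement above can be made explicit with a small helper that is called once at the very top of the script. This is a minimal sketch, not the only way to do it: enable_memory_growth is a made-up name, and it assumes a TensorFlow version that provides tf.config.list_physical_devices (if yours does not, tf.config.experimental.list_physical_devices is the older spelling).

```python
import tensorflow as tf


def enable_memory_growth():
    """Turn on memory growth for each visible GPU.

    Hypothetical helper; returns the number of GPUs that were configured.
    """
    configured = 0
    for gpu in tf.config.list_physical_devices('GPU'):
        try:
            tf.config.experimental.set_memory_growth(gpu, True)
            configured += 1
        except RuntimeError as err:
            # Raised when the GPU was already initialized before this call.
            print(f"Could not set memory growth for {gpu.name}: {err}")
    return configured


# Call this before building any model or running any op.
num_configured = enable_memory_growth()
print(f"Memory growth enabled on {num_configured} GPU(s)")
```

On a machine without a GPU the loop simply does not run and the helper returns 0, so the same script still works on CPU.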
