Error: Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found Ignore above

A problem I ran into while using TensorFlow (first-time installation).

Background

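My machine has an RTX 4070 GPU, and the NVIDIA driver reports CUDA version 12.3. Because I wanted to run my code on the newest versions, I downloaded CUDA Toolkit 12.3 and the matching cuDNN 8.9.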
I then ran the following tensorflow-gpu test code in PyCharm:

import tensorflow as tf

gpu_device_name = tf.test.gpu_device_name()
print(gpu_device_name)

print(tf.test.is_gpu_available())

# List the GPU devices visible on the local machine
local_device_protos = tf.config.list_physical_devices('GPU')
# Print the list
print(local_device_protos)

It produced this error:

Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found Ignore above

Analysis

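The root cause is the CUDA Toolkit installation. First confirm whether the DLL actually exists in the installation directory by looking in your own install path (for example, C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin).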
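There are two cases. If the file is there, the environment variables are at fault: check whether NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin has been added to the system variables. If the file is not there, the wrong version of the CUDA Toolkit is installed. For reference, these are the system environment variable settings that I verified to work:
[Screenshot: system environment variable settings]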
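
The two checks above can also be scripted. The following is a minimal sketch using only the Python standard library; the helper name `find_dll` is hypothetical, and the toolkit path is simply the example location mentioned above, so adjust it to your own installation.

```python
import os
from pathlib import Path


def find_dll(dll_name, path_value=None):
    """Return every directory on the search path that contains dll_name.

    path_value defaults to the PATH of the current process, which is
    what an interpreter launched from PyCharm would inherit.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for entry in path_value.split(os.pathsep):
        entry = entry.strip().strip('"')  # tolerate quoted PATH entries
        if entry and (Path(entry) / dll_name).is_file():
            hits.append(entry)
    return hits


if __name__ == "__main__":
    dll = "cudart64_110.dll"
    # Example install location from the text; an assumption, not a fixed path.
    cuda_bin = Path(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin")
    print("present in toolkit bin:", (cuda_bin / dll).is_file())
    print("PATH directories containing it:", find_dll(dll))
```

If the first line prints False, the installed toolkit does not ship that DLL at all (the version case). If it prints True but the list is empty, the bin directory is missing from PATH (the environment variable case).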

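In my case the problem was the installed version. Although the CUDA version reported by the NVIDIA driver is 12.3, the highest CUDA Toolkit version that tensorflow-gpu 2.10.0 supports is 11.2, according to the version compatibility table on the official TensorFlow site. The DLL name in the error points the same way: cudart64_110.dll is the CUDA 11.x runtime library, whereas a 12.x toolkit ships cudart64_12.dll.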
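In addition, TensorFlow releases after 2.10 no longer provide GPU support on native Windows (GPU use there requires WSL2), so for now I recommend staying on tensorflow-gpu 2.10.0 to avoid all sorts of trouble.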
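
To see which CUDA and cuDNN versions an installed TensorFlow was built against, rather than reading them off the table, something like the sketch below may help. It relies on `tf.sysconfig.get_build_info()`; the key names used here (`is_cuda_build`, `cuda_version`, `cudnn_version`) and the format of their values are assumptions that can differ between platforms and releases, so the helper `summarize_build` (a hypothetical name) reads them defensively.

```python
def summarize_build(build_info):
    """Format the CUDA-related entries of a TensorFlow build-info mapping."""
    # Key names are assumptions; .get() keeps the helper from raising
    # if a given release or platform omits one of them.
    if not build_info.get("is_cuda_build", False):
        return "CPU-only build: no CUDA libraries will be loaded"
    cuda = build_info.get("cuda_version", "unknown")
    cudnn = build_info.get("cudnn_version", "unknown")
    return f"built against CUDA {cuda}, cuDNN {cudnn}"


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this interpreter")
    else:
        print("TensorFlow", tf.__version__)
        print(summarize_build(tf.sysconfig.get_build_info()))
```

The reported versions are the ones the installed CUDA Toolkit and cuDNN need to match, regardless of the CUDA version the driver reports.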

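Solution

Even with an RTX 4070 and a driver that reports CUDA 12.3, using tensorflow-gpu requires installing the older CUDA Toolkit (11.2) and cuDNN (8.1); otherwise it will not run. PyTorch, by comparison, does not have this problem.

Running the earlier test code again, the RTX 4070 is now detected: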

2023-11-21 11:10:22.602796: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
/device:GPU:0
True
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
2023-11-21 11:10:23.064863: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1616] Created device /device:GPU:0 with 9458 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 4070, pci bus id: 0000:01:00.0, compute capability: 8.9
WARNING:tensorflow:From C:\Users\24762\PycharmProjects\pythonProject2\main.py:6: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2023-11-21 11:10:23.065778: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1616] Created device /device:GPU:0 with 9458 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 4070, pci bus id: 0000:01:00.0, compute capability: 8.9

Process finished with exit code 0
