2022-01-21 Checking a GPU's Compute Capability

Background

My machine has a GeForce GT 710 graphics card with driver version 792 installed. The GPU info reported support up to CUDA 11.4, so I installed CUDA 11.4.

First, I verified that the CUDA environment itself was installed successfully.
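One way to sketch that check from Python (an assumption on my part; the original post does not say how the verification was done) is to look for the `nvidia-smi` tool that ships with the driver:

```python
import shutil
import subprocess

def cuda_driver_visible():
    """Return nvidia-smi's output if the NVIDIA driver tool is on PATH, else None."""
    if shutil.which("nvidia-smi") is None:
        return None
    # nvidia-smi prints the driver version and the highest CUDA version it supports
    return subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout

out = cuda_driver_visible()
print("driver detected" if out else "nvidia-smi not found")
```

On a correctly installed system, the header of the `nvidia-smi` output shows the driver version and the CUDA version it supports.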

However, after installing paddle, running Paddle's verification function raised an error:

import paddle
paddle.utils.run_check()
Running verify PaddlePaddle program ...
W0121 17:01:03.469723 11728 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 3.5, Driver API Version: 11.4, Runtime API Version: 11.2
W0121 17:01:03.770958 11728 device_context.cc:465] device: 0, cuDNN Version: 8.2.
W0121 17:01:08.866175 11728 operator.cc:248] uniform_random raises an exception class thrust::system::system_error, parallel_for failed: cudaErrorNoKernelImageForDevice: no kernel image is available for execution on the device
Traceback (most recent call last):
  File "", line 1, in 
  File "D:\Program Files\Python\Python39\lib\site-packages\paddle\utils\install_check.py", line 196, in run_check
    _run_static_single(use_cuda)
  File "D:\Program Files\Python\Python39\lib\site-packages\paddle\utils\install_check.py", line 124, in _run_static_single
    exe.run(startup_prog)
  File "D:\Program Files\Python\Python39\lib\site-packages\paddle\fluid\executor.py", line 1262, in run
    six.reraise(*sys.exc_info())
  File "D:\Program Files\Python\Python39\lib\site-packages\six.py", line 719, in reraise
    raise value
  File "D:\Program Files\Python\Python39\lib\site-packages\paddle\fluid\executor.py", line 1250, in run
    return self._run_impl(
  File "D:\Program Files\Python\Python39\lib\site-packages\paddle\fluid\executor.py", line 1394, in _run_impl
    return self._run_program(
  File "D:\Program Files\Python\Python39\lib\site-packages\paddle\fluid\executor.py", line 1491, in _run_program
    self._default_executor.run(program.desc, scope, 0, True, True,
RuntimeError: parallel_for failed: cudaErrorNoKernelImageForDevice: no kernel image is available for execution on the device

The explanations I found say the GPU's compute capability is too low: the prebuilt Paddle binary does not ship kernels compiled for compute capability 3.5, which is exactly what cudaErrorNoKernelImageForDevice ("no kernel image is available for execution on the device") means.

NVIDIA publishes a table of each card's compute capability: https://developer.nvidia.com/zh-cn/cuda-gpus#compute

Alternatively, a Python script can query the compute capability. With TensorFlow, it appears in the physical_device_desc field of each GPU entry:

import os

# TF_CPP_MIN_LOG_LEVEL must be set before TensorFlow is imported for it to take effect
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "99"

from tensorflow.python.client import device_lib

if __name__ == "__main__":
    # Each GPU entry's physical_device_desc includes "compute capability: X.Y"
    print(device_lib.list_local_devices())

PyTorch can also exercise the GPU directly. The script below (a standard DataParallel example) runs random batches through a small linear model:

import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader

print(torch.cuda.is_available())

# Model and data dimensions
input_size = 5
output_size = 2

batch_size = 30
data_size = 100

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

class RandomDataset(Dataset):

    def __init__(self, size, length):
        self.len = length
        self.data = torch.randn(length, size)

    def __getitem__(self, index):
        return self.data[index]

    def __len__(self):
        return self.len

rand_loader = DataLoader(dataset=RandomDataset(input_size, data_size),
                         batch_size=batch_size, shuffle=True)

class Model(nn.Module):
    # Our model

    def __init__(self, input_size, output_size):
        super(Model, self).__init__()
        self.fc = nn.Linear(input_size, output_size)

    def forward(self, input):
        output = self.fc(input)
        print("\tIn Model: input size", input.size(),
              "output size", output.size())

        return output
model = Model(input_size, output_size)
if torch.cuda.device_count() >= 1:
    print("Let's use", torch.cuda.device_count(), "GPUs!")
    # dim = 0: a [30, xxx] batch is split as [10, ...] x 3 across 3 GPUs
    model = nn.DataParallel(model)

model.to(device)

for data in rand_loader:
    input = data.to(device)
    output = model(input)
    print("Outside: input size", input.size(),
          "output_size", output.size())
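Beyond running a model, PyTorch can also report the compute capability of the installed card directly via torch.cuda.get_device_capability. A minimal sketch (on a GT 710 this should report 3.5, matching the warning in the Paddle log above):

```python
import torch

def compute_capability(device_index=0):
    """Return (major, minor) compute capability, or None when CUDA is unavailable."""
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(device_index)

cap = compute_capability()
if cap is None:
    print("CUDA is not available on this machine")
else:
    print(f"GPU 0 compute capability: {cap[0]}.{cap[1]}")
```

This avoids pulling in TensorFlow just to read one number.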
