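I won't go over environment setup here, since plenty of material already covers it. My environment is:
GPU: Nvidia GeForce 920MX (very low-end)
CUDA: 9.2
IDE: VS2015
OS: Windows 10 64-bit
First, use CUDA's device query API to retrieve information about the GPU. The code is as follows: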
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>
int main() {
    int deviceCount = 0;
    cudaGetDeviceCount(&deviceCount);

    for (int dev = 0; dev < deviceCount; dev++)
    {
        int driver_version(0), runtime_version(0);
        cudaDeviceProp deviceProp;
        cudaGetDeviceProperties(&deviceProp, dev);

        // Compute capability 9999.9999 is the placeholder that older CUDA
        // samples check for when no real CUDA-capable device is present.
        if (dev == 0)
            if (deviceProp.minor == 9999 && deviceProp.major == 9999)
                printf("\n");
        printf("\nDevice%d:\"%s\"\n", dev, deviceProp.name);

        // Both version calls return one integer encoded as 1000 * major + 10 * minor.
        cudaDriverGetVersion(&driver_version);
        printf("CUDA Driver Version: %d.%d\n",
            driver_version / 1000, (driver_version % 1000) / 10);
        cudaRuntimeGetVersion(&runtime_version);
        printf("CUDA Runtime Version: %d.%d\n",
            runtime_version / 1000, (runtime_version % 1000) / 10);
        printf("Device Prop: %d.%d\n",
            deviceProp.major, deviceProp.minor);

        // Memory sizes, the pitch, and the alignment are size_t, so they are
        // printed with %zu rather than %u.
        printf("Total amount of Global Memory: %zu bytes\n",
            deviceProp.totalGlobalMem);
        printf("Number of SMs: %d\n",
            deviceProp.multiProcessorCount);
        printf("Total amount of Constant Memory: %zu bytes\n",
            deviceProp.totalConstMem);
        printf("Total amount of Shared Memory per block: %zu bytes\n",
            deviceProp.sharedMemPerBlock);
        printf("Total number of registers available per block: %d\n",
            deviceProp.regsPerBlock);
        printf("Warp size: %d\n",
            deviceProp.warpSize);
        printf("Maximum number of threads per SM: %d\n",
            deviceProp.maxThreadsPerMultiProcessor);
        printf("Maximum number of threads per block: %d\n",
            deviceProp.maxThreadsPerBlock);
        printf("Maximum size of each dimension of a block: %d x %d x %d\n",
            deviceProp.maxThreadsDim[0], deviceProp.maxThreadsDim[1], deviceProp.maxThreadsDim[2]);
        printf("Maximum size of each dimension of a grid: %d x %d x %d\n",
            deviceProp.maxGridSize[0], deviceProp.maxGridSize[1], deviceProp.maxGridSize[2]);
        printf("Maximum memory pitch: %zu bytes\n",
            deviceProp.memPitch);
        printf("Texture alignment: %zu bytes\n",
            deviceProp.texturePitchAlignment);

        // clockRate and memoryClockRate are reported in kHz.
        printf("Clock rate: %.2f GHz\n",
            deviceProp.clockRate * 1e-6f);
        printf("Memory Clock rate: %.0f MHz\n",
            deviceProp.memoryClockRate * 1e-3f);
        printf("Memory Bus Width: %d-bit\n",
            deviceProp.memoryBusWidth);
    }

    system("pause");  // Windows only: keeps the console window open
    return 0;
}
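The two version calls in the listing above each fill in a single packed integer, which the `/ 1000` and `% 1000 / 10` arithmetic then splits into a major and a minor part. The sketch below isolates that step in plain C++ so it can be tried without a GPU. It assumes the encoding 1000 × major + 10 × minor; `CudaVersion` and `decodeCudaVersion` are hypothetical names introduced here for illustration and are not part of the CUDA API.

```cpp
#include <cassert>

// Hypothetical helper type (not part of the CUDA API): holds the two parts
// of a version number such as 9.2.
struct CudaVersion {
    int major_version;
    int minor_version;
};

// Splits a packed version integer into its parts.
// Assumption: the value is encoded as 1000 * major + 10 * minor.
CudaVersion decodeCudaVersion(int packed) {
    assert(packed >= 0);  // a packed version is expected to be non-negative
    CudaVersion version;
    version.major_version = packed / 1000;
    version.minor_version = (packed % 1000) / 10;
    return version;
}
```

Under that assumption, `decodeCudaVersion(9020)` gives major 9 and minor 2, and `decodeCudaVersion(11000)` gives major 11 and minor 0.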
Remember to switch the build platform to x64 when compiling.
The output is as follows:
Device0:"GeForce 920MX"
CUDA Driver Version: 11.0
CUDA Runtime Version: 9.2
Device Prop: 5.0
Total amount of Global Memory: 2147483648 bytes
Number of SMs: 2
Total amount of Constant Memory: 65536 bytes
Total amount of Shared Memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per SM: 2048
Maximum number of threads per block: 1024
Maximum size of each dimension of a block: 1024 x 1024 x 64
Maximum size of each dimension of a grid: 2147483647 x 65535 x 65535
Maximum memory pitch: 2147483647 bytes
Texture alignment: 32 bytes
Clock rate: 0.99 GHz
Memory Clock rate: 1001 MHz
Memory Bus Width: 64-bit
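Annotated, the first few fields read:
Device0: "GeForce 920MX" # GPU model
CUDA Driver Version: 11.0 # Latest CUDA version supported by the installed display driver
CUDA Runtime Version: 9.2 # CUDA toolkit (runtime) version
Device Prop: 5.0 # Compute capability of the GPU
Total amount of Global Memory: 2147483648 bytes # Video memory size (2 GiB)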
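The memory fields are printed as raw byte counts, which are hard to read at a glance. As a minimal sketch, the helper below converts a byte count to binary units (1 KiB = 1024 bytes). `formatBytes` is a hypothetical name introduced here, not something from CUDA, and the two-decimal format is just one possible choice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: renders a byte count in binary units (B, KiB, MiB, ...).
std::string formatBytes(std::size_t bytes) {
    static const char* const kUnits[] = {"B", "KiB", "MiB", "GiB", "TiB"};
    double value = static_cast<double>(bytes);
    int unit = 0;
    // Divide by 1024 until the value fits the current unit or units run out.
    while (value >= 1024.0 && unit < 4) {
        value /= 1024.0;
        ++unit;
    }
    char buffer[32];
    int written = std::snprintf(buffer, sizeof(buffer), "%.2f %s", value, kUnits[unit]);
    // The longest possible result is far shorter than the buffer.
    assert(written > 0 && written < static_cast<int>(sizeof(buffer)));
    return std::string(buffer);
}
```

Applied to the values in the output, 2147483648 bytes of global memory is 2.00 GiB, 65536 bytes of constant memory is 64.00 KiB, and 49152 bytes of shared memory per block is 48.00 KiB.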
You can look up a GPU's compute capability at: https://developer.nvidia.com/cuda-gpus
The remaining fields will be explained later, as they come up.