The NVIDIA System Management Interface (nvidia-smi) is a command-line utility built on the NVIDIA Management Library (NVML) for managing and monitoring NVIDIA GPU devices.
nvidia-smi
Sun Mar 28 02:40:38 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.56       Driver Version: 418.56       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  On   | 00000000:02:00.0 Off |                  N/A |
| 23%   29C    P8     9W / 250W |    611MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108...  On   | 00000000:03:00.0 Off |                  N/A |
| 23%   30C    P8     9W / 250W |      0MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 108...  On   | 00000000:82:00.0 Off |                  N/A |
| 23%   30C    P8     9W / 250W |      0MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 108...  On   | 00000000:83:00.0 Off |                  N/A |
| 23%   30C    P8     9W / 250W |      0MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     33777      C   /usr/bin/python                              601MiB |
+-----------------------------------------------------------------------------+
This is the runtime status of a GPU server with four GeForce GTX 1080 Ti cards.
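Note: GPU memory usage and GPU utilization are two different things. A graphics card is made up of the GPU, video memory, and other components, and the relationship between video memory and the GPU is roughly that between RAM and the CPU.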
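To make the distinction concrete, both numbers can be read per GPU. The following is a minimal Python sketch, assuming the CSV output of `nvidia-smi --query-gpu=index,memory.used,memory.total,utilization.gpu --format=csv,noheader,nounits`. The function names `parse_gpu_usage` and `query_gpu_usage` and the sample string are illustrative and not part of the original post.

```python
import subprocess

QUERY_CMD = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_gpu_usage(csv_text):
    """Parse 'index, mem_used, mem_total, util' rows into dicts.

    Assumes every field is a plain integer; GPUs that report a value
    such as '[N/A]' would need extra handling.
    """
    rows = []
    for line in csv_text.strip().splitlines():
        index, used, total, util = [field.strip() for field in line.split(",")]
        rows.append({
            "index": int(index),
            "memory_used_mib": int(used),
            "memory_total_mib": int(total),
            "gpu_util_percent": int(util),
        })
    return rows


def query_gpu_usage():
    """Run nvidia-smi (requires an NVIDIA driver) and parse its output."""
    out = subprocess.run(QUERY_CMD, capture_output=True, text=True, check=True)
    return parse_gpu_usage(out.stdout)


# Hypothetical sample mirroring the table above: GPU 0 holds 611 MiB of
# memory while its compute utilization is 0%.
usage_sample = "0, 611, 11178, 0\n1, 0, 11178, 0\n"
for gpu in parse_gpu_usage(usage_sample):
    print(gpu)
```

On a machine with a driver installed, `query_gpu_usage()` would return the same structure from live data. A process can hold memory while the GPU sits idle, which is why the two columns are reported separately.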
nvidia-smi -L
From left to right: GPU index, GPU model, and the GPU's physical UUID.
GPU 0: GeForce GTX 1080 Ti (UUID: GPU-5da6e67e-fd5a-88fb-7a0e-109c3284f7bf)
GPU 1: GeForce GTX 1080 Ti (UUID: GPU-ce9189e4-2e58-3a19-4332-cb5c7fac1aa6)
GPU 2: GeForce GTX 1080 Ti (UUID: GPU-242b3020-8e5c-813a-42d9-475766d52f9d)
GPU 3: GeForce GTX 1080 Ti (UUID: GPU-8f3d825f-7246-3daf-eaa1-37845b03aa03)
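List only the GPU indices (splitting on the colon rather than taking a single character, so indices of 10 and above are not truncated):
nvidia-smi -L | cut -d ' ' -f 2 | cut -d ':' -f 1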
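When all three fields are needed in a script, the same listing can be parsed in one pass. This is a minimal Python sketch that assumes the line layout shown in the `nvidia-smi -L` output above; the regular expression and the name `parse_gpu_list` are illustrative.

```python
import re

# Matches lines of the form: GPU <index>: <model> (UUID: GPU-<hex and dashes>)
GPU_LINE = re.compile(r"^GPU (\d+): (.+) \(UUID: (GPU-[0-9a-fA-F-]+)\)$")


def parse_gpu_list(text):
    """Return (index, name, uuid) tuples from `nvidia-smi -L` output."""
    gpus = []
    for line in text.splitlines():
        match = GPU_LINE.match(line.strip())
        if match:
            gpus.append((int(match.group(1)), match.group(2), match.group(3)))
    return gpus


# Two lines copied from the listing above.
listing = """\
GPU 0: GeForce GTX 1080 Ti (UUID: GPU-5da6e67e-fd5a-88fb-7a0e-109c3284f7bf)
GPU 3: GeForce GTX 1080 Ti (UUID: GPU-8f3d825f-7246-3daf-eaa1-37845b03aa03)
"""
indices = [index for index, _name, _uuid in parse_gpu_list(listing)]
print(indices)
```

Lines that do not match the pattern are skipped rather than raising an error, which is a design choice of this sketch and may hide unexpected output formats.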
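Fixing slow GPU initialization
Enable GPU persistence mode (the Persistence-M column), which keeps the driver loaded even when no process is using the GPU: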
sudo nvidia-smi -pm 1
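Whether the setting took effect can be checked per GPU. The sketch below is a minimal Python example, assuming rows such as `0, Enabled` as printed by `nvidia-smi --query-gpu=index,persistence_mode --format=csv,noheader`; the function name `persistence_disabled` and the sample text are hypothetical.

```python
def persistence_disabled(csv_text):
    """Return indices of GPUs whose persistence mode is not 'Enabled'.

    Expects one 'index, mode' row per GPU.
    """
    disabled = []
    for line in csv_text.strip().splitlines():
        index, mode = [field.strip() for field in line.split(",")]
        if mode != "Enabled":
            disabled.append(int(index))
    return disabled


# Hypothetical output for a two-GPU machine where GPU 1 was left off.
persistence_sample = "0, Enabled\n1, Disabled\n"
print(persistence_disabled(persistence_sample))
```

An empty list means every GPU reports persistence mode as enabled.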
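Fixing uneven performance across cards: on a four-GPU machine, if only two cards are used, prefer GPUs 0 and 3, because the outer slots dissipate heat better.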
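One way to apply this choice in a launcher script is through the `CUDA_VISIBLE_DEVICES` environment variable. The following Python sketch is illustrative only: the helper names `pick_outer_gpus` and `restrict_to` are hypothetical, and it assumes that GPU indices follow the physical slot order, which depends on the chassis and is not guaranteed by nvidia-smi.

```python
import os


def pick_outer_gpus(indices):
    """Pick the lowest and highest GPU index, i.e. the two outermost cards.

    Assumes index order matches physical slot order (a chassis-specific
    assumption).
    """
    ordered = sorted(indices)
    if len(ordered) <= 2:
        return ordered
    return [ordered[0], ordered[-1]]


def restrict_to(indices):
    """Expose only the given GPUs to CUDA code started from this process."""
    # Ask CUDA to enumerate devices in PCI bus order, matching nvidia-smi.
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in indices)


chosen = pick_outer_gpus([0, 1, 2, 3])
restrict_to(chosen)
print(os.environ["CUDA_VISIBLE_DEVICES"])
```

The environment variables must be set before any CUDA library initializes in the process; setting them afterwards has no effect on devices that are already enumerated.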