Windows RuntimeError: CUDA out of memory.

Problem description

RuntimeError: CUDA out of memory. Tried to allocate 244.00 MiB (GPU 0; 2.00 GiB total capacity; 1014.91 MiB already allocated; 0 bytes free; 1.19 GiB reserved in total by PyTorch)
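The figures in this message can be broken down with a short script. The following is a minimal sketch, not part of the original post: the helper names (`parse_oom_message`, `to_mib`) are hypothetical, and the regular expression assumes the exact wording shown above, which may differ in other PyTorch versions.

```python
import re

# The error text exactly as reported in this post.
MESSAGE = (
    "RuntimeError: CUDA out of memory. Tried to allocate 244.00 MiB "
    "(GPU 0; 2.00 GiB total capacity; 1014.91 MiB already allocated; "
    "0 bytes free; 1.19 GiB reserved in total by PyTorch)"
)

# Binary unit multipliers used in the message.
_UNITS = {"bytes": 1, "KiB": 1024, "MiB": 1024 ** 2, "GiB": 1024 ** 3}

# Assumption: the message follows the wording above.
_PATTERN = re.compile(
    r"Tried to allocate (?P<requested>[\d.]+ \w+) "
    r"\(GPU (?P<gpu>\d+); "
    r"(?P<total>[\d.]+ \w+) total capacity; "
    r"(?P<allocated>[\d.]+ \w+) already allocated; "
    r"(?P<free>[\d.]+ \w+) free; "
    r"(?P<reserved>[\d.]+ \w+) reserved"
)


def to_mib(text):
    """Convert a string such as '1.19 GiB' to a float number of MiB."""
    value, unit = text.split()
    return float(value) * _UNITS[unit] / _UNITS["MiB"]


def parse_oom_message(message):
    """Return the sizes in the message as MiB, plus the GPU index."""
    match = _PATTERN.search(message)
    if match is None:
        raise ValueError("unrecognized out-of-memory message format")
    fields = {
        key: to_mib(value)
        for key, value in match.groupdict().items()
        if key != "gpu"
    }
    fields["gpu"] = int(match.group("gpu"))
    # Memory PyTorch has reserved but not currently assigned to tensors.
    fields["cached_unused"] = fields["reserved"] - fields["allocated"]
    return fields


if __name__ == "__main__":
    info = parse_oom_message(MESSAGE)
    for key in ("requested", "total", "allocated", "reserved", "cached_unused"):
        print(f"{key:>14}: {info[key]:8.2f} MiB")
```

For this message the reserved pool is about 1218.56 MiB, of which 1014.91 MiB is allocated, leaving roughly 204 MiB cached but unused. That is less than the 244 MiB requested, and the driver reports 0 bytes free, so the request cannot be satisfied.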

On Windows, PyTorch reports CUDA out of memory even though nvidia-smi shows GPU utilization at 0%:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 471.11       Driver Version: 471.11       CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro P620        WDDM  | 00000000:01:00.0  On |                  N/A |
| 34%   42C    P8    N/A /  N/A |    315MiB /  2048MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     21640    C+G   ...2\jbr\bin\jcef_helper.exe    N/A      |
|    0   N/A  N/A     26164    C+G   ...y\ShellExperienceHost.exe    N/A      |
|    0   N/A  N/A     27460    C+G   ...me\Application\chrome.exe    N/A      |
|    0   N/A  N/A     28132    C+G   ...lPanel\SystemSettings.exe    N/A      |
|    0   N/A  N/A     28696    C+G   ...w5n1h2txyewy\SearchUI.exe    N/A      |
|    0   N/A  N/A     32552    C+G   Insufficient Permissions        N/A      |
|    0   N/A  N/A     33388    C+G   C:\Windows\explorer.exe         N/A      |
+-----------------------------------------------------------------------------+

Cause analysis

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     21640    C+G   ...2\jbr\bin\jcef_helper.exe    N/A      |
|    0   N/A  N/A     26164    C+G   ...y\ShellExperienceHost.exe    N/A      |
|    0   N/A  N/A     27460    C+G   ...me\Application\chrome.exe    N/A      |
|    0   N/A  N/A     28132    C+G   ...lPanel\SystemSettings.exe    N/A      |
|    0   N/A  N/A     28696    C+G   ...w5n1h2txyewy\SearchUI.exe    N/A      |
|    0   N/A  N/A     32552    C+G   Insufficient Permissions        N/A      |
|    0   N/A  N/A     33388    C+G   C:\Windows\explorer.exe         N/A      |
+-----------------------------------------------------------------------------+

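Although GPU utilization is 0%, many processes are still holding GPU memory. GPU-Util reports compute activity, not memory occupancy, so an idle GPU can still have little or no free memory.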
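The distinction between utilization and memory usage can be checked programmatically. The sketch below is an illustration under assumptions: it calls `nvidia-smi` with its `--query-gpu` CSV output, the helper names (`parse_gpu_line`, `query_gpus`) are hypothetical, and fields that the driver reports as non-numeric (for example `[N/A]`) are mapped to `None`.

```python
import shutil
import subprocess

# nvidia-smi query returning one CSV line per GPU: used MiB, total MiB, util %.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def _to_number(text):
    """Return a float, or None when the driver prints something like [N/A]."""
    try:
        return float(text.strip())
    except ValueError:
        return None


def parse_gpu_line(line):
    """Parse one CSV line such as '315, 2048, 0' into a dict."""
    used, total, util = (_to_number(part) for part in line.split(","))
    memory_pct = None
    if used is not None and total:
        memory_pct = 100.0 * used / total
    return {
        "used_mib": used,
        "total_mib": total,
        "util_pct": util,
        "memory_pct": memory_pct,
    }


def query_gpus():
    """Run nvidia-smi and return one dict per GPU (empty list if unavailable)."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True)
    if result.returncode != 0:
        return []
    return [parse_gpu_line(line) for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    for index, gpu in enumerate(query_gpus()):
        print(index, gpu)
```

With the values from the nvidia-smi output in this post (315 MiB of 2048 MiB, 0% utilization), `parse_gpu_line("315, 2048, 0")` reports about 15.4% of memory in use while compute utilization is 0.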

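Solution:

Kill the unnecessary processes

1. View the port usage of all processes

Press Windows key + R, type cmd in the Run dialog, then enter netstat -ano in the cmd window to list the active connections and listening ports together with the ID of the process that owns each one.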

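-a  Displays all connections and listening ports.

-n  Displays addresses and port numbers in numerical form.

-o  Displays the owning process ID associated with each connection.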

netstat -ano

To see which processes are using the GPU, press Windows key + R, type cmd in the Run dialog, and enter nvidia-smi in the cmd window

nvidia-smi

2. Kill the unnecessary processes (use the PIDs that nvidia-smi reports on your machine)

taskkill /f /PID 21640
taskkill /f /PID 26164
taskkill /f /PID 27460
taskkill /f /PID 28132
taskkill /f /PID 28696
taskkill /f /PID 32552
taskkill /f /PID 33388

3. On Linux, kill a process with

kill -9 PID  # PID is the process ID
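The Windows and Linux commands above can be wrapped in one small helper. This is a minimal sketch under assumptions: `build_kill_command` and `kill_process` are hypothetical names, and the platform is detected through `sys.platform`, which is `"win32"` on Windows.

```python
import subprocess
import sys


def build_kill_command(pid, platform=sys.platform):
    """Return the command that force-kills `pid` on the given platform."""
    if not isinstance(pid, int) or pid <= 0:
        raise ValueError("pid must be a positive integer")
    if platform.startswith("win"):
        # Same command as in the post: taskkill /f /PID <pid>
        return ["taskkill", "/F", "/PID", str(pid)]
    # Linux and other POSIX systems: kill -9 <pid>
    return ["kill", "-9", str(pid)]


def kill_process(pid):
    """Run the kill command and report whether it exited successfully."""
    result = subprocess.run(build_kill_command(pid), capture_output=True, text=True)
    return result.returncode == 0


if __name__ == "__main__":
    # Only prints the command; call kill_process(pid) to actually run it.
    print(" ".join(build_kill_command(21640, platform="win32")))
```

Review the PID list before killing anything. In the listing above, PID 33388 is explorer.exe, the Windows desktop shell, and ending it closes the taskbar and desktop until it is restarted.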

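Other solutions

If the CUDA out of memory error is not caused by the situation above, try the following:

1. Reduce the batch size (batch_size)
2. Switch to a GPU with more memory, or use multiple GPUs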
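One way to apply the first suggestion is to halve the batch size each time an out-of-memory error occurs. The sketch below is illustrative only: `run_with_smaller_batches` is a hypothetical helper, and `fake_step` is a stand-in that simulates the error so the example runs without a GPU. In real PyTorch code the `step` callable would be your own training or inference step.

```python
def run_with_smaller_batches(step, batch_size, min_batch_size=1):
    """Call step(batch_size), halving batch_size after each out-of-memory error.

    Returns the first batch size that succeeds.
    """
    while batch_size >= min_batch_size:
        try:
            step(batch_size)
            return batch_size
        except RuntimeError as err:
            # Re-raise anything that is not an out-of-memory error.
            if "out of memory" not in str(err):
                raise
            # Assumption: with PyTorch, torch.cuda.empty_cache() could be
            # called here to release cached blocks before retrying.
            batch_size //= 2
    raise RuntimeError("out of memory even at the minimum batch size")


def fake_step(batch_size, limit=16):
    """Stand-in for a training step: fails when the batch exceeds `limit`."""
    if batch_size > limit:
        raise RuntimeError("CUDA out of memory. (simulated)")


if __name__ == "__main__":
    print(run_with_smaller_batches(fake_step, 64))
```

Starting from 64, the simulated step fails at 64 and 32 and succeeds at 16, so the helper returns 16.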

