syncedmem.cpp:56] Check failed: error == cudaSuccess (2 vs. 0) out of memory

1. Problem description

Today, running a model with faster-rcnn failed with the following error:

prepared the input data
F0531 13:41:25.938465 12409 syncedmem.cpp:56] Check failed: error == cudaSuccess (2 vs. 0)  out of memory
*** Check failure stack trace: ***
    @     0x7f23e6c985cd  google::LogMessage::Fail()
    @     0x7f23e6c9a433  google::LogMessage::SendToLog()
    @     0x7f23e6c9815b  google::LogMessage::Flush()
    @     0x7f23e6c9ae1e  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f23e7d89420  caffe::SyncedMemory::to_gpu()
    @     0x7f23e7d883e9  caffe::SyncedMemory::mutable_gpu_data()
    @     0x7f23e7d8a782  caffe::Blob<>::mutable_gpu_data()
    @     0x7f23e7dc5ee8  caffe::BaseConvolutionLayer<>::forward_gpu_gemm()
    @     0x7f23e7f15616  caffe::ConvolutionLayer<>::Forward_gpu()
    @     0x7f23e7ebb7b2  caffe::Net<>::ForwardFromTo()
    @     0x7f23e7ebb906  caffe::Net<>::ForwardPrefilled()
    @     0x7f23e84be176  Detector::Detect()
    @           0x442990  FasterDetect()
    @           0x440de4  PredictContainer2()
    @           0x442dd1  imagedeal()
    @           0x442e67  main
    @     0x7f23e5989830  __libc_start_main
    @           0x4404e9  _start
    @              (nil)  (unknown)
Aborted (core dumped)

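2. Troubleshooting

My first step was to check host (CPU) memory; top showed there was plenty available. I then suspected bad image data or an out-of-bounds access in the matrix code, and spent most of the day on that without finding the cause. Only after reading up on caffe did I realize that the computation appears to run on the graphics chip (GPU), which matches the caffe::SyncedMemory::to_gpu() frame in the stack trace. So the memory to check is GPU memory: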

john@ubun:~/Project/Serverbk$nvidia-smi
Fri May 31 13:41:32 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.130                Driver Version: 384.130                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:65:00.0  On |                  N/A |
|  0%   58C    P2    62W / 250W |   8413MiB / 11169MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1047      G   /usr/lib/xorg/Xorg                           221MiB |
|    0      1822      G   compiz                                        61MiB |
|    0      2798      C   ./Server                                     189MiB |
|    0      3424      C   ./Imagedeal                                 6981MiB |
|    0      3513      C   ./Server                                     189MiB |
|    0      3563      C   ./Imagedeal                                  189MiB |
|    0      4186      C   ./Server                                     189MiB |
|    0      9778      C   ./Server                                     189MiB |
|    0     10094      G   /usr/lib/firefox/firefox                       2MiB |
|    0     11819      C   ./Imagedeal                                  189MiB |
+-----------------------------------------------------------------------------+
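With this many rows, it can help to add up the usage per program instead of reading the table by eye. The following is only a rough sketch: it assumes the installed nvidia-smi supports the --query-compute-apps query, and gpu_mem_by_name is a helper name made up for this example. Note that this query lists compute (type C) processes only, so Xorg and other graphics clients will not appear.

```shell
#!/usr/bin/env bash
# Sum GPU memory per process name from nvidia-smi's CSV output.
# Expected input lines: "pid, process_name, used_memory_in_MiB",
# for example "3424, ./Imagedeal, 6981".
# gpu_mem_by_name is a hypothetical helper, not part of nvidia-smi.
gpu_mem_by_name() {
  awk -F', ' '
    { total[$2] += $3 }
    END { for (name in total) printf "%d\t%s\n", total[name], name }
  ' | sort -nr
}

# Only query the driver when nvidia-smi is actually installed.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-compute-apps=pid,process_name,used_memory \
             --format=csv,noheader,nounits | gpu_mem_by_name
fi
```

Fed the seven type C rows from the table above, the helper would report 7359 MiB for ./Imagedeal and 756 MiB for ./Server, largest first. The exact process name string printed by the query may differ from the one in the table.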

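The output shows 8413MiB of 11169MiB already in use, and a single leftover ./Imagedeal (PID 3424) holds 6981MiB of it. So I killed the leftover processes: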

john@ubun:~/Project/Serverbk$kill %

[8]+  Stopped                 ./Imagedeal
john@ubun:~/Project/Serverbk$
[8]+  Terminated              ./Imagedeal
john@ubun:~/Project/Serverbk$
john@ubun:~/Project/Serverbk$
john@ubun:~/Project/Serverbk$
john@ubun:~/Project/Serverbk$nvidia-smi
Fri May 31 13:49:36 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.130                Driver Version: 384.130                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:65:00.0  On |                  N/A |
|  0%   51C    P0    63W / 250W |    282MiB / 11169MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1047      G   /usr/lib/xorg/Xorg                           214MiB |
|    0      1822      G   compiz                                        62MiB |
|    0     10094      G   /usr/lib/firefox/firefox                       2MiB |
+-----------------------------------------------------------------------------+

Problem solved!

As a last step, write a sh script that kills any leftover background instances before the program starts!
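A minimal sketch of such a script might look like the one below. It is not the script I actually used; stop_pids and the APP variable are names invented for this example, and it assumes pgrep from procps is available. It targets only Imagedeal, so the ./Server processes seen above would need the same treatment if they should go too. Be careful: it kills every process named Imagedeal that you are allowed to signal, including ones that are doing useful work.

```shell
#!/usr/bin/env bash
# Kill leftover instances of a program, then start a fresh one.
# APP and stop_pids are hypothetical names used for this sketch.
APP=Imagedeal

# stop_pids TIMEOUT PID...
# Send SIGTERM, wait up to TIMEOUT seconds, then SIGKILL whatever is left.
stop_pids() {
  local timeout=$1 waited=0 alive pid
  shift
  if [ "$#" -eq 0 ]; then
    return 0                      # nothing to stop
  fi
  kill -TERM "$@" 2>/dev/null || true
  while [ "$waited" -lt "$timeout" ]; do
    alive=0
    for pid in "$@"; do
      if kill -0 "$pid" 2>/dev/null; then
        alive=1                   # at least one process still exists
      fi
    done
    if [ "$alive" -eq 0 ]; then
      return 0
    fi
    sleep 1
    waited=$((waited + 1))
  done
  kill -KILL "$@" 2>/dev/null || true
  return 0
}

# Only act when the binary is present in the current directory.
if [ -x "./$APP" ]; then
  # pgrep -x matches the exact process name; unquoted on purpose so that
  # several PIDs are passed as separate arguments.
  stop_pids 5 $(pgrep -x "$APP")
  exec "./$APP" "$@"
fi
```

Sending SIGTERM first gives the old process a chance to exit cleanly; SIGKILL is only the fallback. Running nvidia-smi again afterwards confirms whether the GPU memory was really released.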
