yolov3--17--yolo-mobilenetv2: summary of debugging errors

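Yolov-1-Workflow for training YOLOv3 on your own dataset on the TX2 (VOC2007-TX2-GPU)

Yolov--2--A comprehensive introduction to TensorRT, the deep learning performance optimization and acceleration engine

Yolov--3--Performance optimization and acceleration of yolov3 in TensorRT (Caffe-based)

yolov-5-Object detection: the YOLOv2 algorithm explained in detail

yolov--8--Implementing YOLO v3 in Tensorflow

yolov--9--Pruning optimization of YOLO v3

yolov--10--Evaluation metrics for object detection models: detailed explanation and concepts

yolov--11--Training log of the original YOLO v3 and computing mAP, AP, recall, precision, time, and other evaluation metrics

yolov--12--An in-depth analysis of the principles and key points of YOLOv3

yolov--14--The lightweight MobilenetV2 network structure--concepts explained

yolov--15--The most detailed analysis of Yolov3 bounding-box prediction--improvements

yolov3--16--A detailed explanation of padding in convolution operations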


 

CUDA_VISIBLE_DEVICES=4 python train.py --gpu=4 &

Debugging error 1

(pytorch1.1.0-py2.7_cuda9.0) Liqing@user-ubuntu:~/hangyu/stronger-yolo-c/v3$ WARNING:tensorflow:From /home/Liqing/anaconda3/envs/pytorch1.1.0-py2.7_cuda9.0/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /home/Liqing/anaconda3/envs/pytorch1.1.0-py2.7_cuda9.0/lib/python2.7/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
2019-11-01 23:38:28.294675: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-11-01 23:38:28.324619: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2299985000 Hz
2019-11-01 23:38:28.330328: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x7f09f4ce3070 executing computations on platform Host. Devices:
2019-11-01 23:38:28.330377: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): , 
WARNING:tensorflow:From /home/Liqing/anaconda3/envs/pytorch1.1.0-py2.7_cuda9.0/lib/python2.7/site-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
INFO:tensorflow:Restoring parameters from weights/mobilenet_v2_1.0_224.ckpt
Reading annotation for 1/181
Reading annotation for 101/181
Saving cached annotations to /home/Liqing/hangyu/stronger-yolo-c/v3/eval/cache/annots.pkl
/home/Liqing/hangyu/stronger-yolo-c/v3/eval/voc_eval.py:194: RuntimeWarning: invalid value encountered in divide
  rec = tp / float(npos)
Reading annotation for 1/181
Reading annotation for 101/181
Saving cached annotations to /home/Liqing/hangyu/stronger-yolo-c/v3/eval/cache/annots.pkl
Reading annotation for 1/181



[0. 0. 0. ... 0. 0. 0.]
[nan nan nan ... nan nan nan]
nan

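    # compute precision recall
    fp = np.cumsum(fp)
    tp = np.cumsum(tp)
    print(tp)   # added for debugging: cumulative true positives
    rec = tp / float(npos)  # evaluates to nan (0/0) when npos == 0 and tp is all zeros
    print(rec)  # added for debugging: recall
    # avoid divide by zero in case the first detection matches a difficult
    # ground truth
    prec = tp / np.maximum(tp + fp, np.finfo(np.float64).eps)
    ap = voc_ap(rec, prec, use_07_metric)
    print(ap)   # added for debugging: average precision
    return rec, prec, ap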
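
The recall line above divides by `npos` with no check, so a class with zero ground-truth boxes silently yields `nan` for recall and AP. One possible guard is sketched below. The helper name `precision_recall` and the choice to raise `ValueError` are assumptions made for illustration and are not part of the original `voc_eval.py`.

```python
import numpy as np


def precision_recall(tp, fp, npos):
    """Cumulative precision and recall that fail loudly when npos is zero.

    tp, fp: per-detection 0/1 flags, ordered by descending confidence.
    npos:   number of non-difficult ground-truth boxes for the class.
    """
    if npos == 0:
        # Hypothetical policy: stop with a clear message instead of returning nan.
        raise ValueError(
            "no ground-truth boxes for this class; "
            "check that the label names match the class list")
    tp = np.cumsum(tp)
    fp = np.cumsum(fp)
    rec = tp / float(npos)
    # Same epsilon trick as the original code to avoid a zero denominator.
    prec = tp / np.maximum(tp + fp, np.finfo(np.float64).eps)
    return rec, prec
```

Raising an error makes a labeling problem visible at the first evaluation, rather than after a long training run that reports `nan` AP.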



WARNING:tensorflow:From /home/Liqing/anaconda3/envs/pytorch1.1.0-py2.7_cuda9.0/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /home/Liqing/anaconda3/envs/pytorch1.1.0-py2.7_cuda9.0/lib/python2.7/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
2019-11-14 21:11:32.104982: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-11-14 21:11:32.117354: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2299985000 Hz
2019-11-14 21:11:32.122381: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x7f337ba03df0 executing computations on platform Host. Devices:
2019-11-14 21:11:32.122419: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): , 
WARNING:tensorflow:From /home/Liqing/anaconda3/envs/pytorch1.1.0-py2.7_cuda9.0/lib/python2.7/site-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
INFO:tensorflow:Restoring parameters from weights/mobilenet_v2_1.0_224.ckpt
2019-11-14 21:53:59.213951: W tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at save_restore_v2_ops.cc:134 : Resource exhausted: weights/yolo.ckpt-1.data-00000-of-00001.tempstate5297156131404480268; No space left on device
Traceback (most recent call last):
  File "train.py", line 159, in 
    Yolo_train().train()
  File "train.py", line 149, in train
    self.__save.save(self.__sess, os.path.join(self.__weights_dir, 'yolo.ckpt-%d' % period))
  File "/home/Liqing/anaconda3/envs/pytorch1.1.0-py2.7_cuda9.0/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1171, in save
    {self.saver_def.filename_tensor_name: checkpoint_file})
  File "/home/Liqing/anaconda3/envs/pytorch1.1.0-py2.7_cuda9.0/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/home/Liqing/anaconda3/envs/pytorch1.1.0-py2.7_cuda9.0/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/Liqing/anaconda3/envs/pytorch1.1.0-py2.7_cuda9.0/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/home/Liqing/anaconda3/envs/pytorch1.1.0-py2.7_cuda9.0/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: weights/yolo.ckpt-1.data-00000-of-00001.tempstate5297156131404480268; No space left on device
         [[node load_save/save_1/SaveV2 (defined at train.py:80) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
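
The traceback above ends in `No space left on device`, so the checkpoint write failed because the disk was full, not because GPU memory ran out. A minimal sketch of a free-space check that could run before each save follows. The helper names and the byte threshold are hypothetical, and `os.statvfs` is Unix-only.

```python
import os


def free_bytes(path):
    """Bytes available to unprivileged users on the filesystem holding `path`."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize


def has_room_for_checkpoint(weights_dir, required_bytes):
    """True if `weights_dir` has at least `required_bytes` of free space."""
    return free_bytes(weights_dir) >= required_bytes


if __name__ == "__main__":
    # Example: require roughly 500 MB before saving (an arbitrary illustrative value).
    print(has_room_for_checkpoint(".", 500 * 1024 * 1024))
```

If accumulated checkpoints are what fills the disk, the `max_to_keep` argument of `tf.train.Saver` limits how many are retained. Whether `train.py` already sets it is not shown in the log.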

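Solution:

The class names in my dataset's labels did not match the training class names in letter case (fix: change them all to lowercase). This most likely explains the nan values in debugging error 1: when the names do not match, no ground-truth boxes are found for the class, npos is 0, and tp / float(npos) evaluates to 0/0 = nan.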
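
A minimal sketch of this fix, assuming the labels are stored as VOC-style XML files with `<object><name>` entries (the log's "Reading annotation" lines suggest this, but it is not confirmed). The function name `lowercase_class_names` and the directory layout are hypothetical.

```python
import glob
import os
import xml.etree.ElementTree as ET


def lowercase_class_names(annotation_dir):
    """Lowercase every <object><name> in the XML files under `annotation_dir`.

    Returns the number of files that were modified.
    """
    changed = 0
    for path in sorted(glob.glob(os.path.join(annotation_dir, "*.xml"))):
        tree = ET.parse(path)
        dirty = False
        # In VOC annotations, <object> elements are direct children of the root.
        for obj in tree.findall("object"):
            name = obj.find("name")
            if name is None or not name.text:
                continue
            fixed = name.text.strip().lower()
            if fixed != name.text:
                name.text = fixed
                dirty = True
        if dirty:
            tree.write(path)  # overwrites the file in place; back up the labels first
            changed += 1
    return changed
```

After rewriting the labels, any cached annotation file such as `eval/cache/annots.pkl` should be deleted so the evaluation does not reuse the old class names.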
