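Model Training
- Once the training and validation sets are prepared, the network can train on its own with whatever hardware is available. But how does training stop?
- Usually a stopping criterion is set, and it is defined in terms of overall evaluation metrics of the model, such as accuracy and precision.
- Common evaluation metrics: confidence, precision, recall, mAP
- Below, YOLOv5 is used as an example to explain these metrics and use them to analyze model performance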
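A stopping criterion of this kind can be sketched as patience-based early stopping. This is a minimal illustration rather than YOLOv5's actual code: the names `EarlyStopper` and `fitness` are hypothetical, the 0.1/0.9 weighting of mAP@0.5 and mAP@0.5:0.95 is an assumption about how a single score is formed, and the metric values are made up.

```python
class EarlyStopper:
    """Stop training when the score has not improved for `patience` epochs."""

    def __init__(self, patience=100):
        self.patience = patience
        self.best_score = float("-inf")
        self.best_epoch = 0

    def step(self, epoch, score):
        """Record this epoch's score; return True if training should stop."""
        if score > self.best_score:
            self.best_score = score
            self.best_epoch = epoch
        return (epoch - self.best_epoch) >= self.patience


def fitness(map50, map50_95):
    # Assumed weighting: the stricter mAP@0.5:0.95 dominates the score.
    return 0.1 * map50 + 0.9 * map50_95


stopper = EarlyStopper(patience=3)
# Hypothetical (mAP@0.5, mAP@0.5:0.95) values per epoch, for illustration only.
history = [(0.80, 0.55), (0.79, 0.53), (0.78, 0.52), (0.78, 0.51), (0.77, 0.50)]
stopped_at = None
for epoch, (m50, m5095) in enumerate(history):
    if stopper.step(epoch, fitness(m50, m5095)):
        stopped_at = epoch
        break
print(stopped_at, stopper.best_epoch)
```

Under this rule, a patience of 100 with the best epoch at 0 would stop after epoch 100, which is consistent with the "101 epochs completed" line in the training log of this post.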
All of the output below is printed to the console during training.
Parameter settings before training:
train_gar: weights=models/best.pt, cfg=, data=data/garbage.yaml, hyp=data\hyps\hyp.scratch-low.yaml, epochs=300, batch_size=2, imgsz=640, rect=False, resume=False, nosave=False, noval=False, noautoanchor=False, noplots=False, evolve=None, bucket=, cache=None, image_weights=False, device=, multi_scale=False, single_cls=False, optimizer=SGD, sync_bn=False, workers=8, project=runs\train, name=exp, exist_ok=False, quad=False, cos_lr=False, label_smoothing=0.0, patience=100, freeze=[0], save_period=-1, seed=0, local_rank=-1, entity=None, upload_dataset=False, bbox_interval=-1, artifact_alias=latest
github: skipping check (not a git repository), for updates see https://github.com/ultralytics/yolov5
YOLOv5 2022-9-18 Python-3.8.13 torch-1.7.0+cu101 CUDA:0 (NVIDIA GeForce GTX 1050, 2048MiB)
hyperparameters: lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, cls=0.5, cls_pw=1.0, obj=1.0, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0, copy_paste=0.0
Weights & Biases: run 'pip install wandb' to automatically track and visualize YOLOv5 runs (RECOMMENDED)
TensorBoard: Start with 'tensorboard --logdir runs\train', view at http://localhost:6006/
The model parameters are as follows:
from n params module arguments
0 -1 1 3520 MyYOLO.models.common.Conv [3, 32, 6, 2, 2]
1 -1 1 18560 MyYOLO.models.common.Conv [32, 64, 3, 2]
2 -1 1 18816 MyYOLO.models.common.C3 [64, 64, 1]
3 -1 1 73984 MyYOLO.models.common.Conv [64, 128, 3, 2]
4 -1 2 115712 MyYOLO.models.common.C3 [128, 128, 2]
5 -1 1 295424 MyYOLO.models.common.Conv [128, 256, 3, 2]
6 -1 3 625152 MyYOLO.models.common.C3 [256, 256, 3]
7 -1 1 1180672 MyYOLO.models.common.Conv [256, 512, 3, 2]
8 -1 1 1182720 MyYOLO.models.common.C3 [512, 512, 1]
9 -1 1 656896 MyYOLO.models.common.SPPF [512, 512, 5]
10 -1 1 131584 MyYOLO.models.common.Conv [512, 256, 1, 1]
11 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
12 [-1, 6] 1 0 MyYOLO.models.common.Concat [1]
13 -1 1 361984 MyYOLO.models.common.C3 [512, 256, 1, False]
14 -1 1 33024 MyYOLO.models.common.Conv [256, 128, 1, 1]
15 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
16 [-1, 4] 1 0 MyYOLO.models.common.Concat [1]
17 -1 1 90880 MyYOLO.models.common.C3 [256, 128, 1, False]
18 -1 1 147712 MyYOLO.models.common.Conv [128, 128, 3, 2]
19 [-1, 14] 1 0 MyYOLO.models.common.Concat [1]
20 -1 1 296448 MyYOLO.models.common.C3 [256, 256, 1, False]
21 -1 1 590336 MyYOLO.models.common.Conv [256, 256, 3, 2]
22 [-1, 10] 1 0 MyYOLO.models.common.Concat [1]
23 -1 1 1182720 MyYOLO.models.common.C3 [512, 512, 1, False]
24 [17, 20, 23] 1 35061 models.yolo.Detect [8, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 256, 512]]
Model summary: 270 layers, 7041205 parameters, 7041205 gradients, 16.0 GFLOPs
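The `params` column of the layer table can be cross-checked by hand. The sketch below assumes that each `Conv` row is a bias-free 2-D convolution followed by batch normalization (one scale and one shift per output channel) and that `Detect` is one 1x1 convolution with bias per feature level; the helper names are hypothetical.

```python
def conv_bn_params(c_in, c_out, k):
    """Parameters of a bias-free k x k convolution plus BatchNorm scale and shift."""
    return c_in * c_out * k * k + 2 * c_out


def detect_params(num_classes, anchors_per_level, channels):
    """Parameters of one 1x1 convolution (with bias) per feature level."""
    # Each anchor predicts 4 box values, 1 objectness score, and the class scores.
    outputs = anchors_per_level * (num_classes + 5)
    return sum(c * outputs + outputs for c in channels)


layer0 = conv_bn_params(3, 32, 6)              # row 0: Conv [3, 32, 6, 2, 2]
layer1 = conv_bn_params(32, 64, 3)             # row 1: Conv [32, 64, 3, 2]
head = detect_params(8, 3, [128, 256, 512])    # row 24: Detect with 8 classes
print(layer0, layer1, head)                    # 3520 18560 35061
```

These values match rows 0, 1, and 24 of the table, which suggests the assumptions hold for this model.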
Weight and optimizer setup:
Transferred 349/349 items from models\best.pt
# The Automatic Mixed Precision (AMP) check failed here, but this does not affect the training that follows
AMP: checks failed , disabling Automatic Mixed Precision. See https://github.com/ultralytics/yolov5/issues/7908
Scaled weight_decay = 0.0005
optimizer: SGD with parameter groups 57 weight (no decay), 60 weight, 60 bias
train: Scanning 'F:\Deeplearning\yolov5-master\data\train\labels.cache' images and labels... 5400 found, 0 missing, 0 empty, 0 corrupt: 100%|██████████| 5400/5400 [00:00<?, ?it/s]
val: Scanning 'F:\Deeplearning\yolov5-master\data\val\labels.cache' images and labels... 1136 found, 0 missing, 0 empty, 0 corrupt: 100%|██████████| 1136/1136 [00:00<?, ?it/s]
Plotting labels to runs\train\exp11\labels.jpg...
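Training then starts as shown below. epochs is set to 300; if the results stop improving before 300 epochs are reached (no improvement for `patience` epochs, 100 in this run), training stops early.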
AutoAnchor: 5.02 anchors/target, 1.000 Best Possible Recall (BPR). Current anchors are a good fit to dataset
Image sizes 640 train, 640 val
Using 2 dataloader workers
Logging results to runs\train\exp11
Starting training for 300 epochs...
Epoch gpu_mem box obj cls labels img_size
0/299 0.868G 0.02972 0.02851 0.002344 30 640: 100%|██████████| 2700/2700 [08:18<00:00, 5.41it/s]
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%|██████████| 284/284 [00:22<00:00, 12.67it/s]
all 1136 7338 0.887 0.762 0.818 0.565
Epoch gpu_mem box obj cls labels img_size
1/299 0.868G 0.03136 0.0298 0.002388 29 640: 100%|██████████| 2700/2700 [08:17<00:00, 5.42it/s]
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%|██████████| 284/284 [00:22<00:00, 12.82it/s]
all 1136 7338 0.884 0.748 0.812 0.535
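The progress-bar totals in the epoch logs (2700 training batches, 284 validation batches) follow from the dataset sizes and the batch size. The short check below is a sketch that assumes validation runs with twice the training batch size; the image counts and `batch_size=2` are taken from the log.

```python
import math

train_images, val_images = 5400, 1136   # counts from the dataset scan in the log
batch_size = 2                           # batch_size=2 in the argument list

train_batches = math.ceil(train_images / batch_size)
# Assumption: validation uses twice the training batch size.
val_batches = math.ceil(val_images / (2 * batch_size))
print(train_batches, val_batches)        # 2700 284
```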
Output printed when training ends:
Stopping training early as no improvement observed in last 100 epochs. Best results observed at epoch 0, best model saved as best.pt.
To update EarlyStopping(patience=100) pass a new patience value, i.e. `python train.py --patience 300` or use `--patience 0` to disable EarlyStopping.
101 epochs completed in 14.392 hours.
Optimizer stripped from runs\train\exp11\weights\last.pt, 14.4MB
Optimizer stripped from runs\train\exp11\weights\best.pt, 14.4MB
The training results are printed as follows:
Validating runs\train\exp11\weights\best.pt...
Fusing layers...
Model summary: 213 layers, 7031701 parameters, 0 gradients, 15.8 GFLOPs
# Columns: class, images, labels, precision (P), recall (R), mean average precision (mAP)
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%|██████████| 284/284 [00:25<00:00, 11.03it/s]
all 1136 7338 0.887 0.762 0.818 0.565
person 1136 1572 0.981 0.958 0.978 0.842
face 1136 258 0.9 0.596 0.729 0.382
garb 1136 3511 0.909 0.774 0.858 0.589
larwas 1136 87 0.803 0.561 0.569 0.441
conwas 1136 65 0.841 0.938 0.961 0.584
recyc 1136 1845 0.892 0.745 0.812 0.552
Results saved to runs\train\exp11
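The `all` row of the validation table appears to be the unweighted mean of the per-class rows, which is the "mean" in mAP. The check below is a sketch that copies the six per-class rows from the table and averages each column; it reproduces the `all` row up to rounding of the printed values.

```python
# Per-class rows copied from the validation table: (P, R, mAP@.5, mAP@.5:.95)
per_class = {
    "person": (0.981, 0.958, 0.978, 0.842),
    "face":   (0.900, 0.596, 0.729, 0.382),
    "garb":   (0.909, 0.774, 0.858, 0.589),
    "larwas": (0.803, 0.561, 0.569, 0.441),
    "conwas": (0.841, 0.938, 0.961, 0.584),
    "recyc":  (0.892, 0.745, 0.812, 0.552),
}

n = len(per_class)
# Unweighted mean of each column over the classes (label counts are ignored).
all_row = [sum(col) / n for col in zip(*per_class.values())]
print([round(v, 3) for v in all_row])
```

Because the mean is unweighted, a rare class such as larwas (87 labels) pulls the overall mAP down as much as a frequent class would.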
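The performance results of the trained model and the trained weights are saved under the default path, which is runs\train\exp11 for this run according to the log above.
Reference: Analysis and Evaluation of YOLO-V5 Training Results: https://blog.csdn.net/weixin_45751396/article/details/126726120
Evaluation of results on the VOC2007 dataset. The weights file is yolov5m, and I trained for only 10 epochs.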
Detection accuracy | Detection speed |
---|---|
Precision, Recall, F1 score | Forward-pass time |
IoU (Intersection over Union) | Frames per second, FPS (Frames Per Second) |
P-R curve | Floating-point operations (FLOPs) |
AP, mAP | |
Detection speed
Detection accuracy: introduced step by step below
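In machine learning and statistical classification, a confusion matrix is a visualization tool used mainly for supervised learning; in unsupervised learning it is usually called a matching matrix. Each column of the matrix represents the instances predicted as a class, and each row represents the instances of an actual class. The name comes from the fact that the matrix makes it easy to see whether the model confuses two classes (for example, mistaking one class for another).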
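The row and column convention described above can be illustrated with a few lines of code. This is a minimal sketch with made-up labels for a hypothetical 3-class problem; the function name is an assumption, not part of any library.

```python
def confusion_matrix(actual, predicted, num_classes):
    """Rows index the actual class, columns index the predicted class."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix


# Hypothetical labels for classes 0, 1, 2, for illustration only.
actual = [0, 0, 1, 1, 1, 2, 2, 2]
predicted = [0, 1, 1, 1, 2, 2, 2, 0]
cm = confusion_matrix(actual, predicted, num_classes=3)
for row in cm:
    print(row)
```

The diagonal holds the correct predictions; any off-diagonal entry, such as `cm[0][1]`, counts objects of class 0 that were mistaken for class 1.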
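This involves precision and recall. Other bloggers have also covered this topic in detail.
Precision (per class): of everything predicted as positive, the fraction that really is positive, TP / (TP + FP).
Precision-confidence curve.
This shows the precision of each class when the confidence threshold is set to a given value. The higher the confidence threshold, the more precise the detections of each class. That makes sense: a detection is assigned to a class only when its confidence is high. The downside is just as easy to see: objects detected with low confidence are missed.
Recall: of all actual positives, the fraction that is found (how many were recalled), TP / (TP + FN).
Recall-confidence curve.
This shows the recall of each class when the confidence threshold is set to a given value. The lower the confidence threshold, the more complete the detection of each class.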
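The opposite trends of the two curves can be reproduced with a small threshold sweep. The sketch below uses made-up detections for a single class; the function name and the convention of reporting precision 1.0 when nothing is predicted are assumptions.

```python
def precision_recall_at(threshold, detections, num_ground_truth):
    """detections: list of (confidence, is_true_positive) pairs for one class."""
    kept = [is_tp for conf, is_tp in detections if conf >= threshold]
    tp = sum(kept)
    fp = len(kept) - tp
    precision = tp / (tp + fp) if kept else 1.0  # convention when nothing is predicted
    recall = tp / num_ground_truth
    return precision, recall


# Hypothetical detections for one class with 5 ground-truth objects.
detections = [(0.95, True), (0.90, True), (0.80, False), (0.60, True),
              (0.40, False), (0.30, True), (0.20, False)]

results = {}
for t in (0.25, 0.50, 0.85):
    results[t] = precision_recall_at(t, detections, num_ground_truth=5)
    p, r = results[t]
    print(f"conf >= {t:.2f}: precision={p:.2f}, recall={r:.2f}")
```

Raising the threshold from 0.25 to 0.85 lifts precision from about 0.67 to 1.00 while recall drops from 0.80 to 0.40.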
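mAP stands for Mean Average Precision. The P-R curve shows that the higher the precision, the lower the recall.
But we want the network to detect as many objects of every class as possible while keeping precision high. So we want the curve to approach the point (1, 1); in other words, we want the area under the P-R curve, which is what AP measures (and mAP is its mean over classes), to be as close to 1 as possible.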
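The area under a P-R curve can be computed along the following lines. This is a simplified sketch with made-up P-R points: it uses a monotone precision envelope and rectangle sums, whereas real toolkits may interpolate the curve differently, so the numbers are illustrative only.

```python
def average_precision(recalls, precisions):
    """Area under a P-R curve given points sorted by increasing recall."""
    # Add sentinel points so the curve spans recall 0..1.
    r = [0.0] + list(recalls) + [1.0]
    p = [1.0] + list(precisions) + [0.0]
    # Precision envelope: each point takes the best precision at any higher recall.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum rectangle areas wherever recall increases.
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))


# Hypothetical P-R points for two classes, for illustration only.
ap_a = average_precision([0.2, 0.4, 0.6, 0.8], [1.0, 1.0, 0.75, 0.67])
ap_b = average_precision([0.5, 1.0], [1.0, 1.0])   # a perfect detector
mean_ap = (ap_a + ap_b) / 2
print(round(ap_a, 3), round(ap_b, 3), round(mean_ap, 3))
```

Class b reaches the point (1, 1) and gets an AP of 1, while class a never exceeds a recall of 0.8 and is penalized for the missing part of the curve.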
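The F1 score (F1-score) is a metric for classification problems. Machine learning competitions on multi-class problems often use it as the final evaluation measure. It is the harmonic mean of precision and recall, with a maximum of 1 and a minimum of 0.
For a given class it combines Precision and Recall into a single figure. The F1-score ranges from 0 to 1, where 1 is best and 0 is worst.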
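As a worked example, the harmonic mean can be applied to the P and R values of two rows of the validation table printed earlier. The function name is an assumption; the formula is the standard F1 = 2PR / (P + R).

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# P and R of the "person" and "face" rows from the validation table.
f1_person = f1_score(0.981, 0.958)
f1_face = f1_score(0.900, 0.596)
print(round(f1_person, 3), round(f1_face, 3))
```

The harmonic mean punishes imbalance: for face, a precision of 0.9 cannot make up for a recall of 0.596, so F1 lands below the arithmetic mean of the two.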
??? I could not find out what the output in this box is used for…
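The loss function measures how far the model's predictions are from the ground truth, and it largely determines the model's performance.
Box: YOLOv5 uses an IoU-based loss for bounding boxes (GIoU loss in early releases; later releases switched to CIoU). Box is presumably the mean of this loss; the smaller it is, the more accurate the boxes.
Objectness: presumably the mean objectness loss; the smaller it is, the more accurate the object detection.
Classification: presumably the mean classification loss; the smaller it is, the more accurate the classification.
val Box: bounding box loss on the validation set
val Objectness: mean objectness loss on the validation set
val Classification: mean classification loss on the validation set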
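To make the GIoU loss mentioned under Box concrete, the sketch below computes IoU and GIoU for two axis-aligned boxes and uses 1 - GIoU as the loss. It is an illustration with made-up boxes, not YOLOv5's implementation; the function names and the corner-coordinate box format are assumptions.

```python
def box_area(box):
    """Area of an (x1, y1, x2, y2) box; 0 if the box is empty or inverted."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def iou_and_giou(a, b):
    """Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    inter = box_area((max(a[0], b[0]), max(a[1], b[1]),
                      min(a[2], b[2]), min(a[3], b[3])))
    union = box_area(a) + box_area(b) - inter
    iou = inter / union
    # Smallest box enclosing both a and b.
    enclose = box_area((min(a[0], b[0]), min(a[1], b[1]),
                        max(a[2], b[2]), max(a[3], b[3])))
    giou = iou - (enclose - union) / enclose
    return iou, giou


pred, target = (0, 0, 2, 2), (1, 1, 3, 3)   # hypothetical boxes
iou, giou = iou_and_giou(pred, target)
giou_loss = 1.0 - giou
print(round(iou, 4), round(giou, 4), round(giou_loss, 4))
```

Unlike plain IoU, GIoU stays informative when boxes do not overlap, because the enclosing-box term still shrinks as the prediction moves toward the target.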
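Precision: correctly found positives / all found positives, TP / (TP + FP)
Recall: of all actual positives, the fraction that is found (how many were recalled), TP / (TP + FN)
mAP@0.5:0.95 (mAP@[0.5:0.95])
The mAP averaged over different IoU thresholds (from 0.5 to 0.95 in steps of 0.05: 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95).
mAP@0.5: the mAP at an IoU threshold of 0.5, where a prediction counts as correct if its IoU with the ground truth is at least 0.5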
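The relation between the two mAP variants can be shown in a few lines. The per-threshold mAP values below are made up for illustration; only the list of thresholds comes from the definition above.

```python
# IoU thresholds 0.50, 0.55, ..., 0.95, rounded to two decimals.
iou_thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]

# Hypothetical mAP at each threshold; it usually falls as the threshold tightens.
map_per_threshold = [0.82, 0.80, 0.77, 0.73, 0.68, 0.61, 0.52, 0.40, 0.25, 0.07]

map_50 = map_per_threshold[0]                                  # mAP@0.5
map_50_95 = sum(map_per_threshold) / len(map_per_threshold)    # mAP@0.5:0.95
print(iou_thresholds)
print(map_50, round(map_50_95, 3))
```

Since the stricter thresholds demand tighter boxes, mAP@0.5:0.95 is always at most mAP@0.5 and says more about localization quality.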
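PR curve and mAP in object detection
In results.txt, the last three columns are validation-set losses; the columns before them are the training-set losses followed by P, R, and mAP (which are also measured on the validation set). All columns, in order:
epoch, GPU memory, bounding box loss, objectness loss, classification loss, total, targets, image size, P, R, mAP@0.5, mAP@0.5:.95, val Box, val obj, val cls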
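A row of results.txt can be mapped onto these column names as sketched below. The short column names, the parser, and the sample row are all hypothetical (the row is made up, not taken from a real run); the sketch assumes the fields are separated by whitespace in the order listed above.

```python
COLUMNS = ["epoch", "gpu_mem", "box", "obj", "cls", "total", "targets", "img_size",
           "P", "R", "mAP@0.5", "mAP@0.5:.95", "val_box", "val_obj", "val_cls"]


def parse_results_line(line):
    """Map one whitespace-separated results.txt row onto the column names."""
    fields = line.split()
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    row = dict(zip(COLUMNS, fields))
    # From the third column on, every field is numeric;
    # epoch ("0/299") and gpu_mem ("0.868G") are kept as text.
    for name in COLUMNS[2:]:
        row[name] = float(row[name])
    return row


# Hypothetical row, made up for illustration only.
sample = ("0/299 0.868G 0.0297 0.0285 0.0023 0.0605 30 640 "
          "0.887 0.762 0.818 0.565 0.0301 0.0262 0.0019")
row = parse_results_line(sample)
print(row["epoch"], row["mAP@0.5"], row["val_box"])
```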
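Here I set the batch size to 8, so 8 images are read at a time
val_batchx_labels: ground-truth labels of the x-th validation batch
val_batchx_pred: predicted labels of the x-th validation batch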