Contents
Preparing the data:
Starting training:
Viewing the training results:
Testing the trained model:
Summary:
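With the YOLOv5 environment installed and configured, I picked a small dataset to try out train.py and see how the RTX 3070 performs in training.
Download COCO128, a small 128-image tutorial dataset, and extract it into the same parent directory as yolov5, so that the coco128 folder sits next to the yolov5 folder.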
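Before starting a run, it can help to confirm that the dataset landed where train.py will look for it. The following is only a minimal sketch: the function name `summarize_dataset` is made up for illustration, and the `images/train2017` and `labels/train2017` layout is an assumption based on the usual COCO128 archive and on the `..\coco128\labels\train2017` path that appears in the training log.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def summarize_dataset(dataset_root, split="train2017"):
    """Count images and label files under <root>/images/<split> and <root>/labels/<split>.

    The directory layout is an assumption, not something read from the YOLOv5 code.
    """
    root = Path(dataset_root)
    image_dir = root / "images" / split
    label_dir = root / "labels" / split

    images = []
    if image_dir.is_dir():
        images = [p for p in image_dir.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES]

    label_stems = set()
    if label_dir.is_dir():
        label_stems = {p.stem for p in label_dir.glob("*.txt")}

    # Images whose base name has no matching .txt label file
    without_label = sorted(p.name for p in images if p.stem not in label_stems)
    return {
        "images": len(images),
        "labels": len(label_stems),
        "images_without_label": without_label,
    }


# Run from inside the yolov5 directory; a missing folder simply reports zero files.
print(summarize_dataset("../coco128"))
```

If the image count is zero, the archive was probably extracted somewhere other than next to the yolov5 folder.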
(yolov5) C:\yolo\yolov5>python train.py --img 640 --batch 6 --epochs 16 --data data/coco128.yaml --weights weights/yolov5s.pt --nosave --cache
github: skipping check (not a git repository)
YOLOv5 torch 1.7.1 CUDA:0 (GeForce RTX 3070 Laptop GPU, 8192.0MB)
Namespace(adam=False, batch_size=6, bucket='', cache_images=True, cfg='', data='data/coco128.yaml', device='', epochs=16, evolve=False, exist_ok=False, global_rank=-1, hyp='data/hyp.scratch.yaml', image_weights=False, img_size=[640, 640], linear_lr=False, local_rank=-1, log_artifacts=False, log_imgs=16, multi_scale=False, name='exp', noautoanchor=False, nosave=True, notest=False, project='runs/train', quad=False, rect=False, resume=False, save_dir='runs\\train\\exp20', single_cls=False, sync_bn=False, total_batch_size=6, weights='weights/yolov5s.pt', workers=8, world_size=1)
Start Tensorboard with "tensorboard --logdir runs/train", view at http://localhost:6006/
hyperparameters: lr0=0.01, lrf=0.2, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, cls=0.5, cls_pw=1.0, obj=1.0, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0
from n params module arguments
0 -1 1 3520 models.common.Focus [3, 32, 3]
1 -1 1 18560 models.common.Conv [32, 64, 3, 2]
2 -1 1 18816 models.common.C3 [64, 64, 1]
3 -1 1 73984 models.common.Conv [64, 128, 3, 2]
4 -1 1 156928 models.common.C3 [128, 128, 3]
5 -1 1 295424 models.common.Conv [128, 256, 3, 2]
6 -1 1 625152 models.common.C3 [256, 256, 3]
7 -1 1 1180672 models.common.Conv [256, 512, 3, 2]
8 -1 1 656896 models.common.SPP [512, 512, [5, 9, 13]]
9 -1 1 1182720 models.common.C3 [512, 512, 1, False]
10 -1 1 131584 models.common.Conv [512, 256, 1, 1]
11 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
12 [-1, 6] 1 0 models.common.Concat [1]
13 -1 1 361984 models.common.C3 [512, 256, 1, False]
14 -1 1 33024 models.common.Conv [256, 128, 1, 1]
15 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
16 [-1, 4] 1 0 models.common.Concat [1]
17 -1 1 90880 models.common.C3 [256, 128, 1, False]
18 -1 1 147712 models.common.Conv [128, 128, 3, 2]
19 [-1, 14] 1 0 models.common.Concat [1]
20 -1 1 296448 models.common.C3 [256, 256, 1, False]
21 -1 1 590336 models.common.Conv [256, 256, 3, 2]
22 [-1, 10] 1 0 models.common.Concat [1]
23 -1 1 1182720 models.common.C3 [512, 512, 1, False]
24 [17, 20, 23] 1 229245 models.yolo.Detect [80, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 256, 512]]
Model Summary: 283 layers, 7276605 parameters, 7276605 gradients, 17.1 GFLOPS
Transferred 362/362 items from weights/yolov5s.pt
Scaled weight_decay = 0.000515625
Optimizer groups: 62 .bias, 62 conv.weight, 59 other
wandb: (1) Create a W&B account
wandb: (2) Use an existing W&B account
wandb: (3) Don't visualize my results
wandb: Enter your choice: 3
wandb: You chose 'Don't visualize my results'
wandb: Offline run mode, not syncing to the cloud.
wandb: W&B syncing is set to `offline` in this directory. Run `wandb online` to enable cloud syncing.
train: Scanning '..\coco128\labels\train2017.cache' for images and labels... 128 found, 0 missing, 2 empty, 0 corrupted: 100%|█████████████████████████████| 128/128 [00:00, ?it/s]
train: Caching images (0.1GB): 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 128/128 [00:00<00:00, 1085.08it/s]
val: Scanning '..\coco128\labels\train2017.cache' for images and labels... 128 found, 0 missing, 2 empty, 0 corrupted: 100%|███████████████████████████████| 128/128 [00:00, ?it/s]
val: Caching images (0.1GB): 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 128/128 [00:00<00:00, 884.20it/s]
Plotting labels... (0.1GB): 87%|█████████████████████████████████████████████████████████████████████████████████████████████████▏ | 111/128 [00:00<00:00, 952.78it/s]
autoanchor: Analyzing anchors... anchors/target = 4.26, Best Possible Recall (BPR) = 0.9946
Image sizes 640 train, 640 test
Using 6 dataloader workers
Logging results to runs\train\exp20
Starting training for 16 epochs...
Epoch gpu_mem box obj cls total targets img_size P R mAP@.5 mAP@.5:.95 val/box_loss val/obj_loss val/cls_loss
0/15 2.47G 0.0427 0.07469 0.02225 0.1396 28 640 0.6758 0.5909 0.6581 0.426 0.03921 0.04349 0.01277
1/15 2.53G 0.04571 0.07464 0.02542 0.1458 28 640 0.6869 0.6051 0.6645 0.4353 0.03906 0.04335 0.01244
2/15 2.53G 0.0452 0.06835 0.02362 0.1372 45 640 0.6903 0.6155 0.6701 0.4412 0.03872 0.04302 0.01207
3/15 2.53G 0.04127 0.06385 0.02236 0.1275 22 640 0.6888 0.6434 0.686 0.4485 0.03845 0.04248 0.0115
4/15 2.53G 0.04009 0.06566 0.0244 0.1301 6 640 0.7213 0.646 0.6943 0.4574 0.03818 0.04227 0.01077
5/15 2.53G 0.0435 0.06908 0.01939 0.132 49 640 0.7007 0.6615 0.7026 0.4653 0.03774 0.04175 0.01008
6/15 2.53G 0.04168 0.06535 0.01861 0.1256 47 640 0.7149 0.6687 0.7087 0.4772 0.03732 0.04117 0.009621
7/15 2.53G 0.04235 0.06387 0.01916 0.1254 29 640 0.7247 0.6836 0.723 0.4818 0.03711 0.04064 0.009062
8/15 2.53G 0.04326 0.07039 0.01931 0.133 53 640 0.7139 0.7008 0.732 0.4904 0.03689 0.03998 0.008568
9/15 2.53G 0.04003 0.06277 0.01667 0.1195 13 640 0.7511 0.6914 0.7406 0.4981 0.03657 0.03946 0.008187
10/15 2.53G 0.03966 0.06666 0.01602 0.1223 31 640 0.7257 0.71 0.7572 0.5103 0.03618 0.03924 0.007936
11/15 2.53G 0.04282 0.06147 0.01623 0.1205 21 640 0.8134 0.6608 0.7613 0.5162 0.03588 0.03898 0.007673
12/15 2.53G 0.04107 0.0633 0.01663 0.121 37 640 0.8086 0.6713 0.7706 0.522 0.03571 0.03846 0.007505
13/15 2.53G 0.04147 0.06159 0.01675 0.1198 50 640 0.8059 0.672 0.7707 0.5219 0.03569 0.03823 0.007398
14/15 2.53G 0.04068 0.06058 0.01719 0.1185 45 640 0.8475 0.6541 0.773 0.5251 0.03552 0.03794 0.00727
15/15 2.53G 0.04068 0.05729 0.01563 0.1136 26 640 0.8296 0.6865 0.7775 0.53 0.03533 0.03783 0.007028
Optimizer stripped from runs\train\exp20\weights\last.pt, 14.8MB
Optimizer stripped from runs\train\exp20\weights\best.pt, 14.8MB
Images sizes do not match. This will causes images to be display incorrectly in the UI.
16 epochs completed in 0.025 hours.
wandb: Waiting for W&B process to finish, PID 13004
wandb: Program ended successfully.
wandb: Run summary:
wandb: train/box_loss 0.04068
wandb: train/obj_loss 0.05729
wandb: train/cls_loss 0.01563
wandb: metrics/precision 0.82962
wandb: metrics/recall 0.6865
wandb: metrics/mAP_0.5 0.77755
wandb: metrics/mAP_0.5:0.95 0.53001
wandb: val/box_loss 0.03533
wandb: val/obj_loss 0.03783
wandb: val/cls_loss 0.00703
wandb: x/lr0 0.00073
wandb: x/lr1 0.00073
wandb: x/lr2 0.06563
wandb: _runtime 117
wandb: _timestamp 1613960075
wandb: _step 16
wandb: Run history:
wandb: train/box_loss ▅█▇▃▂▅▃▄▅▁▁▅▃▃▂▂
wandb: train/obj_loss ██▅▄▄▆▄▄▆▃▅▃▃▃▂▁
wandb: train/cls_loss ▆█▇▆▇▄▃▄▄▂▁▁▂▂▂▁
wandb: metrics/precision ▁▁▂▂▃▂▃▃▃▄▃▇▆▆█▇
wandb: metrics/recall ▁▂▂▄▄▅▆▆▇▇█▅▆▆▅▇
wandb: metrics/mAP_0.5 ▁▁▂▃▃▄▄▅▅▆▇▇████
wandb: metrics/mAP_0.5:0.95 ▁▂▂▃▃▄▄▅▅▆▇▇▇▇██
wandb: val/box_loss ██▇▇▆▅▅▄▄▃▃▂▂▂▁▁
wandb: val/obj_loss ██▇▇▆▆▅▄▄▃▃▂▂▂▁▁
wandb: val/cls_loss ██▇▆▆▅▄▃▃▂▂▂▂▁▁▁
wandb: x/lr0 ▁▃▄▅▆▇████▇▇▆▅▅▅
wandb: x/lr1 ▁▃▄▅▆▇████▇▇▆▅▅▅
wandb: x/lr2 ██▇▇▆▆▅▅▄▄▃▃▂▂▁▁
wandb: _runtime ▁▁▂▂▃▃▄▄▄▅▅▆▆▇▇██
wandb: _timestamp ▁▁▂▂▃▃▄▄▄▅▅▆▆▇▇██
wandb: _step ▁▁▂▂▃▃▄▄▅▅▅▆▆▇▇██
wandb:
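A few numbers in the training log above can be reproduced by hand, which is a handy sanity check when reading YOLOv5 output. The sketch below rests on my understanding of the code, not on anything stated in the log: that a `Conv` module is a bias-free convolution followed by BatchNorm, that the `Detect` head uses one 1x1 convolution with bias per scale, and that train.py scales weight decay against a nominal batch size of 64 with gradient accumulation. All variable and function names are illustrative.

```python
def conv_bn_params(c_in, c_out, k):
    """Parameters of a k x k convolution without bias plus BatchNorm weight and bias."""
    return c_in * c_out * k * k + 2 * c_out


# (c_in, c_out, kernel) -> parameter count printed in the model table (rows 1, 3, 5, 7, 10)
logged_conv_rows = {
    (32, 64, 3): 18560,
    (64, 128, 3): 73984,
    (128, 256, 3): 295424,
    (256, 512, 3): 1180672,
    (512, 256, 1): 131584,
}
for (c_in, c_out, k), logged in logged_conv_rows.items():
    print(f"Conv [{c_in}, {c_out}, {k}]: computed {conv_bn_params(c_in, c_out, k)}, logged {logged}")

# Detect head: 3 anchors per scale, each predicting 80 classes + 4 box values + 1 objectness
num_classes, anchors_per_scale = 80, 3
outputs_per_scale = anchors_per_scale * (num_classes + 5)
detect_params = sum(c * outputs_per_scale + outputs_per_scale for c in (128, 256, 512))
print(f"Detect: computed {detect_params}, logged 229245")

# Weight decay scaling (assumed formula): with batch 6, gradients are accumulated
# over round(64 / 6) batches, so the effective batch size is 66 rather than 64.
nominal_batch_size = 64
batch_size = 6
base_weight_decay = 0.0005
accumulate = max(round(nominal_batch_size / batch_size), 1)
scaled_weight_decay = base_weight_decay * (batch_size * accumulate / nominal_batch_size)
print(f"accumulate = {accumulate}, scaled weight_decay = {scaled_weight_decay:.9f}")
```

Under these assumptions the computed values agree with the table rows and with the `Scaled weight_decay = 0.000515625` line.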
Copy the best trained checkpoint, best.pt, into the weights directory and run a test:
(yolov5) C:\yolo\yolov5>python detect.py --source data/images --weights weights/best.pt --conf 0.25
Namespace(agnostic_nms=False, augment=False, classes=None, conf_thres=0.25, device='', exist_ok=False, img_size=640, iou_thres=0.45, name='exp', project='runs/detect', save_conf=False, save_txt=False, source='data/images', update=False, view_img=False, weights=['weights/best.pt'])
YOLOv5 torch 1.7.1 CUDA:0 (GeForce RTX 3070 Laptop GPU, 8192.0MB)
Fusing layers...
Model Summary: 224 layers, 7266973 parameters, 0 gradients, 17.0 GFLOPS
image 1/3 C:\yolo\yolov5\data\images\bus.jpg: 640x480 3 persons, 1 bus, Done. (0.022s)
image 2/3 C:\yolo\yolov5\data\images\gj.jpg: 480x640 Done. (0.016s)
image 3/3 C:\yolo\yolov5\data\images\zidane.jpg: 384x640 2 persons, 1 tie, Done. (0.016s)
Results saved to runs\detect\exp69
Done. (0.239s)
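The per-image times in the detection log can be turned into a rough throughput figure. This is a back-of-the-envelope sketch only: as far as I understand, the time printed after each image covers inference and non-maximum suppression but not image loading or saving, and three images are far too few for a real benchmark.

```python
# Per-image times in seconds, copied from the detect.py log
per_image_seconds = {
    "bus.jpg": 0.022,
    "gj.jpg": 0.016,
    "zidane.jpg": 0.016,
}

mean_seconds = sum(per_image_seconds.values()) / len(per_image_seconds)
approx_fps = 1.0 / mean_seconds

print(f"mean time per image: {mean_seconds * 1000:.1f} ms")
print(f"approximate throughput: {approx_fps:.1f} images per second")
```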
[Image: detection result for bus.jpg]
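At first I used batch 16, but the GPU ran out of memory; switching to batch 6 let training proceed. Training with --batch 6 --epochs 16 took 0.025 hours, i.e. 1.5 minutes. At that rate, 600 epochs would take roughly 1 hour.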
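The estimate is a simple linear extrapolation from the measured run. The helper below (the name `estimate_minutes` is made up for illustration) spells out the arithmetic; it assumes every epoch takes the same time and ignores startup overhead such as dataset caching.

```python
def estimate_minutes(measured_hours, measured_epochs, target_epochs):
    """Linearly extrapolate training time, assuming a constant time per epoch."""
    minutes_per_epoch = measured_hours * 60 / measured_epochs
    return minutes_per_epoch * target_epochs


# Measured run: 16 epochs in 0.025 hours (from the training log)
measured_minutes = estimate_minutes(0.025, 16, 16)
projected_minutes = estimate_minutes(0.025, 16, 600)

print(f"16 epochs: {measured_minutes:.2f} minutes")
print(f"600 epochs: {projected_minutes:.2f} minutes")
```

The projection comes out a little under an hour, which matches the rough figure quoted above.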
Lao Xu, 2021/2/22