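Notes on the yolov5-master code: the detect module

These notes take detect.py as the main thread and trace the code step by step from the top, to show how a detection run proceeds.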

Contents

detect.py:
common.py(models):
    DetectMultiBackend (line 279)
datasets.py(utils):
    LoadImages (line 178)
augmentations.py(utils):
    letterbox (line 91)
plots.py(utils):
    Annotator (line 68)


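detect.py:

FILE          Absolute path of the current file.
ROOT          Path of the whole yolov5 project. If importing a package that exists in the project fails after the files were downloaded, moved, or updated, check whether this path is correct.
parse_opt     Defines the arguments and returns opt, which stores all argument values.
main          ① Checks the dependencies listed in requirements.
              ② Calls run.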
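The pieces above can be sketched with the standard library alone. This is a minimal sketch, not the file itself: only four of the arguments are shown, the default values are written from memory and should be treated as assumptions, and `run` here is a placeholder that returns a string instead of running a model.

```python
import argparse
import sys
from pathlib import Path

# In a script, __file__ is the file being run; the fallback keeps the sketch
# usable in an interactive session where __file__ is not defined.
FILE = Path(globals().get("__file__", "detect.py")).resolve()
ROOT = FILE.parent  # project root: the folder that contains detect.py
if str(ROOT) not in sys.path:
    sys.path.append(str(ROOT))  # so that project packages can be imported


def parse_opt(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--weights", nargs="+", type=str, default=str(ROOT / "yolov5s.pt"))
    parser.add_argument("--source", type=str, default=str(ROOT / "data/images"))
    parser.add_argument("--imgsz", nargs="+", type=int, default=[640])
    parser.add_argument("--conf-thres", type=float, default=0.25)
    opt = parser.parse_args(argv)
    opt.imgsz *= 2 if len(opt.imgsz) == 1 else 1  # [640] -> [640, 640]
    return opt


def run(weights, source, imgsz, conf_thres):
    # Placeholder for the real pipeline (steps 1 to 6 in these notes).
    return f"would run {weights} on {source} at {imgsz}, conf >= {conf_thres}"


def main(opt):
    # The real main() checks the requirements first; that step is omitted here.
    return run(**vars(opt))
```

Calling `main(parse_opt(["--source", "img.jpg"]))` shows how opt is unpacked into keyword arguments for run.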

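run           1. Determine what kind of data was passed in as source.
                  ① is_file: whether the extension of the input file is one of the supported formats (defined in datasets.py).
                  ② webcam: False when the source is an image file or folder.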
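A rough sketch of this source check follows. The extension tuples are an assumed subset of the lists in datasets.py, and `classify_source` is a hypothetical helper name; in the script the same tests are written inline.

```python
from pathlib import Path

# Assumed subset of the supported extensions (the full lists are in datasets.py).
IMG_FORMATS = ("bmp", "jpg", "jpeg", "png", "tif", "tiff", "webp")
VID_FORMATS = ("avi", "mov", "mp4", "mkv", "mpg", "wmv")


def classify_source(source):
    source = str(source)
    # A file source is recognized by its extension.
    is_file = Path(source).suffix[1:].lower() in IMG_FORMATS + VID_FORMATS
    is_url = source.lower().startswith(("rtsp://", "rtmp://", "http://", "https://"))
    # A bare number such as "0" means a local camera; a URL that does not
    # point at a plain file is treated as a stream.
    webcam = source.isnumeric() or source.endswith(".txt") or (is_url and not is_file)
    return is_file, is_url, webcam
```

For an ordinary picture such as `data/images/bus.jpg` this gives is_file True and webcam False, which is the case these notes follow.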

              2. Directories: create a new folder to save the results in.
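How a fresh results folder can be chosen on each run is sketched below. This is a simplified stand-in written for these notes: it only appends a running number (exp, exp2, exp3, ...) and ignores the extra options the project's own helper has.

```python
from pathlib import Path


def increment_path(path, exist_ok=False):
    """Return path unchanged if it is free, else path2, path3, ... (simplified)."""
    path = Path(path)
    if path.exists() and not exist_ok:
        n = 2
        while Path(f"{path}{n}").exists():
            n += 1
        path = Path(f"{path}{n}")
    return path
```

Typical use would be `save_dir = increment_path("runs/detect/exp")` followed by `save_dir.mkdir(parents=True, exist_ok=True)`.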

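              3. Load model: load the model.
                  ① device: select the compute device (GPU or CPU).
                  ② model: DetectMultiBackend (common.py) takes the weights and coco.yaml and determines which backend framework the weights belong to (PyTorch, TorchScript, etc.).
                  ③ Read the model data (such as stride and names).
                  ④ imgsz: make sure the image size is a multiple of 32; if it is not, a size that is a multiple of 32 is computed automatically.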
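The size check in ④ is plain arithmetic. The sketch below is simplified and assumes that a size is rounded up to the next multiple of the stride; the real helper also prints a warning when it changes the value.

```python
import math


def make_divisible(x, divisor=32):
    # Smallest multiple of divisor that is >= x.
    return math.ceil(x / divisor) * divisor


def check_img_size(imgsz, stride=32):
    """Accepts an int or a list of ints and returns sizes that are multiples of stride."""
    if isinstance(imgsz, int):
        return max(make_divisible(imgsz, stride), stride)
    return [max(make_divisible(x, stride), stride) for x in imgsz]
```

So 640 stays 640, while 650 becomes 672.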

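              4. Dataloader: load the images to run prediction on.
                  dataset = LoadImages (datasets.py), initialization
              5. Run inference: feed the images to the model, get the predictions, and draw the detection boxes.
                  Initialization:
                      warmup: pass one blank image through the model to warm up the GPU.
                      Loop over dataset (LoadImages):
                          im: convert the image from a numpy array to the tensor format PyTorch uses.
                          /= 255: normalize pixel values to the range 0 to 1.
                          Add a batch dimension.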
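The three preprocessing steps in the loop can be tried out without a GPU. The sketch below uses NumPy in place of PyTorch so that it runs anywhere; in the script the array is first turned into a tensor on the chosen device. `preprocess` is a hypothetical name used only here.

```python
import numpy as np


def preprocess(im, half=False):
    """im: uint8 array in CHW layout, as produced by the dataloader."""
    im = im.astype(np.float16 if half else np.float32)  # uint8 -> fp16/fp32
    im /= 255  # 0-255 -> 0.0-1.0
    if im.ndim == 3:
        im = im[None]  # add the batch dimension: (3, H, W) -> (1, 3, H, W)
    return im
```

A (3, 480, 640) image comes out as (1, 3, 480, 640), which is the layout the model expects.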

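                  Inference: prediction
                      visualize (default False): if True, save the feature maps produced during inference.
                      pred: the predicted boxes.
                          augment: applies augmentation at inference time, at the cost of slower model execution.
                          [1, 18900, 85]: 85 means 4 box coordinates, 1 confidence score, and 80 class probabilities.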
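Where 18900 and 85 come from can be checked by hand. The sketch assumes the usual three detection scales with strides 8, 16, and 32 and 3 anchors per grid cell. The notes do not say which input size was used; 480 x 640 is one size that reproduces 18900.

```python
def num_predictions(height, width, strides=(8, 16, 32), anchors_per_cell=3):
    # One grid per stride; every grid cell predicts anchors_per_cell boxes.
    return sum(anchors_per_cell * (height // s) * (width // s) for s in strides)


NUM_CLASSES = 80                  # COCO
row_length = 4 + 1 + NUM_CLASSES  # xywh + confidence + class probabilities

print(num_predictions(480, 640), row_length)
```

For 480 x 640 the grids are 60 x 80, 30 x 40, and 15 x 20 cells, so 3 x (4800 + 1200 + 300) = 18900 rows of 85 values each.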

                  NMS: non-maximum suppression
                      pred: [1, 5, 6], where 6 means 4 coordinates, 1 confidence score, and 1 class.
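The idea behind NMS fits in a few lines of plain Python. This is a greedy, per-class sketch for illustration only; the project's own function works on tensors and has more options. The thresholds 0.25 and 0.45 are assumed defaults.

```python
def iou(a, b):
    """Intersection over union of two xyxy boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(dets, conf_thres=0.25, iou_thres=0.45):
    """dets: list of [x1, y1, x2, y2, conf, cls]; returns the boxes that survive."""
    # Drop low-confidence boxes, then visit the rest from most to least confident.
    candidates = sorted((d for d in dets if d[4] >= conf_thres),
                        key=lambda d: d[4], reverse=True)
    keep = []
    for d in candidates:
        # Keep d unless it overlaps too much with a kept box of the same class.
        if all(d[5] != k[5] or iou(d, k) <= iou_thres for k in keep):
            keep.append(d)
    return keep
```

Each surviving row has the 6 values listed above: 4 coordinates, 1 confidence score, 1 class.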

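                  Process:
                      det: [5, 6], that is 5 boxes, each with 4 coordinates, 1 confidence score, and 1 class.
                      seen: counter of processed images.
                      save_path: path where the image is saved.
                      txt_path: path of the txt results; the txt file is not saved by default.
                      s: string holding the information to print.
                      gn: width and height of the original image, needed when saving txt.
                      imc: depends on whether the detection boxes are to be cropped and saved.
                      annotator (plots.py): draws on the original image.
                      if len(det): draw the boxes.
                          det[]: map the coordinates from the resized image back to the original image.
                          Loop over all boxes: n counts the boxes (per class) and is added to s for printing.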
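Mapping a box back to the original image undoes the letterbox: subtract the padding, then divide by the scale. The sketch below handles one box in plain Python; `scale_box` and `summarize` are hypothetical names for this note, and shapes are given as (height, width).

```python
from collections import Counter


def scale_box(img1_shape, box, img0_shape):
    """Map one xyxy box from the letterboxed image (img1) to the original (img0)."""
    gain = min(img1_shape[0] / img0_shape[0], img1_shape[1] / img0_shape[1])
    pad_x = (img1_shape[1] - img0_shape[1] * gain) / 2
    pad_y = (img1_shape[0] - img0_shape[0] * gain) / 2
    x1, y1, x2, y2 = box
    scaled = [(x1 - pad_x) / gain, (y1 - pad_y) / gain,
              (x2 - pad_x) / gain, (y2 - pad_y) / gain]
    # Clip to the original image so that no box sticks out of the picture.
    limits = [img0_shape[1], img0_shape[0], img0_shape[1], img0_shape[0]]
    return [min(max(v, 0.0), float(lim)) for v, lim in zip(scaled, limits)]


def summarize(dets, names):
    """Build the 'how many of each class' part of the log string s."""
    counts = Counter(int(d[5]) for d in dets)
    return ", ".join(f"{n} {names[c]}{'s' * (n > 1)}" for c, n in counts.items())
```

For a 720 x 1280 image letterboxed to 384 x 640, the gain is 0.5 and the vertical padding is 12 pixels, so the box [100, 112, 200, 212] maps back to [200, 200, 400, 400].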

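                      Write results: choose how the results are saved.
                          Add bbox to image (the default choice):
                              label: hide_labels and hide_conf (detect arguments) control whether the label and the confidence are printed.
                              annotator.box_label: draws the box.
                              save_crop: default False; whether to save the cropped detection boxes.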
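Two small pieces of this step are easy to show in isolation: how gn (the original width and height) turns a box into the normalized values written to a txt file, and how hide_labels and hide_conf shape the label text. Both functions are simplified, hypothetical helpers, not the project's own.

```python
def xyxy_to_norm_xywh(box, img_w, img_h):
    """Corner format -> normalized (x_center, y_center, width, height)."""
    x1, y1, x2, y2 = box
    return [(x1 + x2) / 2 / img_w, (y1 + y2) / 2 / img_h,
            (x2 - x1) / img_w, (y2 - y1) / img_h]


def make_label(name, conf, hide_labels=False, hide_conf=False):
    if hide_labels:
        return None  # draw the box without any text
    return name if hide_conf else f"{name} {conf:.2f}"
```

For example, `make_label("person", 0.873)` gives "person 0.87", and with hide_conf=True only "person".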

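                      Stream results (view_img): display the result.
                      save_img: save the image.
              6. Print results: print the output.
                  t: average time spent per image.
                      seen: number of images predicted; dt: accumulated time, so dt / seen is the time per image.
                      LOGGER.info: writes the log.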
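The averaging in step 6 is a single division. The sketch assumes dt holds accumulated seconds for three stages (pre-process, inference, NMS); the wording of the output line is only illustrative.

```python
def per_image_ms(dt, seen):
    """dt: accumulated seconds per stage; seen: number of images processed."""
    return tuple(x / seen * 1e3 for x in dt)


def speed_line(dt, seen):
    t = per_image_ms(dt, seen)
    return "Speed: %.1fms pre-process, %.1fms inference, %.1fms NMS per image" % t
```

With dt = [0.5, 2.0, 0.25] and 4 images, inference averages 500 ms per image.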

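common.py(models):

    DetectMultiBackend (line 279)
        w             Checks whether weights is a list; if so, takes the first element as the path to load.
        model_type    Determines the model format (pt, jit, etc.) and runs the matching loading branch.
        fp16          Half-precision inference.
        if data       Loads the file that was passed in and reads names from it.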
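The first two steps amount to "take one path, then branch on its extension". The sketch below is a simplified illustration: `guess_backend` is a hypothetical helper and its table lists only a few assumed formats, whereas the class itself recognizes more and loads each one differently.

```python
from pathlib import Path

# Assumed, partial mapping from file extension to backend name.
BACKENDS = {
    ".pt": "pytorch",
    ".torchscript": "torchscript",
    ".onnx": "onnx",
    ".engine": "tensorrt",
    ".tflite": "tflite",
}


def guess_backend(weights):
    # weights may arrive as a list; only the first entry is used.
    w = str(weights[0] if isinstance(weights, list) else weights)
    suffix = Path(w).suffix.lower()
    if suffix not in BACKENDS:
        raise ValueError(f"unsupported weights format: {w}")
    return w, BACKENDS[suffix]
```

`guess_backend(["yolov5s.pt"])` returns the path together with "pytorch", so the caller can pick the matching loading code.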

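datasets.py(utils):

    LoadImages (line 178)
        p                Resolves the relative path to an absolute path, then checks whether it contains *, is a folder, or is a file.
        images/videos    Reads each file's extension and checks whether it is one of the allowed extensions.
        nf               Total number of files.
        count            Counter over the files; serves as the index of the current image.
        img0             The original image as read in.
        s                String saying which image is being processed, used later for printing.
        img              The image resized with letterbox (augmentations.py); width and height need to be multiples of 32.
        vid_cap          None when the input is an image.
        Convert          HWC to CHW, BGR to RGB.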
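The path handling and the counter can be sketched as a small iterator that yields paths only. Reading and resizing the image are left out so that the sketch needs nothing beyond the standard library; `ImagePaths` is a hypothetical name and the extension tuple is an assumed subset.

```python
import glob
import os
from pathlib import Path

IMG_FORMATS = ("bmp", "jpg", "jpeg", "png", "tif", "tiff", "webp")  # assumed subset


class ImagePaths:
    def __init__(self, path):
        p = str(Path(path).resolve())  # relative -> absolute
        if "*" in p:
            files = sorted(glob.glob(p, recursive=True))  # glob pattern
        elif os.path.isdir(p):
            files = sorted(glob.glob(os.path.join(p, "*.*")))  # folder
        elif os.path.isfile(p):
            files = [p]  # single file
        else:
            raise FileNotFoundError(f"{p} does not exist")
        self.files = [f for f in files if f.split(".")[-1].lower() in IMG_FORMATS]
        self.nf = len(self.files)  # number of files

    def __iter__(self):
        self.count = 0
        return self

    def __next__(self):
        if self.count == self.nf:
            raise StopIteration
        path = self.files[self.count]
        self.count += 1
        s = f"image {self.count}/{self.nf} {path}: "  # prefix for the log line
        return path, s
```

Iterating with `for path, s in ImagePaths("data/images"):` yields one path and one log prefix per image.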

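augmentations.py(utils):

    letterbox (line 91)
        r                Scale ratio, set by the long side of the image (640 / long side).
        Pad the image.
        if auto          If auto (default True) is true, check whether the width and height are multiples of 32; if they are, the image is used as is, otherwise only the minimal padding up to the next multiple of 32 is added.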
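The geometry of letterbox can be worked out without touching any pixels. The sketch below only computes the ratio, the resized size, and the padding per side; the actual resize and border drawing are left out, and `letterbox_geometry` is a hypothetical name.

```python
def letterbox_geometry(shape, new_shape=(640, 640), auto=True, stride=32):
    """shape and new_shape are (height, width).

    Returns the scale ratio, the resized (width, height) before padding,
    and the padding added on each side as (dw, dh).
    """
    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])  # long side decides
    new_unpad = (int(round(shape[1] * r)), int(round(shape[0] * r)))
    dw = new_shape[1] - new_unpad[0]
    dh = new_shape[0] - new_unpad[1]
    if auto:
        # Keep only the padding needed to reach a multiple of stride.
        dw, dh = dw % stride, dh % stride
    return r, new_unpad, (dw / 2, dh / 2)
```

A 1080 x 810 image (height x width) is resized to 640 x 480; with auto on, no padding is needed because 480 is already a multiple of 32, while with auto off 80 pixels are added on the left and on the right to reach 640 x 640.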

plots.py(utils):

    Annotator (line 68)
        Initialization:
            if-else      Boxes are drawn with OpenCV by default.
            box_label    Draws the box and its label.
