[Deep Learning] Computer Vision (CV) - Object Detection - SSD (Single Shot MultiBox Detector)


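1️⃣ What is SSD?

SSD (Single Shot MultiBox Detector) is a deep learning model for object detection, proposed by Wei Liu et al. in 2016.
It is a single-stage method: one forward pass over the image directly predicts the classes and bounding boxes of multiple objects, which makes it faster than traditional two-stage methods such as Faster R-CNN.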


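2️⃣ Core features of SSD

Single-stage detection: Faster R-CNN needs two steps (region proposal + classification), whereas SSD completes detection in one step.
Multi-scale feature detection: predictions are made on feature maps at several levels of the network, so both large and small objects are covered.
Default boxes: similar to the anchor boxes of Faster R-CNN (and of later YOLO versions), a fixed set of reference boxes per location improves detection accuracy.
Lightweight computation: faster than Faster R-CNN and suitable for real-time detection.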


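3️⃣ SSD network architecture

SSD uses VGG16 (as in the original paper) or a lighter network such as MobileNet as its backbone, and then detects objects on feature maps of different scales.

The SSD architecture has three parts:
1️⃣ Backbone: usually VGG16 or MobileNet, used to extract features.
2️⃣ Extra feature layers: additional convolutional layers that produce progressively smaller feature maps, so that objects of different sizes can be detected at different layers.
3️⃣ Prediction layers: for every default box, small convolutional filters predict the class scores and the box offsets (classification and regression).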
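
To make the prediction layers more concrete, the sketch below computes how many output channels the classification and regression heads need on each feature map. The class count (20 PASCAL VOC classes plus background) and the number of default boxes per location are assumptions taken from the original SSD300 configuration, not from this article, and `head_channels` is a hypothetical helper name.

```python
# Sketch: output channels of the SSD300 prediction heads.
# Assumptions: 21 classes (20 VOC classes + background) and the
# per-location box counts of the original SSD300 configuration.
NUM_CLASSES = 21

# (layer name, default boxes per location)
BOXES_PER_LOCATION = [
    ("conv4_3", 4),
    ("conv7", 6),
    ("conv8_2", 6),
    ("conv9_2", 6),
    ("conv10_2", 4),
    ("conv11_2", 4),
]


def head_channels(boxes_per_location: int, num_classes: int = NUM_CLASSES) -> dict:
    """Channels of the two conv heads attached to one feature map."""
    return {
        "cls": boxes_per_location * num_classes,  # one score per class per box
        "loc": boxes_per_location * 4,            # 4 offsets per box
    }


for name, k in BOXES_PER_LOCATION:
    ch = head_channels(k)
    print(f"{name:9s} k={k}  cls={ch['cls']:3d}  loc={ch['loc']:2d}")
```

Each location on a feature map therefore produces k * (classes + 4) numbers, where k is the number of default boxes at that location.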

Typical SSD pipeline

Input image (300x300) ➝ VGG16 feature extraction ➝ extra convolutional layers ➝ multi-scale detection ➝ object classes and bounding boxes


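4️⃣ Core algorithms of SSD

1️⃣ Multi-scale feature maps

  • SSD makes predictions on feature maps of different scales, for example:
    • conv4_3 (38x38, high resolution: small objects)
    • conv7 (19x19: medium objects)
    • conv8_2 ~ conv11_2 (10x10 down to 1x1: large objects)
  • Shallow, high-resolution maps have small receptive fields and suit small objects, while deeper, coarser maps cover large ones. This lets SSD detect objects of different sizes in a single pass and improves accuracy.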
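
As a worked example of multi-scale detection, the sketch below assigns a default-box scale to each prediction layer and counts the default boxes an SSD300 model evaluates per image. The feature-map sizes, boxes per location, and the scale range (s_min = 0.2, s_max = 0.9) are assumptions taken from the original SSD300 configuration; real implementations may deviate (for example, by using a smaller scale on conv4_3).

```python
# Sketch: default-box scale per feature map and total box count for SSD300.
# Assumed values: feature-map sizes, boxes per location, s_min and s_max
# follow the original SSD300 configuration.
FEATURE_MAPS = [
    # (layer, feature-map side length, default boxes per location)
    ("conv4_3", 38, 4),
    ("conv7", 19, 6),
    ("conv8_2", 10, 6),
    ("conv9_2", 5, 6),
    ("conv10_2", 3, 4),
    ("conv11_2", 1, 4),
]


def layer_scales(num_layers: int, s_min: float = 0.2, s_max: float = 0.9) -> list:
    """Linearly spaced scales: s_k = s_min + (s_max - s_min) * (k - 1) / (m - 1)."""
    step = (s_max - s_min) / (num_layers - 1)
    return [round(s_min + step * k, 4) for k in range(num_layers)]


def total_default_boxes(feature_maps=FEATURE_MAPS) -> int:
    """Sum of side * side * boxes_per_location over all prediction layers."""
    return sum(side * side * k for _, side, k in feature_maps)


scales = layer_scales(len(FEATURE_MAPS))
for (name, side, k), s in zip(FEATURE_MAPS, scales):
    print(f"{name:9s} {side:2d}x{side:<2d} scale={s:.2f} boxes={side * side * k}")
print("total:", total_default_boxes())  # total: 8732
```

The highest-resolution map (38x38) gets the smallest scale and contributes most of the boxes, which is why it is the layer responsible for small objects.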

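2️⃣ Default boxes

  • SSD places default boxes of several sizes and aspect ratios at every feature-map location.
  • For example, one location can have boxes with several aspect ratios (1:1, 1:2, 2:1) and scales.
  • During training, default boxes are matched to ground-truth boxes by overlap (IoU); at inference, non-maximum suppression (NMS) removes overlapping duplicates and keeps the highest-scoring box.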
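
The two ideas in this subsection can be sketched in plain Python: how the default-box shapes at one location follow from a scale and a set of aspect ratios, and how greedy NMS keeps the best box among overlapping candidates. This is a minimal illustration, not the code of any particular library; the function names are hypothetical, the aspect ratio is taken as width / height, and the IoU threshold of 0.45 is an assumption based on the original paper.

```python
import math


def default_box_shapes(scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """(width, height) of each default box, relative to the image size.

    width = scale * sqrt(ratio), height = scale / sqrt(ratio), so every box
    has area scale**2 whatever its aspect ratio (ratio = width / height).
    """
    return [(scale * math.sqrt(r), scale / math.sqrt(r)) for r in aspect_ratios]


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.45):
    """Greedy NMS: keep the best-scoring box, drop boxes overlapping it, repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(10, 10, 110, 110), (12, 12, 112, 112), (200, 200, 260, 260)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]: the second box overlaps the first too much
```

In SSD, NMS is normally run separately for each class, so boxes of different classes do not suppress one another.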

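3️⃣ Loss function

SSD uses a multi-task loss, averaged over the N matched (positive) default boxes:

L = \frac{1}{N} \left( L_{\text{conf}} + \alpha L_{\text{loc}} \right)

  • Localization loss (L_loc): Smooth L1 loss between the predicted box offsets and the ground-truth box offsets.
  • Confidence loss (L_conf): softmax cross-entropy over the class scores.
  • The weight α balances the two terms (set to 1 in the original paper).
  • Hard negative mining: most default boxes are background, so only the negatives with the highest confidence loss are kept (at most about 3 negatives per positive), which prevents negatives from dominating training.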
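
A minimal pure-Python sketch of this loss is shown below, assuming the form used in the SSD paper, L = (L_conf + α·L_loc) / N with N the number of positive boxes. The function names, the convention that class 0 is background, and the 3:1 negative-to-positive ratio are assumptions for illustration; a real implementation would operate on batched tensors.

```python
import math


def smooth_l1(x: float) -> float:
    """Smooth L1: quadratic near zero, linear for |x| >= 1."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def cross_entropy(logits, target: int) -> float:
    """Softmax cross-entropy for one box (numerically stabilised)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


def ssd_loss(loc_pred, loc_target, cls_logits, cls_target, alpha=1.0, neg_pos_ratio=3):
    """Loss over a list of default boxes; class 0 is background (assumption).

    loc_pred / loc_target: per-box lists of 4 offsets.
    cls_logits: per-box lists of class logits; cls_target: per-box class index.
    """
    pos = [i for i, c in enumerate(cls_target) if c > 0]
    neg = [i for i, c in enumerate(cls_target) if c == 0]
    n_pos = len(pos)
    if n_pos == 0:
        return 0.0  # no matched box: the loss is defined as 0

    conf = [cross_entropy(z, t) for z, t in zip(cls_logits, cls_target)]

    # Hard negative mining: keep only the negatives with the largest loss.
    hard_neg = sorted(neg, key=lambda i: conf[i], reverse=True)[: neg_pos_ratio * n_pos]

    l_conf = sum(conf[i] for i in pos) + sum(conf[i] for i in hard_neg)
    # The localization loss is computed on positive boxes only.
    l_loc = sum(
        smooth_l1(p - t) for i in pos for p, t in zip(loc_pred[i], loc_target[i])
    )
    return (l_conf + alpha * l_loc) / n_pos
```

With one positive box and five background boxes, only the three hardest negatives contribute to the confidence term, so easy background boxes cannot swamp the gradient.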

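5️⃣ SSD code examples

Running SSD inference with PyTorch

import torch
from torchvision.models.detection import ssd300_vgg16, SSD300_VGG16_Weights

# Load SSD with pretrained COCO weights (VGG16 backbone).
# The weights= argument replaces the deprecated pretrained=True.
model = ssd300_vgg16(weights=SSD300_VGG16_Weights.DEFAULT)
model.eval()  # evaluation mode: the model returns detections instead of losses

# Build a test input: a batch with one random 3x300x300 image, values in [0, 1)
image = torch.rand(1, 3, 300, 300)
output = model(image)  # run detection; returns one dict per image

# Print the result: 'boxes' (x1, y1, x2, y2), 'scores' and 'labels'
print(output)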

Example output

[{'boxes': tensor([[  4.3774,   0.0000, 296.1398, 296.1545],
        [  4.3993,   0.0000, 296.4670, 296.7289],
        [  7.9937,   2.4237, 294.5887, 296.1728],
        [ 69.2036,   1.6595, 224.8485,  89.6344],
        [ 26.9926,   6.7602, 121.4106, 144.2272],
        [ 92.4211,   0.0000, 229.8040, 208.3294],
        [  1.3626,  23.1578,  93.5442, 289.2806],
        [  4.3993,   0.0000, 296.4670, 296.7289],
        [ 76.6926,   5.4309, 149.0640, 156.7170],
        [ 10.3550,   6.2502, 197.3316, 181.1919],
        [106.9824,   4.5797, 182.2237, 157.4160],
        [132.2069,   8.6678, 219.1386, 144.4542],
        [ 79.0658,  30.8220, 213.8745, 120.1073],
        [142.0560,  44.9816, 300.0000, 261.8794],
        [ 43.9961,  60.6780, 113.4670, 221.8410],
        [  4.8406,   3.4960, 173.2399,  85.3488],
        [168.6355,   3.2111, 246.1957, 157.3428],
        [115.9878,  18.3186, 190.0469,  94.4698],
        [  7.9937,   2.4237, 294.5887, 296.1728],
        [  1.8401,   2.7173,  80.9552,  81.3943],
        [ 84.2810,  18.5298, 157.4316,  93.6491],
        [  4.3774,   0.0000, 296.1398, 296.1545],
        [163.0305,  19.0193, 237.7007,  94.0890],
        [140.2142,   0.0000, 290.2731,  92.2141],
        [  0.7699,  84.9703,  99.9977, 201.1923],
        [ 20.7645,   7.6457,  58.4122,  72.7975],
        [ 49.2734,  18.8153, 125.7265,  94.6529],
        [ 37.5175,   8.6355,  74.5134,  72.0357],
        [ 49.3139,  70.5232, 175.3022, 209.0820],
        [206.2103,  51.5114, 283.0901, 233.4067],
        [ 54.2698,   9.4817,  89.9449,  71.6526],
        [  4.3774,   0.0000, 296.1398, 296.1545],
        [ 68.4321,  34.4477, 140.9630, 108.9126],
        [117.4278,  83.9086, 187.3821, 157.0109],
        [ 83.9005,  68.8995, 211.3797, 152.5351],
        [  4.3724,   7.1513,  41.6993,  74.3836],
        [ 11.2119,  66.5362, 142.1151, 153.6605],
        [176.0526,   4.2302, 259.2906,  77.8738],
        [ 16.5305,  32.5297,  93.1180, 111.5275],
        [172.7833,  59.0675, 240.8657, 224.4151],
        [ 70.6702,  26.6661, 105.3486,  87.4191],
        [ 86.4020,  84.3180, 155.8031, 156.8705],
        [  3.9825,  39.6654, 164.5469, 116.8739],
        [102.0017,  99.2526, 171.5490, 173.6638],
        [  9.4001,   4.1654, 292.2263, 292.6517],
        [218.2317,  99.7507, 245.6188, 175.6490],
        [ 12.5375, 109.5155, 139.7209, 252.5660],
        [148.9126,  85.0168, 219.4572, 156.0667],
        [143.6943,   0.0000, 208.8078,  92.9759],
        [218.0947, 131.8932, 245.6517, 206.9023],
        [ 86.6296,  27.1547, 121.8965,  86.9736],
        [182.1451,  26.4790, 218.2385,  87.4392],
        [ 85.6868,  51.3363, 156.9124, 124.8844],
        [201.7919,  20.3948, 230.0636,  95.3969],
        [ 58.6690,  85.6201,  86.4400, 158.0279],
        [237.8976,  79.2133, 299.4106, 267.3134],
        [ 74.1663,  86.0775, 101.9228, 158.3806],
        [ 38.3397,  42.0953,  73.8427, 104.3351],
        [118.3740,  26.3185, 154.1633,  86.7665],
        [165.9606,  26.4949, 202.2519,  87.1502],
        [102.2428,  26.6385, 138.0445,  86.6755],
        [116.5572,  50.8512, 188.6578, 125.3926],
        [133.5854,  99.5528, 203.4103, 173.3626],
        [ 41.6178,  85.4544,  70.0365, 157.4289],
        [ 89.9596,  85.9436, 117.8041, 158.6940],
        [ 34.2198,  84.6780, 108.1625, 157.1806],
        [ 58.2119,  36.3685,  86.1674, 111.0400],
        [134.3326,  26.1657, 170.1573,  87.0323],
        [153.1971,  20.0751, 182.6458,  94.6176],
        [105.8610,  85.7882, 134.1690, 158.7567],
        [130.4643,  39.4638, 294.5816, 115.5758],
        [233.8322,  99.0858, 261.7257, 177.0587],
        [  0.0000,  46.8298,  41.8673, 248.9998],
        [218.0107,  20.5122, 245.8064,  95.4748],
        [233.4248, 131.2962, 261.9585, 207.5430],
        [ 22.0550,  41.2268,  57.4551, 104.8794],
        [121.7302,  85.4249, 150.3270, 158.4747],
        [  6.2010, 154.6621,  82.9141, 299.6869],
        [202.2049,  84.8620, 229.4331, 158.9373],
        [147.7037,  51.0245, 220.5701, 124.3972],
        [ 89.8940,  52.5901, 117.6983, 126.6781],
        [ 28.1485,   2.2970,  83.0982,  48.7126],
        [ 52.0133,  99.9635, 123.1805, 173.4247],
        [ 34.3082,  50.2485, 108.2714, 125.6584],
        [212.5328,  97.9328, 284.7024, 174.2638],
        [202.5374,   1.6573, 289.3676, 161.6270],
        [ 74.0387,  52.5599, 101.7447, 126.6867],
        [ 49.3417,  93.9316, 277.1458, 227.4363],
        [117.9579, 115.3029, 186.7402, 189.8200],
        [ 42.2042, 127.9100, 114.5919, 287.4723],
        [  9.4001,   4.1654, 292.2263, 292.6517],
        [  4.3993,   0.0000, 296.4670, 296.7289],
        [  9.4001,   4.1654, 292.2263, 292.6517],
        [ 92.4211,   0.0000, 229.8040, 208.3294],
        [  4.3774,   0.0000, 296.1398, 296.1545],
        [  9.4001,   4.1654, 292.2263, 292.6517],
        [  4.3774,   0.0000, 296.1398, 296.1545],
        [ 71.9093, 171.6278, 221.8463, 295.5115],
        [  7.9937,   2.4237, 294.5887, 296.1728],
        [  4.3774,   0.0000, 296.1398, 296.1545]], grad_fn=<...>), 'scores': tensor([0.0638, 0.0606, 0.0548, 0.0468, 0.0463, 0.0453, 0.0450, 0.0424, 0.0402,
        0.0398, 0.0373, 0.0369, 0.0350, 0.0349, 0.0331, 0.0331, 0.0331, 0.0324,
        0.0323, 0.0315, 0.0314, 0.0308, 0.0295, 0.0286, 0.0282, 0.0276, 0.0271,
        0.0269, 0.0257, 0.0249, 0.0247, 0.0247, 0.0247, 0.0245, 0.0241, 0.0238,
        0.0237, 0.0235, 0.0234, 0.0232, 0.0227, 0.0226, 0.0225, 0.0223, 0.0222,
        0.0222, 0.0221, 0.0219, 0.0219, 0.0218, 0.0218, 0.0218, 0.0218, 0.0217,
        0.0216, 0.0214, 0.0213, 0.0213, 0.0213, 0.0212, 0.0211, 0.0210, 0.0209,
        0.0209, 0.0209, 0.0209, 0.0208, 0.0208, 0.0206, 0.0206, 0.0206, 0.0205,
        0.0203, 0.0202, 0.0202, 0.0202, 0.0202, 0.0202, 0.0200, 0.0199, 0.0195,
        0.0194, 0.0194, 0.0193, 0.0193, 0.0193, 0.0193, 0.0192, 0.0192, 0.0192,
        0.0192, 0.0173, 0.0163, 0.0130, 0.0119, 0.0119, 0.0112, 0.0111, 0.0111,
        0.0110], grad_fn=<...>), 'labels': tensor([61,  1, 28,  1,  1,  1,  1, 65,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
        38,  1,  1,  5,  1,  1,  1,  1,  1,  1,  1,  1,  1, 52,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1, 16,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
        84,  9, 32, 52, 67, 41, 36,  5, 35, 19])}]
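
Because the input was random noise, every score in the example output is below 0.07, so none of these boxes is a meaningful detection. The sketch below shows one way to filter such a result by score. It is a hypothetical helper that works on plain Python lists (for tensors, call `.tolist()` first), and the sample values are the first three detections copied from the example output.

```python
def filter_detections(detection: dict, score_threshold: float = 0.5) -> list:
    """Return (box, score, label) triples whose score reaches the threshold.

    `detection` has the keys 'boxes', 'scores' and 'labels', with plain
    Python lists as values.
    """
    return [
        (box, score, label)
        for box, score, label in zip(
            detection["boxes"], detection["scores"], detection["labels"]
        )
        if score >= score_threshold
    ]


# First three detections copied from the example output.
sample = {
    "boxes": [
        [4.3774, 0.0000, 296.1398, 296.1545],
        [4.3993, 0.0000, 296.4670, 296.7289],
        [7.9937, 2.4237, 294.5887, 296.1728],
    ],
    "scores": [0.0638, 0.0606, 0.0548],
    "labels": [61, 1, 28],
}

print(filter_detections(sample, 0.5))        # [] : nothing passes on random noise
print(len(filter_detections(sample, 0.06)))  # 2
```

On a real photograph, a threshold around 0.5 is a common starting point; the right value depends on how many false positives the application can tolerate.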

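Object detection on an image with OpenCV

import cv2
import torch
from torchvision.models.detection import ssd300_vgg16, SSD300_VGG16_Weights

# Use an absolute path to the test image
image_path = r"D:\Pictures\test.jpg"

# Read the image (OpenCV loads it as BGR, uint8)
image = cv2.imread(image_path)
if image is None:
    raise FileNotFoundError(f"Could not read image: {image_path}")
image = cv2.resize(image, (300, 300))

# The model expects RGB float values in [0, 1], laid out as (N, C, H, W)
rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
image_tensor = torch.from_numpy(rgb.transpose(2, 0, 1)).float().div(255.0).unsqueeze(0)

# Load the model
model = ssd300_vgg16(weights=SSD300_VGG16_Weights.DEFAULT)
model.eval()

# Run the prediction; gradients are not needed for inference
with torch.no_grad():
    output = model(image_tensor)

# Parse the detections
for box, score in zip(output[0]['boxes'], output[0]['scores']):
    if score > 0.5:  # confidence threshold
        x1, y1, x2, y2 = map(int, box.tolist())
        cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)

# Show the result
cv2.imshow("SSD Detection", image)
cv2.waitKey(0)
cv2.destroyAllWindows()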
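
The script above draws the boxes on the 300x300 resized image. To draw them on the original photo instead, the box coordinates have to be scaled back. The helper below is a small sketch of that step; `rescale_box` is a hypothetical name and the 1920x1080 size is only an example.

```python
def rescale_box(box, model_size, original_size):
    """Map (x1, y1, x2, y2) from the resized image back to the original image.

    model_size and original_size are (width, height) tuples.
    """
    sx = original_size[0] / model_size[0]
    sy = original_size[1] / model_size[1]
    x1, y1, x2, y2 = box
    return (round(x1 * sx), round(y1 * sy), round(x2 * sx), round(y2 * sy))


# Hypothetical example: a 1920x1080 photo that was resized to 300x300.
print(rescale_box((30.0, 60.0, 150.0, 240.0), (300, 300), (1920, 1080)))
# (192, 216, 960, 864)
```

Because the resize to 300x300 does not preserve the aspect ratio, the horizontal and vertical scale factors differ and must be applied separately.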



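6️⃣ SSD vs. other object detection algorithms

The figures are approximate. The mAP values are those commonly reported on the PASCAL VOC 2007 test set, and speed depends on hardware and input size.

| Model | Type | Speed (FPS) | Accuracy (mAP, %) | Strengths | Weaknesses |
|---|---|---|---|---|---|
| SSD | Single-stage | 45+ | 74.3 | Fast, multi-scale detection | Lower accuracy on small objects |
| YOLO (v1) | Single-stage | 45 | 63.4 | Very fast | Weaker at fine details |
| Faster R-CNN | Two-stage | 5-10 | 76.4 | High accuracy | Slower |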

7️⃣ Applications of SSD

Autonomous driving
Face detection
Video surveillance
Industrial inspection
Smart security


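8️⃣ Directions for improving SSD

Improve the backbone (e.g., ResNet, MobileNet) to strengthen feature extraction or reduce computation.
Incorporate Transformer-based ideas (as in DETR) to strengthen global context modeling.
Improve small-object detection (e.g., with FPN or attention mechanisms).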


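Summary

SSD is a single-stage object detection method; it is fast and suitable for real-time detection.
SSD uses multi-scale feature maps and default boxes to improve detection accuracy.
Compared with Faster R-CNN, SSD is faster but somewhat weaker on small objects.
It is widely used in autonomous driving, face detection, industrial inspection, and similar fields.

SSD combines single-pass efficiency comparable to YOLO with accuracy close to that of Faster R-CNN, which makes it a strong choice for real-time object detection.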
