A First Taste of Autonomous Driving with the KITTI Dataset: The Full Process Based on the SSD Framework

Preface

       The era of autonomous driving has truly arrived. Countries and well-funded research institutions everywhere are racing into autonomous-driving research and testing, and China's first-tier cities have successively issued policies and regulations to govern the process. As far as I know, Beijing hosts the most autonomous-driving companies in China, followed by Shenzhen and Shanghai. As everyone knows, Beijing is a stronghold of the IT industry and home to IT giants of every kind: some want to build an entire autonomous-driving ecosystem, while others want to dig deep into one specialized niche. Truly a case of the Eight Immortals crossing the sea, each showing their own magic!

     Baidu has open-sourced its Apollo autonomous-driving framework in a bid to build an autonomous-driving ecosystem, covering modules such as lidar, millimeter-wave radar, vision, and high-definition maps; Baidu has also partially open-sourced the corresponding map and lidar point-cloud data. I spent some time studying this framework: the system is enormous and very much worth learning from. However, Apollo mainly targets companies that have real strength and are willing to join the Apollo ecosystem, and for an individual, building and running such a massive autonomous-driving stack is bound to be a stretch in both manpower and money. So, as an engineer with some background in embedded systems, electronics, automatic control, artificial intelligence, and pattern recognition, besides offering a modest analysis of Baidu's Apollo system on this blog, I would much rather verify the feasibility of my own code on the open KITTI autonomous-driving dataset. To be blunt, I simply lack the manpower and funds for real-vehicle testing. So if you are interested in autonomous driving and, like me, constrained by limited resources, please join me in starting this autonomous-driving journey in the most economical way possible!

Computer Configuration

  • Brand: HP (Hewlett-Packard)
  • Product name: HP OMEN III (暗影精灵III代)
  • Item number: 6372549
  • Gross weight: 4.9 kg
  • Place of origin: mainland China
  • Model number: 3KS71PA
  • OS: Windows 10
  • Resolution: Full HD (1920×1080)
  • Thickness: over 20.0 mm
  • RAM: 8 GB
  • GPU model: GTX 1060
  • Gaming performance: PUBG-capable; enthusiast / hardcore grade
  • Battery life: 5-7 hours
  • CPU: Intel i7, standard voltage
  • Features: backlit keyboard, others
  • Series: HP OMEN
  • Bare weight: over 2.5 kg
  • GPU class: high-performance discrete gaming GPU, GTX 1060
  • VRAM: 6 GB
  • Storage: 128 GB SSD + 1 TB HDD
  • Screen size: 15.6 inches

Preparation

       The KITTI dataset was created jointly by the Karlsruhe Institute of Technology in Germany and the Toyota Technological Institute at Chicago, and it is currently the world's largest computer-vision benchmark for autonomous-driving scenarios. It is used to evaluate the in-vehicle performance of computer-vision techniques such as object detection (motor vehicles, non-motorized vehicles, pedestrians, etc.), object tracking, and road segmentation. KITTI contains real images collected in urban, rural, and highway scenes, with up to 15 cars and 30 pedestrians per image, plus various degrees of occlusion. (Note: European road conditions differ quite a bit from China's; I hope a comparable domestic dataset appears soon.)

Go to the official website, find the object section, and get ready to download the dataset:

[Figure: the object download section of the KITTI website]

         For SSD training you only need to download the first image set, Download left color images of object data set (12 GB), and the annotation file, Download training labels of object data set (5 MB). After unzipping you will find that the 7481 training images have annotations while the test images do not; those 7481 images are what this training run uses. Since SSD's training scripts are written for the VOC dataset format, we need to convert KITTI into PASCAL VOC format; for the basic structure see the blog post "PASCAL VOC数据集分析" (an analysis of the PASCAL VOC dataset). Following the SSD training requirements, I created a series of folders under /home/zy/data/ to hold the dataset and the tool files. The paths are shown in the image below; if anything is unclear, feel free to leave me a comment. The details are as follows:

[Figure 1: directory layout under /home/zy/data/]

Within the KITTI folder, the hierarchy is as follows (a small script for creating this layout is sketched right after the listing):

# Under data/, create the two-level subdirectory KITTIdevkit/KITTI; the required files go in KITTI/
Annotations/
└── 000000.xml 
ImageSets/
└── Main/
      └── trainval.txt
      └── test.txt # etc.
JPEGImages/
└── 000000.png
Labels/
└── 000000.txt # self-created folder holding the original annotations, to be converted to xml; not part of the VOC format
create_train_test_txt.py # 3 Python tools, described in detail below
modify_annotations_txt.py
txt_to_xml.py
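
A minimal sketch for creating this layout (the data root is an assumption from my setup; adjust it to yours):

# make_kitti_layout.py -- a minimal sketch, assuming the data root used in this post
import os

root = '/home/zy/data/KITTIdevkit/KITTI'
for sub in ('Annotations', 'ImageSets/Main', 'JPEGImages', 'Labels'):
    path = os.path.join(root, sub)
    if not os.path.isdir(path):
        os.makedirs(path)  # create each VOC-style subdirectory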

KITTI Annotation Format

The KITTI annotations are stored in data_object_label.zip; the format is shown in the figure below (please look carefully at my file paths):

[Figure 2: location of the KITTI label files]

Car 0.00 0 1.96 280.38 185.10 344.90 215.59 1.49 1.76 4.01 -15.71 2.16 38.26 1.57
Car 0.00 0 1.88 365.14 184.54 406.11 205.20 1.38 1.80 3.41 -15.89 2.23 51.17 1.58
DontCare -1 -1 -10 402.27 166.69 477.31 197.98 -1 -1 -1 -1000 -1000 -1000 -10
DontCare -1 -1 -10 518.53 177.31 531.51 187.17 -1 -1 -1 -1000 -1000 -1000 -10
DontCare -1 -1 -10 1207.50 233.35 1240.00 333.39 -1 -1 -1 -1000 -1000 -1000 -10
DontCare -1 -1 -10 535.06 177.65 545.26 185.82 -1 -1 -1 -1000 -1000 -1000 -10
DontCare -1 -1 -10 558.03 177.88 567.50 184.65 -1 -1 -1 -1000 -1000 -1000 -10

So what does this data format mean? We need a clear understanding of it before we can train the model well. The specifics are as follows:

Below is the official documentation:
Data Format Description
=======================
The data for training and testing can be found in the corresponding folders.
 The sub-folders are structured as follows:

  - image_02/ contains the left color camera images (png)
  - label_02/ contains the left color camera label files (plain text files)
  - calib/ contains the calibration for all four cameras (plain text file)

The label files contain the following information, which can be read and
written using the matlab tools (readLabels.m, writeLabels.m) provided within
this devkit. All values (numerical or strings) are separated via spaces,
each row corresponds to one object. The 15 columns represent:

#Values    Name      Description
----------------------------------------------------------------------------
   1    type         Describes the type of object: 'Car', 'Van', 'Truck',
                     'Pedestrian', 'Person_sitting', 'Cyclist', 'Tram',
                     'Misc' or 'DontCare'
   1    truncated    Float from 0 (non-truncated) to 1 (truncated), where
                     truncated refers to the object leaving image boundaries
   1    occluded     Integer (0,1,2,3) indicating occlusion state:
                     0 = fully visible, 1 = partly occluded
                     2 = largely occluded, 3 = unknown
   1    alpha        Observation angle of object, ranging [-pi..pi]
   4    bbox         2D bounding box of object in the image (0-based index):
                     contains left, top, right, bottom pixel coordinates
   3    dimensions   3D object dimensions: height, width, length (in meters)
   3    location     3D object location x,y,z in camera coordinates (in meters)
   1    rotation_y   Rotation ry around Y-axis in camera coordinates [-pi..pi]
   1    score        Only for results: Float, indicating confidence in
                     detection, needed for p/r curves, higher is better.

Here, 'DontCare' labels denote regions in which objects have not been labeled,
for example because they have been too far away from the laser scanner. To
prevent such objects from being counted as false positives our evaluation
script will ignore objects detected in don't care regions of the test set.
You can use the don't care labels in the training set to avoid that your object
detector is harvesting hard negatives from those areas, in case you consider
non-object regions from the training images as negative examples.

The coordinates in the camera coordinate system can be projected in the image
by using the 3x4 projection matrix in the calib folder, where for the left
color camera for which the images are provided, P2 must be used. The
difference between rotation_y and alpha is, that rotation_y is directly
given in camera coordinates, while alpha also considers the vector from the
camera center to the object center, to compute the relative orientation of
the object with respect to the camera. For example, a car which is facing
along the X-axis of the camera coordinate system corresponds to rotation_y=0,
no matter where it is located in the X/Z plane (bird's eye view), while
alpha is zero only, when this object is located along the Z-axis of the
camera. When moving the car away from the Z-axis, the observation angle
will change.

To project a point from Velodyne coordinates into the left color image,
you can use this formula: x = P2 * R0_rect * Tr_velo_to_cam * y
For the right color image: x = P3 * R0_rect * Tr_velo_to_cam * y

Note: All matrices are stored row-major, i.e., the first values correspond
to the first row. R0_rect contains a 3x3 matrix which you need to extend to
a 4x4 matrix by adding a 1 as the bottom-right element and 0's elsewhere.
Tr_xxx is a 3x4 matrix (R|t), which you need to extend to a 4x4 matrix 
in the same way!
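
As a concrete illustration of the projection formula and the matrix extension just described, here is a minimal NumPy sketch (not part of the official devkit; it assumes P2, R0_rect, and Tr_velo_to_cam have already been parsed from the calib file):

# x = P2 * R0_rect * Tr_velo_to_cam * y, with the 4x4 extension from the note above
import numpy as np

def extend_to_4x4(mat):
    # pad a 3x3 (R0_rect) or 3x4 (Tr_velo_to_cam) matrix to 4x4 with a 1
    # as the bottom-right element and 0's elsewhere, as the note describes
    out = np.zeros((4, 4))
    out[3, 3] = 1.0
    out[:mat.shape[0], :mat.shape[1]] = mat
    return out

def project_velo_point(P2, R0_rect, Tr_velo_to_cam, point_velo):
    # point_velo is an (x, y, z) point in Velodyne coordinates
    y = np.append(point_velo, 1.0)  # homogeneous coordinates
    x = P2.dot(extend_to_4x4(R0_rect)).dot(extend_to_4x4(Tr_velo_to_cam)).dot(y)
    return x[0] / x[2], x[1] / x[2]  # pixel (u, v) after dividing by depth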

Note, that while all this information is available for the training data,
only the data which is actually needed for the particular benchmark must
be provided to the evaluation server. However, all 15 values must be provided
at all times, with the unused ones set to their default values (=invalid) as
specified in writeLabels.m. Additionally a 16'th value must be provided
with a floating value of the score for a particular detection, where higher
indicates higher confidence in the detection. The range of your scores will
be automatically determined by our evaluation server, you don't have to
normalize it, but it should be roughly linear. If you use writeLabels.m for
writing your results, this function will take care of storing all required
data correctly.
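
To make the 15-column format concrete before we start converting it, here is a minimal parser sketch (the field names simply follow the table above):

# parse one KITTI label line into the 15 named columns from the table above
FIELDS = ['type', 'truncated', 'occluded', 'alpha',
          'bbox_left', 'bbox_top', 'bbox_right', 'bbox_bottom',
          'height', 'width', 'length', 'x', 'y', 'z', 'rotation_y']

def parse_label_line(line):
    values = line.strip().split(' ')
    obj = dict(zip(FIELDS, values))
    for key in FIELDS[1:]:       # every column except 'type' is numeric
        obj[key] = float(obj[key])
    return obj

print(parse_label_line('Car 0.00 0 1.96 280.38 185.10 344.90 215.59 '
                       '1.49 1.76 4.01 -15.71 2.16 38.26 1.57'))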



Converting KITTI Categories

         The PASCAL VOC dataset has 20 categories in total, which is more than a specific scenario needs. This dataset uses 3 categories: 'Car', 'Cyclist', and 'Pedestrian'. The annotations also contain other kinds of vehicles and people, and simply skipping them would be wasteful, so 'Van', 'Truck', and 'Tram' are merged into the 'Car' category, and 'Person_sitting' is merged into 'Pedestrian' ('Misc' and 'DontCare' are ignored outright). This is done with the modify_annotations_txt.py tool, whose source is as follows:

# coding=UTF-8
import glob

txt_list = glob.glob('./Labels/*.txt') # paths of all txt files in the Labels folder
def show_category(txt_list):
    category_list= []
    for item in txt_list:
        try:
            with open(item) as tdf:
                for each_line in tdf:
                    labeldata = each_line.strip().split(' ') # strip surrounding whitespace and split on spaces
                    category_list.append(labeldata[0]) # keep only the first field, i.e. the category
        except IOError as ioerr:
            print('File error:'+str(ioerr))
    print(set(category_list)) # print the set of categories

def merge(line):
    each_line=''
    for i in range(len(line)):
        if i!= (len(line)-1):
            each_line=each_line+line[i]+' '
        else:
            each_line=each_line+line[i] # no trailing space after the last field
    each_line=each_line+'\n'
    return (each_line)

print('before modify categories are:\n')
show_category(txt_list)

for item in txt_list:
    new_txt=[]
    try:
        with open(item, 'r') as r_tdf:
            for each_line in r_tdf:
                labeldata = each_line.strip().split(' ')
                if labeldata[0] in ['Truck','Van','Tram']: # merge the vehicle classes into Car
                    labeldata[0] = labeldata[0].replace(labeldata[0],'Car')
                if labeldata[0] == 'Person_sitting': # merge into the Pedestrian class
                    labeldata[0] = labeldata[0].replace(labeldata[0],'Pedestrian')
                if labeldata[0] == 'DontCare': # skip the DontCare class
                    continue
                if labeldata[0] == 'Misc': # skip the Misc class
                    continue
                new_txt.append(merge(labeldata)) # collect the lines for the rewritten txt file
        with open(item,'w+') as w_tdf: # w+ truncates the original file, then the new content is written
            for temp in new_txt:
                w_tdf.write(temp)
    except IOError as ioerr:
        print('File error:'+str(ioerr))

print('\nafter modify categories are:\n')
show_category(txt_list)

Running modify_annotations_txt.py above rewrites the corresponding .txt files in the Labels folder in place, as shown below:

[Figure 3: the rewritten label files in the Labels folder]

Run the program with python modify_annotations_txt.py. Taking 000400.txt as an example, here is a before/after comparison:

# before conversion
Car 0.00 0 -1.67 642.24 178.50 680.14 208.68 1.38 1.49 3.32 2.41 1.66 34.98 -1.60
Car 0.00 0 -1.75 685.77 178.12 767.02 235.21 1.50 1.62 3.89 3.27 1.67 21.18 -1.60
Car 0.67 0 -2.15 885.80 160.44 1241.00 374.00 1.69 1.58 3.95 3.64 1.65 5.47 -1.59
Truck 0.00 0 -1.89 755.82 101.65 918.16 230.75 3.55 2.56 7.97 7.06 1.63 23.91 -1.61
Car 0.00 1 -2.73 928.61 177.14 1016.83 209.77 1.48 1.36 3.51 17.33 1.71 34.63 -2.27
DontCare -1 -1 -10 541.27 169.15 551.15 175.57 -1 -1 -1 -1000 -1000 -1000 -10
DontCare -1 -1 -10 616.21 176.35 636.76 189.44 -1 -1 -1 -1000 -1000 -1000 -10

# after conversion
Car 0.00 0 -1.67 642.24 178.50 680.14 208.68 1.38 1.49 3.32 2.41 1.66 34.98 -1.60
Car 0.00 0 -1.75 685.77 178.12 767.02 235.21 1.50 1.62 3.89 3.27 1.67 21.18 -1.60
Car 0.67 0 -2.15 885.80 160.44 1241.00 374.00 1.69 1.58 3.95 3.64 1.65 5.47 -1.59
Car 0.00 0 -1.89 755.82 101.65 918.16 230.75 3.55 2.56 7.97 7.06 1.63 23.91 -1.61
Car 0.00 1 -2.73 928.61 177.14 1016.83 209.77 1.48 1.36 3.51 17.33 1.71 34.63 -2.27


Converting txt to xml

      After the above processing of the original txt files, the next step is to convert the annotations from txt to xml, dropping the parts of the annotation we do not use (keeping only the 3 classes) and converting the coordinate values from float to int. All generated xml files go into the Annotations folder. This uses the txt_to_xml.py tool, adapted from the KITTI_SSD code; thanks to its author for the contribution.

# txt_to_xml.py
# encoding:utf-8
# Generate an XML file from scratch as a DOM tree, following a given XML schema
from xml.dom.minidom import Document
import cv2
import os

def generate_xml(name,split_lines,img_size,class_ind):
    doc = Document()  # create the DOM document object

    annotation = doc.createElement('annotation')
    doc.appendChild(annotation)

    title = doc.createElement('folder')
    title_text = doc.createTextNode('KITTI')
    title.appendChild(title_text)
    annotation.appendChild(title)

    img_name=name+'.png'

    title = doc.createElement('filename')
    title_text = doc.createTextNode(img_name)
    title.appendChild(title_text)
    annotation.appendChild(title)

    source = doc.createElement('source')
    annotation.appendChild(source)

    title = doc.createElement('database')
    title_text = doc.createTextNode('The KITTI Database')
    title.appendChild(title_text)
    source.appendChild(title)

    title = doc.createElement('annotation')
    title_text = doc.createTextNode('KITTI')
    title.appendChild(title_text)
    source.appendChild(title)

    size = doc.createElement('size')
    annotation.appendChild(size)

    title = doc.createElement('width')
    title_text = doc.createTextNode(str(img_size[1]))
    title.appendChild(title_text)
    size.appendChild(title)

    title = doc.createElement('height')
    title_text = doc.createTextNode(str(img_size[0]))
    title.appendChild(title_text)
    size.appendChild(title)

    title = doc.createElement('depth')
    title_text = doc.createTextNode(str(img_size[2]))
    title.appendChild(title_text)
    size.appendChild(title)

    for split_line in split_lines:
        line=split_line.strip().split()
        if line[0] in class_ind:
            object = doc.createElement('object')
            annotation.appendChild(object)

            title = doc.createElement('name')
            title_text = doc.createTextNode(line[0])
            title.appendChild(title_text)
            object.appendChild(title)

            bndbox = doc.createElement('bndbox')
            object.appendChild(bndbox)
            title = doc.createElement('xmin')
            title_text = doc.createTextNode(str(int(float(line[4]))))
            title.appendChild(title_text)
            bndbox.appendChild(title)
            title = doc.createElement('ymin')
            title_text = doc.createTextNode(str(int(float(line[5]))))
            title.appendChild(title_text)
            bndbox.appendChild(title)
            title = doc.createElement('xmax')
            title_text = doc.createTextNode(str(int(float(line[6]))))
            title.appendChild(title_text)
            bndbox.appendChild(title)
            title = doc.createElement('ymax')
            title_text = doc.createTextNode(str(int(float(line[7]))))
            title.appendChild(title_text)
            bndbox.appendChild(title)

    # write the DOM object doc to a file
    f = open('Annotations/'+name+'.xml','w')
    f.write(doc.toprettyxml(indent = ''))
    f.close()

if __name__ == '__main__':
    class_ind=('Pedestrian', 'Car', 'Cyclist')
    cur_dir=os.getcwd()
    labels_dir=os.path.join(cur_dir,'Labels')
    for parent, dirnames, filenames in os.walk(labels_dir): # yields the root dir, subdirectories, and file names
        for file_name in filenames:
            full_path=os.path.join(parent, file_name) # full path of the label file
            f=open(full_path)
            split_lines = f.readlines()
            name= file_name[:-4] # the last 4 characters are the .txt extension; keep only the base name
            img_name=name+'.png' 
            img_path=os.path.join('/home/zy/data/KITTIdevkit/KITTI/train_image',img_name) # adjust this path to your own setup
            img_size=cv2.imread(img_path).shape
            generate_xml(name,split_lines,img_size,class_ind)
print('all txts have been converted into xmls')

Run the program with python txt_to_xml.py; the conversion result looks like this:

# original 000400.txt
Car 0.00 0 -1.67 642.24 178.50 680.14 208.68 1.38 1.49 3.32 2.41 1.66 34.98 -1.60
Car 0.00 0 -1.75 685.77 178.12 767.02 235.21 1.50 1.62 3.89 3.27 1.67 21.18 -1.60
Car 0.67 0 -2.15 885.80 160.44 1241.00 374.00 1.69 1.58 3.95 3.64 1.65 5.47 -1.59
Car 0.00 0 -1.89 755.82 101.65 918.16 230.75 3.55 2.56 7.97 7.06 1.63 23.91 -1.61
Car 0.00 1 -2.73 928.61 177.14 1016.83 209.77 1.48 1.36 3.51 17.33 1.71 34.63 -2.27
# generated 000400.xml (excerpt)
<annotation>
  <folder>KITTI</folder>
  <filename>000400.png</filename> <!-- the KITTI dataset contains the image 000400.png -->
  <source>
    <database>The KITTI Database</database>
    <annotation>KITTI</annotation>
  </source>
  <size>
    <width>1242</width> <!-- image width and height in pixels, 3 RGB channels -->
    <height>375</height>
    <depth>3</depth>
  </size>
  <object>
    <name>Car</name> <!-- a car -->
    <bndbox>
      <xmin>642</xmin>
      <ymin>178</ymin> <!-- top-left and bottom-right corner coordinates -->
      <xmax>680</xmax>
      <ymax>208</ymax>
    </bndbox>
  </object>
  <object>
    <name>Car</name>
    <bndbox>
      <xmin>685</xmin>
      <ymin>178</ymin>
      <xmax>767</xmax>
      <ymax>235</ymax>
    </bndbox>
  </object>
......
</annotation>

      In 000400.png there is a red car whose bounding box has its top-left corner at (642, 178) and its bottom-right corner at (680, 208). The XML file above marks out every car in the image, so when the neural network runs its convolutions we know where to look (putting it rather colloquially, but it helps in understanding the whole process). In the same way, every car in the training set gets annotated. The cars in 000400.png are shown below to give you a direct feel for it. (ZOUYU)

[Figure: 000400.png with its labeled cars]
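
For a quick visual check of the converted annotations, here is a minimal OpenCV sketch that draws the boxes of one frame; the paths are assumptions based on the layout above, so adjust them to your own setup:

# draw_boxes.py -- a minimal sketch, not part of the original toolchain;
# paths below are assumptions, adjust them to your own layout.
import cv2

img = cv2.imread('/home/zy/data/KITTIdevkit/KITTI/JPEGImages/000400.png')
with open('/home/zy/data/KITTIdevkit/KITTI/Labels/000400.txt') as f:
    for line in f:
        fields = line.strip().split(' ')
        # columns 4..7 are left, top, right, bottom (floats in the txt)
        xmin, ymin, xmax, ymax = [int(float(v)) for v in fields[4:8]]
        cv2.rectangle(img, (xmin, ymin), (xmax, ymax), (0, 0, 255), 2)
        cv2.putText(img, fields[0], (xmin, ymin - 4),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 1)
cv2.imwrite('000400_boxes.png', img)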


 

Generating the trainval and test lists

      A Pascal VOC-format dataset for SSD training has three main parts: first, the JPEGImages folder holding all the png images; then the Annotations folder, whose xml files were generated in the previous step; and finally the ImageSets folder, which contains a Main subfolder holding the list files for the trainval and test splits, as shown below:

[Figure 4: the 16 list files under ImageSets/Main]

       The create_train_test_txt.py tool generates the 16 txt files above automatically; the train/test ratio can be changed as you like. Since this tool is written in Python 3, run it with: python3 create_train_test_txt.py


# create_train_test_txt.py
# encoding:utf-8
import pdb
import glob
import os
import random
import math

def get_sample_value(txt_name, category_name):
    label_path = './Labels/'
    txt_path = label_path + txt_name+'.txt'
    try:
        with open(txt_path) as r_tdf:
            if category_name in r_tdf.read():
                return ' 1'
            else:
                return '-1'
    except IOError as ioerr:
        print('File error:'+str(ioerr))

txt_list_path = glob.glob('./Labels/*.txt')
txt_list = []

for item in txt_list_path:
    temp1,temp2 = os.path.splitext(os.path.basename(item))
    txt_list.append(temp1)
txt_list.sort()
print(txt_list, end = '\n\n')

# some blogs suggest train:val:test = 8:1:1; try that first
num_trainval = random.sample(txt_list, math.floor(len(txt_list)*9/10.0)) # adjustable percentage
num_trainval.sort()
print(num_trainval, end = '\n\n')

num_train = random.sample(num_trainval,math.floor(len(num_trainval)*8/9.0)) # adjustable percentage
num_train.sort()
print(num_train, end = '\n\n')

num_val = list(set(num_trainval).difference(set(num_train)))
num_val.sort()
print(num_val, end = '\n\n')

num_test = list(set(txt_list).difference(set(num_trainval)))
num_test.sort()
print(num_test, end = '\n\n')

pdb.set_trace()

Main_path = './ImageSets/Main/'
train_test_name = ['trainval','train','val','test']
category_name = ['Car','Pedestrian','Cyclist']

# loop over trainval, train, val, test
for item_train_test_name in train_test_name:
    list_name = 'num_'
    list_name += item_train_test_name
    train_test_txt_name = Main_path + item_train_test_name + '.txt' 
    try:
        # write a single list file
        with open(train_test_txt_name, 'w') as w_tdf:
            # write line by line
            for item in eval(list_name):
                w_tdf.write(item+'\n')
        # loop over Car, Pedestrian, Cyclist
        for item_category_name in category_name:
            category_txt_name = Main_path + item_category_name + '_' + item_train_test_name + '.txt'
            with open(category_txt_name, 'w') as w_tdf:
                # write line by line
                for item in eval(list_name):
                    w_tdf.write(item+' '+ get_sample_value(item, item_category_name)+'\n')
    except IOError as ioerr:
        print('File error:'+str(ioerr))

While the program runs, if you hit the pdb prompt, press c and then Enter to continue. If you would rather use all the annotated data as trainval and the unannotated data (roughly 7000+ images) as test, the script needs to be modified accordingly; a sketch of that variant follows.
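
A sketch of that variant, reusing the same variable names as the script above (the test_image/ path is an assumption; point it at wherever you unpacked the unannotated test images):

# alternative split (a sketch): all annotated frames -> trainval,
# all unannotated test frames -> test; replaces the random.sample() calls above
import glob, os

txt_list = sorted(os.path.splitext(os.path.basename(p))[0]
                  for p in glob.glob('./Labels/*.txt'))
num_trainval = txt_list  # all 7481 annotated frames
# num_train / num_val can still be sampled from num_trainval as before
num_test = sorted(os.path.splitext(f)[0]
                  for f in os.listdir('./test_image')  # assumed folder of test pngs
                  if f.endswith('.png'))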

Post-processing the dataset

Now for the post-processing of the dataset. Create a KITTI folder under /home/zy/data/KITTIdevkit/ to hold the script tools needed for this training run, as shown below.

[Figure: the script tools in the KITTI folder]

Generating the list files needed for training

    Besides the LMDB data, SSD training also reads three list files: trainval.txt, test.txt, and test_name_size.txt. The first two hold the png and xml paths of the training and test images; the third holds the names and sizes of the test images. The required tool can be adapted from the caffe/data/VOC0712/create_list.sh script, where caffe is your own project path.

Copy that script, rename the copy create_list_kitti.sh, and place it in the KITTI folder. The modified script is as follows:

# create_list_kitti.sh
#!/bin/bash
root_dir=/home/zy/data/KITTIdevkit ## adjust to your setup
sub_dir=ImageSets/Main
bash_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
printf "%s\n" "$bash_dir"
for dataset in trainval test
do
  dst_file=$bash_dir/$dataset.txt
  if [ -f $dst_file ]
  then
    rm -f $dst_file
  fi
  for name in KITTI ## adjust to your setup
  do
    #if [[ $dataset == "test" && $name == "VOC2012" ]] ## this block can stay commented out
    #then
        #continue
    #fi
    echo "Create list for $name $dataset..."
    dataset_file=$root_dir/$name/$sub_dir/$dataset.txt

    img_file=$bash_dir/$dataset"_img.txt"
    cp $dataset_file $img_file
    sed -i "s/^/$name\/JPEGImages\//g" $img_file
    sed -i "s/$/.png/g" $img_file ## changed from jpg to png

    label_file=$bash_dir/$dataset"_label.txt"
    cp $dataset_file $label_file
    sed -i "s/^/$name\/Annotations\//g" $label_file
    sed -i "s/$/.xml/g" $label_file

    paste -d' ' $img_file $label_file >> $dst_file

    rm -f $label_file
    rm -f $img_file
  done

  # Generate image name and size information.
  if [ $dataset == "test" ]
  then
    $bash_dir/../../build/tools/get_image_size $root_dir $dst_file $bash_dir/$dataset"_name_size.txt" ## the get_image_size tool from the caffe build
  fi

  # Shuffle trainval file.
  if [ $dataset == "trainval" ]
  then
    rand_file=$dst_file.random
    cat $dst_file | perl -MList::Util=shuffle -e 'print shuffle();' > $rand_file
    mv $rand_file $dst_file
  fi
done

Run the commands below to generate the 3 list files needed for training in the /home/zy/data/KITTIdevkit/KITTI folder.

$ cd ~/caffe
$ ./data/KITTI/create_list_kitti.sh # adjust the path to your actual files

The generated list files look like this:

# format of trainval.txt and test.txt
KITTI/JPEGImages/000003.png KITTI/Annotations/000003.xml
KITTI/JPEGImages/000136.png KITTI/Annotations/000136.xml
KITTI/JPEGImages/000022.png KITTI/Annotations/000022.xml
KITTI/JPEGImages/000151.png KITTI/Annotations/000151.xml
......
# format of test_name_size.txt (name height width)
000011 375 1242
000012 375 1242
000035 375 1242
000044 375 1242
......
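
If the get_image_size tool from the caffe build is not at hand, a minimal Python stand-in for producing test_name_size.txt could look like this (paths are assumptions from the layout above):

# make_name_size.py -- a minimal sketch replacing the get_image_size tool;
# writes "name height width" per test image, matching the format shown above
import cv2

root = '/home/zy/data/KITTIdevkit/KITTI'  # assumed data root
with open(root + '/ImageSets/Main/test.txt') as f:
    names = [line.strip() for line in f if line.strip()]
with open('test_name_size.txt', 'w') as out:
    for name in names:
        h, w = cv2.imread('%s/JPEGImages/%s.png' % (root, name)).shape[:2]
        out.write('%s %d %d\n' % (name, h, w))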
Preparing the label map file

      Since there are only 3 classes, we can follow the example and write a labelmap_kitti.prototxt file recording the label-to-name mapping, stored in the /home/zy/data/KITTIdevkit/KITTI folder. Its content is as follows:

item {
  name: "none_of_the_above"
  label: 0
  display_name: "background"
}
item {
  name: "Car"
  label: 1
  display_name: "Car"
}
item {
  name: "Pedestrian"
  label: 2
  display_name: "Pedestrian"
}
item {
  name: "Cyclist"
  label: 3
  display_name: "Cyclist"
}

Generating the LMDB database

      If everything so far went smoothly, you can now generate the LMDB files. The required tool can be adapted from the caffe/data/VOC0712/create_data.sh script. As before, copy that script, rename it create_data_kitti.sh, and place it in the KITTI folder. The modified script is as follows:

# create_data_kitti.sh
cur_dir=$(cd $( dirname ${BASH_SOURCE[0]} ) && pwd )
root_dir=$cur_dir

cd $root_dir

redo=1
data_root_dir="/home/zy/data/KITTIdevkit" ## adjust to your setup
dataset_name="KITTI" ## adjust to your setup
mapfile="/home/zy/data/KITTIdevkit/KITTI/labelmap_kitti.prototxt" ## adjust to your setup
anno_type="detection"
db="lmdb"
min_dim=0
max_dim=0
width=0
height=0

extra_cmd="--encode-type=jpg --encoded"
if [ $redo ]
then
  extra_cmd="$extra_cmd --redo"
fi
for subset in test trainval
do
  python $root_dir/scripts/create_annoset.py --anno-type=$anno_type --label-map-file=$mapfile --min-dim=$min_dim --max-dim=$max_dim --resize-width=$width --resize-height=$height --check-label $extra_cmd $data_root_dir $root_dir/$subset.txt $data_root_dir/$dataset_name/$db/$dataset_name"_"$subset"_"$db examples/$dataset_name
done # the scripts/create_annoset.py file used above ships with the SSD Caffe project and is essential.
# I copied it into /home/zy/data/KITTIdevkit and ran it from there.

Run the script with ./data/KITTI/create_data_kitti.sh; it will generate two LMDB databases at the following paths:

$ /home/zy/data/KITTIdevkit/KITTI/lmdb/KITTI_test_lmdb
$ /home/zy/data/KITTIdevkit/KITTI/lmdb/KITTI_trainval_lmdb

[Figure 5: the generated KITTI_test_lmdb]

[Figure 6: the generated KITTI_trainval_lmdb]

At this point, the training data can be considered ready.
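
Before moving on, a quick sanity check on the generated databases can save time later; a minimal sketch, assuming the lmdb Python package is installed:

# check_lmdb.py -- count the entries in the trainval LMDB generated above
import lmdb

env = lmdb.open('/home/zy/data/KITTIdevkit/KITTI/lmdb/KITTI_trainval_lmdb',
                readonly=True)
with env.begin() as txn:
    print('entries:', txn.stat()['entries'])  # should match the trainval list size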

Downloading the pretrained VGG model

        Using SSD for your own detection task means fine-tuning a pretrained network. Readers of the paper will know that the SSD framework there is built on a VGG network as its base. In addition, the author provides two other network variants, ZF-SSD and ResNet-SSD; the corresponding Python training code can be found in the caffe/examples/ssd folder.

For a first training run, let's stick with the VGG network. Download the pretrained model and place it under /home/zy/zouyu/deeping/caffe-ssd/models/VGGNet.

[Figure 7: the pretrained VGG model under models/VGGNet]

Modifying the training code

       Caffe training normally uses train.prototxt and solver.prototxt files, so you may wonder why the SSD project has none. The reason is that the SSD model is large: train.prototxt alone runs to over 1000 lines, editing parameters in it directly would be far too much work, and any change to train.prototxt forces matching changes in test.prototxt and solver.prototxt. The author therefore uses an effective trick: a Python script that generates these files automatically. For the first run I chose the ssd_pascal.py script to train SSD_300x300, so copy ssd_pascal.py, rename the copy ssd_pascal_kitti.py, and modify that Python file.

ssd_pascal.py is itself 500+ lines, so I will not paste all of it; here are the parts you may want to modify, as a reference.

PS: Based on reader feedback, the full modified training script ssd_pascal_kitti.py is provided later in this post for reference.

Customizing paths and common parameters

train_data = "examples/VOC0712/VOC0712_trainval_lmdb" # training data path, before
train_data = "examples/KITTI/KITTI_trainval_lmdb" # after
------
test_data = "examples/VOC0712/VOC0712_test_lmdb" # test data path
test_data = "examples/KITTI/KITTI_test_lmdb"
------
model_name = "VGG_VOC0712_{}".format(job_name) # model name
model_name = "KITTI_{}".format(job_name)
------
save_dir = "models/VGGNet/VOC0712/{}".format(job_name) # model save path
save_dir = "models/VGGNet/KITTI/{}".format(job_name)
------
snapshot_dir = "models/VGGNet/VOC0712/{}".format(job_name) # snapshot save path
snapshot_dir = "models/VGGNet/KITTI/{}".format(job_name)
------
job_dir = "jobs/VGGNet/VOC0712/{}".format(job_name) # job save path
job_dir = "jobs/VGGNet/KITTI/{}".format(job_name)
------
output_result_dir = "{}/data/VOCdevkit/results/VOC2007/{}/Main".format(os.environ['HOME'], job_name) # save path for the test-result txt files
output_result_dir = "{}/data/KITTIdevkit/results/KITTI/{}/Main".format(os.environ['HOME'], job_name)
------
name_size_file = "data/VOC0712/test_name_size.txt" # path of test_name_size.txt
name_size_file = "data/KITTI/test_name_size.txt"
------
label_map_file = "data/VOC0712/labelmap_voc.prototxt" # path of the label map file
label_map_file = "data/KITTI/labelmap_kitti.prototxt"
------
num_classes = 21 # total number of classes
num_classes = 4
------
gpus = "0,1,2,3" # which GPUs to use
gpus = "0" # only 1 GPU
------
batch_size = 32 # images processed per batch
batch_size = <your value> # a TITAN X just manages 32; adjust to your GPU memory
------
num_test_image = 4952 # number of test images
num_test_image = <your value> # must match the entries in test_name_size.txt
------
run_soon = True # start training automatically once the files are generated
run_soon = False # manual mode

Customizing training parameters

      SSD has quite a few training parameters, and I had looked into their meanings before, but for a first experiment it was not obvious how to tune them. I therefore decided to leave the other parameters essentially untouched (to be adjusted over repeated runs based on the results) and change only the initial learning rate. It turned out that for the KITTI dataset the default initial learning rate of 0.001 is too large and keeps the network from converging; it should be lowered to 0.0001, as follows:

# If true, use batch norm for all newly added layers.
# Currently only the non batch norm version has been tested.
use_batchnorm = False
lr_mult = 1
# Use different initial learning rate.
if use_batchnorm:
    base_lr = 0.0004
else:
    # A learning rate for batch_size = 1, num_gpus = 1.
    base_lr = 0.00004
......
if normalization_mode == P.Loss.NONE:
  base_lr /= batch_size_per_device
elif normalization_mode == P.Loss.VALID:
  base_lr *= 25. / loc_weight
elif normalization_mode == P.Loss.FULL:
  # Roughly there are 2000 prior bboxes per image.
  # TODO(weiliu89): Estimate the exact # of priors.
  base_lr *= 2000.
# As the code above shows, the training script does not use batch norm; before any change, the initial learning rate = base_lr * 25 / loc_weight = 0.001
# Reducing base_lr to one tenth of its value, i.e. 0.00004 -> 0.000004, sets the learning rate to 0.0001
PS: A learning rate of 0.0001 is actually on the small side and can be raised a little; in my own experiments 0.0005-0.0007 all worked, while anything above 0.0008 diverged.
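
Working through the numbers in the P.Loss.VALID branch above: with neg_pos_ratio = 3, loc_weight = (3 + 1)/4 = 1.0, so the effective initial rate is base_lr * 25 / loc_weight. A two-line sanity check:

# effective initial learning rate in the P.Loss.VALID branch above
neg_pos_ratio = 3.
loc_weight = (neg_pos_ratio + 1.) / 4.   # = 1.0
print(0.00004 * 25. / loc_weight)        # 0.001  -- the default, too large here
print(0.000004 * 25. / loc_weight)       # 0.0001 -- the adjusted value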

Training the model

After modifying the script parameters, run the script:

$ cd caffe/
$ python examples/ssd/ssd_pascal_kitti.py

# ssd_pascal_kitti.py -- this is the file I ran. ZOUYU
from __future__ import print_function
import caffe
from caffe.model_libs import *
from google.protobuf import text_format

import math
import os
import shutil
import stat
import subprocess
import sys

# Add extra layers on top of a "base" network (e.g. VGGNet or Inception).
def AddExtraLayers(net, use_batchnorm=True, lr_mult=1):
    use_relu = True

    # Add additional convolutional layers.
    # 19 x 19
    from_layer = net.keys()[-1]

    # TODO(weiliu89): Construct the name using the last layer to avoid duplication.
    # 10 x 10
    out_layer = "conv6_1"
    ConvBNLayer(net, from_layer, out_layer, use_batchnorm, use_relu, 256, 1, 0, 1,
        lr_mult=lr_mult)

    from_layer = out_layer
    out_layer = "conv6_2"
    ConvBNLayer(net, from_layer, out_layer, use_batchnorm, use_relu, 512, 3, 1, 2,
        lr_mult=lr_mult)

    # 5 x 5
    from_layer = out_layer
    out_layer = "conv7_1"
    ConvBNLayer(net, from_layer, out_layer, use_batchnorm, use_relu, 128, 1, 0, 1,
      lr_mult=lr_mult)

    from_layer = out_layer
    out_layer = "conv7_2"
    ConvBNLayer(net, from_layer, out_layer, use_batchnorm, use_relu, 256, 3, 1, 2,
      lr_mult=lr_mult)

    # 3 x 3
    from_layer = out_layer
    out_layer = "conv8_1"
    ConvBNLayer(net, from_layer, out_layer, use_batchnorm, use_relu, 128, 1, 0, 1,
      lr_mult=lr_mult)

    from_layer = out_layer
    out_layer = "conv8_2"
    ConvBNLayer(net, from_layer, out_layer, use_batchnorm, use_relu, 256, 3, 0, 1,
      lr_mult=lr_mult)

    # 1 x 1
    from_layer = out_layer
    out_layer = "conv9_1"
    ConvBNLayer(net, from_layer, out_layer, use_batchnorm, use_relu, 128, 1, 0, 1,
      lr_mult=lr_mult)

    from_layer = out_layer
    out_layer = "conv9_2"
    ConvBNLayer(net, from_layer, out_layer, use_batchnorm, use_relu, 256, 3, 0, 1,
      lr_mult=lr_mult)

    return net


### Modify the following parameters accordingly ###
# The directory which contains the caffe code.
# We assume you are running the script at the CAFFE_ROOT.
caffe_root = os.getcwd()

# Set true if you want to start training right after generating all files.
run_soon = True  # set to False if you prefer to launch the job script manually
# Set true if you want to load from most recently saved snapshot.
# Otherwise, we will load from the pretrain_model defined below.
resume_training = True
# If true, Remove old model files.
remove_old_models = False

# The database file for training data. Created by data/VOC0712/create_data.sh
train_data = "examples/KITTI/KITTI_trainval_lmdb"
# The database file for testing data. Created by data/VOC0712/create_data.sh
test_data = "examples/KITTI/KITTI_test_lmdb"
# Specify the batch sampler.
resize_width = 300
resize_height = 300
resize = "{}x{}".format(resize_width, resize_height)
batch_sampler = [
        {
                'sampler': {
                        },
                'max_trials': 1,
                'max_sample': 1,
        },
        {
                'sampler': {
                        'min_scale': 0.3,
                        'max_scale': 1.0,
                        'min_aspect_ratio': 0.5,
                        'max_aspect_ratio': 2.0,
                        },
                'sample_constraint': {
                        'min_jaccard_overlap': 0.1,
                        },
                'max_trials': 50,
                'max_sample': 1,
        },
        {
                'sampler': {
                        'min_scale': 0.3,
                        'max_scale': 1.0,
                        'min_aspect_ratio': 0.5,
                        'max_aspect_ratio': 2.0,
                        },
                'sample_constraint': {
                        'min_jaccard_overlap': 0.3,
                        },
                'max_trials': 50,
                'max_sample': 1,
        },
        {
                'sampler': {
                        'min_scale': 0.3,
                        'max_scale': 1.0,
                        'min_aspect_ratio': 0.5,
                        'max_aspect_ratio': 2.0,
                        },
                'sample_constraint': {
                        'min_jaccard_overlap': 0.5,
                        },
                'max_trials': 50,
                'max_sample': 1,
        },
        {
                'sampler': {
                        'min_scale': 0.3,
                        'max_scale': 1.0,
                        'min_aspect_ratio': 0.5,
                        'max_aspect_ratio': 2.0,
                        },
                'sample_constraint': {
                        'min_jaccard_overlap': 0.7,
                        },
                'max_trials': 50,
                'max_sample': 1,
        },
        {
                'sampler': {
                        'min_scale': 0.3,
                        'max_scale': 1.0,
                        'min_aspect_ratio': 0.5,
                        'max_aspect_ratio': 2.0,
                        },
                'sample_constraint': {
                        'min_jaccard_overlap': 0.9,
                        },
                'max_trials': 50,
                'max_sample': 1,
        },
        {
                'sampler': {
                        'min_scale': 0.3,
                        'max_scale': 1.0,
                        'min_aspect_ratio': 0.5,
                        'max_aspect_ratio': 2.0,
                        },
                'sample_constraint': {
                        'max_jaccard_overlap': 1.0,
                        },
                'max_trials': 50,
                'max_sample': 1,
        },
        ]
train_transform_param = {
        'mirror': True,
        'mean_value': [104, 117, 123],
        'resize_param': {
                'prob': 1,
                'resize_mode': P.Resize.WARP,
                'height': resize_height,
                'width': resize_width,
                'interp_mode': [
                        P.Resize.LINEAR,
                        P.Resize.AREA,
                        P.Resize.NEAREST,
                        P.Resize.CUBIC,
                        P.Resize.LANCZOS4,
                        ],
                },
        'distort_param': {
                'brightness_prob': 0.5,
                'brightness_delta': 32,
                'contrast_prob': 0.5,
                'contrast_lower': 0.5,
                'contrast_upper': 1.5,
                'hue_prob': 0.5,
                'hue_delta': 18,
                'saturation_prob': 0.5,
                'saturation_lower': 0.5,
                'saturation_upper': 1.5,
                'random_order_prob': 0.0,
                },
        'expand_param': {
                'prob': 0.5,
                'max_expand_ratio': 4.0,
                },
        'emit_constraint': {
            'emit_type': caffe_pb2.EmitConstraint.CENTER,
            }
        }
test_transform_param = {
        'mean_value': [104, 117, 123],
        'resize_param': {
                'prob': 1,
                'resize_mode': P.Resize.WARP,
                'height': resize_height,
                'width': resize_width,
                'interp_mode': [P.Resize.LINEAR],
                },
        }

# If true, use batch norm for all newly added layers.
# Currently only the non batch norm version has been tested.
use_batchnorm = False
lr_mult = 1
# Use different initial learning rate.
if use_batchnorm:
    base_lr = 0.0004
else:
    # A learning rate for batch_size = 1, num_gpus = 1.
    base_lr = 0.000004

# Modify the job name if you want.
job_name = "SSD_{}".format(resize)
# The name of the model. Modify it if you want.
model_name = "KITTI_{}".format(job_name)

# Directory which stores the model .prototxt file.
save_dir = "models/VGGNet/KITTI/{}".format(job_name)
# Directory which stores the snapshot of models.
snapshot_dir = "models/VGGNet/KITTI/{}".format(job_name)
# Directory which stores the job script and log file.
job_dir = "jobs/VGGNet/KITTI/{}".format(job_name)
# Directory which stores the detection results.
output_result_dir = "{}/data/KITTIdevkit/results/KITTI/{}/Main".format(os.environ['HOME'], job_name)

# model definition files.
train_net_file = "{}/train.prototxt".format(save_dir)
test_net_file = "{}/test.prototxt".format(save_dir)
deploy_net_file = "{}/deploy.prototxt".format(save_dir)
solver_file = "{}/solver.prototxt".format(save_dir)
# snapshot prefix.
snapshot_prefix = "{}/{}".format(snapshot_dir, model_name)
# job script path.
job_file = "{}/{}.sh".format(job_dir, model_name)

# Stores the test image names and sizes. Created by data/VOC0712/create_list.sh
name_size_file = "data/KITTI/test_name_size.txt"
# The pretrained model. We use the Fully convolutional reduced (atrous) VGGNet.
pretrain_model = "models/VGGNet/VGG_ILSVRC_16_layers_fc_reduced.caffemodel"
# Stores LabelMapItem.
label_map_file = "data/KITTI/labelmap_kitti.prototxt"

# MultiBoxLoss parameters.
num_classes = 4
share_location = True
background_label_id=0
train_on_diff_gt = True
normalization_mode = P.Loss.VALID
code_type = P.PriorBox.CENTER_SIZE
ignore_cross_boundary_bbox = False
mining_type = P.MultiBoxLoss.MAX_NEGATIVE
neg_pos_ratio = 3.
loc_weight = (neg_pos_ratio + 1.) / 4.
multibox_loss_param = {
    'loc_loss_type': P.MultiBoxLoss.SMOOTH_L1,
    'conf_loss_type': P.MultiBoxLoss.SOFTMAX,
    'loc_weight': loc_weight,
    'num_classes': num_classes,
    'share_location': share_location,
    'match_type': P.MultiBoxLoss.PER_PREDICTION,
    'overlap_threshold': 0.5,
    'use_prior_for_matching': True,
    'background_label_id': background_label_id,
    'use_difficult_gt': train_on_diff_gt,
    'mining_type': mining_type,
    'neg_pos_ratio': neg_pos_ratio,
    'neg_overlap': 0.5,
    'code_type': code_type,
    'ignore_cross_boundary_bbox': ignore_cross_boundary_bbox,
    }
loss_param = {
    'normalization': normalization_mode,
    }

# parameters for generating priors.
# minimum dimension of input image
min_dim = 300
# conv4_3 ==> 38 x 38
# fc7 ==> 19 x 19
# conv6_2 ==> 10 x 10
# conv7_2 ==> 5 x 5
# conv8_2 ==> 3 x 3
# conv9_2 ==> 1 x 1
mbox_source_layers = ['conv4_3', 'fc7', 'conv6_2', 'conv7_2', 'conv8_2', 'conv9_2']
# in percent %
min_ratio = 20
max_ratio = 90
step = int(math.floor((max_ratio - min_ratio) / (len(mbox_source_layers) - 2)))
min_sizes = []
max_sizes = []
for ratio in xrange(min_ratio, max_ratio + 1, step):
  min_sizes.append(min_dim * ratio / 100.)
  max_sizes.append(min_dim * (ratio + step) / 100.)
min_sizes = [min_dim * 10 / 100.] + min_sizes
max_sizes = [min_dim * 20 / 100.] + max_sizes
steps = [8, 16, 32, 64, 100, 300]
aspect_ratios = [[2], [2, 3], [2, 3], [2, 3], [2], [2]]
# L2 normalize conv4_3.
normalizations = [20, -1, -1, -1, -1, -1]
# variance used to encode/decode prior bboxes.
if code_type == P.PriorBox.CENTER_SIZE:
  prior_variance = [0.1, 0.1, 0.2, 0.2]
else:
  prior_variance = [0.1]
flip = True
clip = False

# Solver parameters.
# Defining which GPUs to use.
gpus = "0"
gpulist = gpus.split(",")
num_gpus = len(gpulist)

# Divide the mini-batch to different GPUs. (zouyu: changed from 32 to 16)
batch_size = 16  # I changed this from 32 to 16; my GPU memory is not quite enough and 32 runs out of memory. ZOUYU
accum_batch_size = 16
iter_size = accum_batch_size / batch_size
solver_mode = P.Solver.CPU
device_id = 0
batch_size_per_device = batch_size
if num_gpus > 0:
  batch_size_per_device = int(math.ceil(float(batch_size) / num_gpus))
  iter_size = int(math.ceil(float(accum_batch_size) / (batch_size_per_device * num_gpus)))
  solver_mode = P.Solver.GPU
  device_id = int(gpulist[0])

if normalization_mode == P.Loss.NONE:
  base_lr /= batch_size_per_device
elif normalization_mode == P.Loss.VALID:
  base_lr *= 25. / loc_weight
elif normalization_mode == P.Loss.FULL:
  # Roughly there are 2000 prior bboxes per image.
  # TODO(weiliu89): Estimate the exact # of priors.
  base_lr *= 2000.

# Evaluate on whole test set.
num_test_image = 749
test_batch_size = 8
# Ideally test_batch_size should be divisible by num_test_image,
# otherwise mAP will be slightly off the true value.
test_iter = int(math.ceil(float(num_test_image) / test_batch_size))

solver_param = {
    # Train parameters
    'base_lr': base_lr,
    'weight_decay': 0.0005,
    'lr_policy': "multistep",
    'stepvalue': [80000, 100000, 120000],
    'gamma': 0.1,
    'momentum': 0.9,
    'iter_size': iter_size,
    'max_iter': 120000,
    'snapshot': 80000,
    'display': 10,
    'average_loss': 10,
    'type': "SGD",
    'solver_mode': solver_mode,
    'device_id': device_id,
    'debug_info': False,
    'snapshot_after_train': True,
    # Test parameters
    'test_iter': [test_iter],
    'test_interval': 10000,
    'eval_type': "detection",
    'ap_version': "11point",
    'test_initialization': False,
    }

# parameters for generating detection output.
det_out_param = {
    'num_classes': num_classes,
    'share_location': share_location,
    'background_label_id': background_label_id,
    'nms_param': {'nms_threshold': 0.45, 'top_k': 400},
    'save_output_param': {
        'output_directory': output_result_dir,
        'output_name_prefix': "comp4_det_test_",
        'output_format': "VOC",
        'label_map_file': label_map_file,
        'name_size_file': name_size_file,
        'num_test_image': num_test_image,
        },
    'keep_top_k': 200,
    'confidence_threshold': 0.01,
    'code_type': code_type,
    }

# parameters for evaluating detection results.
det_eval_param = {
    'num_classes': num_classes,
    'background_label_id': background_label_id,
    'overlap_threshold': 0.5,
    'evaluate_difficult_gt': False,
    'name_size_file': name_size_file,
    }

### Hopefully you don't need to change the following ###
# Check file.
check_if_exist(train_data)
check_if_exist(test_data)
check_if_exist(label_map_file)
check_if_exist(pretrain_model)
make_if_not_exist(save_dir)
make_if_not_exist(job_dir)
make_if_not_exist(snapshot_dir)

# Create train net.
net = caffe.NetSpec()
net.data, net.label = CreateAnnotatedDataLayer(train_data, batch_size=batch_size_per_device,
        train=True, output_label=True, label_map_file=label_map_file,
        transform_param=train_transform_param, batch_sampler=batch_sampler)

VGGNetBody(net, from_layer='data', fully_conv=True, reduced=True, dilated=True,
    dropout=False)

AddExtraLayers(net, use_batchnorm, lr_mult=lr_mult)

mbox_layers = CreateMultiBoxHead(net, data_layer='data', from_layers=mbox_source_layers,
        use_batchnorm=use_batchnorm, min_sizes=min_sizes, max_sizes=max_sizes,
        aspect_ratios=aspect_ratios, steps=steps, normalizations=normalizations,
        num_classes=num_classes, share_location=share_location, flip=flip, clip=clip,
        prior_variance=prior_variance, kernel_size=3, pad=1, lr_mult=lr_mult)

# Create the MultiBoxLossLayer.
name = "mbox_loss"
mbox_layers.append(net.label)
net[name] = L.MultiBoxLoss(*mbox_layers, multibox_loss_param=multibox_loss_param,
        loss_param=loss_param, include=dict(phase=caffe_pb2.Phase.Value('TRAIN')),
        propagate_down=[True, True, False, False])

with open(train_net_file, 'w') as f:
    print('name: "{}_train"'.format(model_name), file=f)
    print(net.to_proto(), file=f)
shutil.copy(train_net_file, job_dir)

# Create test net.
net = caffe.NetSpec()
net.data, net.label = CreateAnnotatedDataLayer(test_data, batch_size=test_batch_size,
        train=False, output_label=True, label_map_file=label_map_file,
        transform_param=test_transform_param)

VGGNetBody(net, from_layer='data', fully_conv=True, reduced=True, dilated=True,
    dropout=False)

AddExtraLayers(net, use_batchnorm, lr_mult=lr_mult)

mbox_layers = CreateMultiBoxHead(net, data_layer='data', from_layers=mbox_source_layers,
        use_batchnorm=use_batchnorm, min_sizes=min_sizes, max_sizes=max_sizes,
        aspect_ratios=aspect_ratios, steps=steps, normalizations=normalizations,
        num_classes=num_classes, share_location=share_location, flip=flip, clip=clip,
        prior_variance=prior_variance, kernel_size=3, pad=1, lr_mult=lr_mult)

conf_name = "mbox_conf"
if multibox_loss_param["conf_loss_type"] == P.MultiBoxLoss.SOFTMAX:
  reshape_name = "{}_reshape".format(conf_name)
  net[reshape_name] = L.Reshape(net[conf_name], shape=dict(dim=[0, -1, num_classes]))
  softmax_name = "{}_softmax".format(conf_name)
  net[softmax_name] = L.Softmax(net[reshape_name], axis=2)
  flatten_name = "{}_flatten".format(conf_name)
  net[flatten_name] = L.Flatten(net[softmax_name], axis=1)
  mbox_layers[1] = net[flatten_name]
elif multibox_loss_param["conf_loss_type"] == P.MultiBoxLoss.LOGISTIC:
  sigmoid_name = "{}_sigmoid".format(conf_name)
  net[sigmoid_name] = L.Sigmoid(net[conf_name])
  mbox_layers[1] = net[sigmoid_name]

net.detection_out = L.DetectionOutput(*mbox_layers,
    detection_output_param=det_out_param,
    include=dict(phase=caffe_pb2.Phase.Value('TEST')))
net.detection_eval = L.DetectionEvaluate(net.detection_out, net.label,
    detection_evaluate_param=det_eval_param,
    include=dict(phase=caffe_pb2.Phase.Value('TEST')))

with open(test_net_file, 'w') as f:
    print('name: "{}_test"'.format(model_name), file=f)
    print(net.to_proto(), file=f)
shutil.copy(test_net_file, job_dir)

# Create deploy net.
# Remove the first and last layer from test net.
deploy_net = net
with open(deploy_net_file, 'w') as f:
    net_param = deploy_net.to_proto()
    # Remove the first (AnnotatedData) and last (DetectionEvaluate) layer from test net.
    del net_param.layer[0]
    del net_param.layer[-1]
    net_param.name = '{}_deploy'.format(model_name)
    net_param.input.extend(['data'])
    net_param.input_shape.extend([
        caffe_pb2.BlobShape(dim=[1, 3, resize_height, resize_width])])
    print(net_param, file=f)
shutil.copy(deploy_net_file, job_dir)

# Create solver.
solver = caffe_pb2.SolverParameter(
        train_net=train_net_file,
        test_net=[test_net_file],
        snapshot_prefix=snapshot_prefix,
        **solver_param)

with open(solver_file, 'w') as f:
    print(solver, file=f)
shutil.copy(solver_file, job_dir)

max_iter = 0
# Find most recent snapshot.
for file in os.listdir(snapshot_dir):
  if file.endswith(".solverstate"):
    basename = os.path.splitext(file)[0]
    iter = int(basename.split("{}_iter_".format(model_name))[1])
    if iter > max_iter:
      max_iter = iter

train_src_param = '--weights="{}" \\\n'.format(pretrain_model)
if resume_training:
  if max_iter > 0:
    train_src_param = '--snapshot="{}_iter_{}.solverstate" \\\n'.format(snapshot_prefix, max_iter)

if remove_old_models:
  # Remove any snapshots smaller than max_iter.
  for file in os.listdir(snapshot_dir):
    if file.endswith(".solverstate"):
      basename = os.path.splitext(file)[0]
      iter = int(basename.split("{}_iter_".format(model_name))[1])
      if max_iter > iter:
        os.remove("{}/{}".format(snapshot_dir, file))
    if file.endswith(".caffemodel"):
      basename = os.path.splitext(file)[0]
      iter = int(basename.split("{}_iter_".format(model_name))[1])
      if max_iter > iter:
        os.remove("{}/{}".format(snapshot_dir, file))

# Create job file.
with open(job_file, 'w') as f:
  f.write('cd {}\n'.format(caffe_root))
  f.write('./build/tools/caffe train \\\n')
  f.write('--solver="{}" \\\n'.format(solver_file))
  f.write(train_src_param)
  if solver_param['solver_mode'] == P.Solver.GPU:
    f.write('--gpu {} 2>&1 | tee {}/{}.log\n'.format(gpus, job_dir, model_name))
  else:
    f.write('2>&1 | tee {}/{}.log\n'.format(job_dir, model_name))

# Copy the python script to job_dir.
py_file = os.path.abspath(__file__)
shutil.copy(py_file, job_dir)

# Run the job.
os.chmod(job_file, stat.S_IRWXU)
if run_soon:
  subprocess.call(job_file, shell=True)


[Figure 8: training in progress]

      The model then starts training. If you hit an "out of memory" error, batch_size is set too high for your GPU memory; lower batch_size and rerun to fix it. For reference, an excerpt of the generated train.prototxt follows (the listing begins partway through the conv5_1 layer and is truncated at the end):

param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
    dilation: 1
  }
}
layer {
  name: "relu5_1"
  type: "ReLU"
  bottom: "conv5_1"
  top: "conv5_1"
}
layer {
  name: "conv5_2"
  type: "Convolution"
  bottom: "conv5_1"
  top: "conv5_2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
    dilation: 1
  }
}
layer {
  name: "relu5_2"
  type: "ReLU"
  bottom: "conv5_2"
  top: "conv5_2"
}
layer {
  name: "conv5_3"
  type: "Convolution"
  bottom: "conv5_2"
  top: "conv5_3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
    dilation: 1
  }
}
layer {
  name: "relu5_3"
  type: "ReLU"
  bottom: "conv5_3"
  top: "conv5_3"
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5_3"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 1
    pad: 1
  }
}
layer {
  name: "fc6"
  type: "Convolution"
  bottom: "pool5"
  top: "fc6"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 1024
    pad: 6
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
    dilation: 6
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "fc7"
  type: "Convolution"
  bottom: "fc6"
  top: "fc7"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 1024
    kernel_size: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "conv6_1"
  type: "Convolution"
  bottom: "fc7"
  top: "conv6_1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    pad: 0
    kernel_size: 1
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "conv6_1_relu"
  type: "ReLU"
  bottom: "conv6_1"
  top: "conv6_1"
}
layer {
  name: "conv6_2"
  type: "Convolution"
  bottom: "conv6_1"
  top: "conv6_2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
    stride: 2
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "conv6_2_relu"
  type: "ReLU"
  bottom: "conv6_2"
  top: "conv6_2"
}
layer {
  name: "conv7_1"
  type: "Convolution"
  bottom: "conv6_2"
  top: "conv7_1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 128
    pad: 0
    kernel_size: 1
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "conv7_1_relu"
  type: "ReLU"
  bottom: "conv7_1"
  top: "conv7_1"
}
layer {
  name: "conv7_2"
  type: "Convolution"
  bottom: "conv7_1"
  top: "conv7_2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    stride: 2
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "conv7_2_relu"
  type: "ReLU"
  bottom: "conv7_2"
  top: "conv7_2"
}
layer {
  name: "conv8_1"
  type: "Convolution"
  bottom: "conv7_2"
  top: "conv8_1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 128
    pad: 0
    kernel_size: 1
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "conv8_1_relu"
  type: "ReLU"
  bottom: "conv8_1"
  top: "conv8_1"
}
layer {
  name: "conv8_2"
  type: "Convolution"
  bottom: "conv8_1"
  top: "conv8_2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    pad: 0
    kernel_size: 3
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "conv8_2_relu"
  type: "ReLU"
  bottom: "conv8_2"
  top: "conv8_2"
}
layer {
  name: "conv9_1"
  type: "Convolution"
  bottom: "conv8_2"
  top: "conv9_1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 128
    pad: 0
    kernel_size: 1
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "conv9_1_relu"
  type: "ReLU"
  bottom: "conv9_1"
  top: "conv9_1"
}
layer {
  name: "conv9_2"
  type: "Convolution"
  bottom: "conv9_1"
  top: "conv9_2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    pad: 0
    kernel_size: 3
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "conv9_2_relu"
  type: "ReLU"
  bottom: "conv9_2"
  top: "conv9_2"
}
layer {
  name: "conv4_3_norm"
  type: "Normalize"
  bottom: "conv4_3"
  top: "conv4_3_norm"
  norm_param {
    across_spatial: false
    scale_filler {
      type: "constant"
      value: 20
    }
    channel_shared: false
  }
}
layer {
  name: "conv4_3_norm_mbox_loc"
  type: "Convolution"
  bottom: "conv4_3_norm"
  top: "conv4_3_norm_mbox_loc"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 16
    pad: 1
    kernel_size: 3
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "conv4_3_norm_mbox_loc_perm"
  type: "Permute"
  bottom: "conv4_3_norm_mbox_loc"
  top: "conv4_3_norm_mbox_loc_perm"
  permute_param {
    order: 0
    order: 2
    order: 3
    order: 1
  }
}
layer {
  name: "conv4_3_norm_mbox_loc_flat"
  type: "Flatten"
  bottom: "conv4_3_norm_mbox_loc_perm"
  top: "conv4_3_norm_mbox_loc_flat"
  flatten_param {
    axis: 1
  }
}
layer {
  name: "conv4_3_norm_mbox_conf"
  type: "Convolution"
  bottom: "conv4_3_norm"
  top: "conv4_3_norm_mbox_conf"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 16
    pad: 1
    kernel_size: 3
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "conv4_3_norm_mbox_conf_perm"
  type: "Permute"
  bottom: "conv4_3_norm_mbox_conf"
  top: "conv4_3_norm_mbox_conf_perm"
  permute_param {
    order: 0
    order: 2
    order: 3
    order: 1
  }
}
layer {
  name: "conv4_3_norm_mbox_conf_flat"
  type: "Flatten"
  bottom: "conv4_3_norm_mbox_conf_perm"
  top: "conv4_3_norm_mbox_conf_flat"
  flatten_param {
    axis: 1
  }
}
layer {
  name: "conv4_3_norm_mbox_priorbox"
  type: "PriorBox"
  bottom: "conv4_3_norm"
  bottom: "data"
  top: "conv4_3_norm_mbox_priorbox"
  prior_box_param {
    min_size: 30
    max_size: 60
    aspect_ratio: 2
    flip: true
    clip: false
    variance: 0.1
    variance: 0.1
    variance: 0.2
    variance: 0.2
    step: 8
    offset: 0.5
  }
}
layer {
  name: "fc7_mbox_loc"
  type: "Convolution"
  bottom: "fc7"
  top: "fc7_mbox_loc"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 24
    pad: 1
    kernel_size: 3
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "fc7_mbox_loc_perm"
  type: "Permute"
  bottom: "fc7_mbox_loc"
  top: "fc7_mbox_loc_perm"
  permute_param {
    order: 0
    order: 2
    order: 3
    order: 1
  }
}
layer {
  name: "fc7_mbox_loc_flat"
  type: "Flatten"
  bottom: "fc7_mbox_loc_perm"
  top: "fc7_mbox_loc_flat"
  flatten_param {
    axis: 1
  }
}
layer {
  name: "fc7_mbox_conf"
  type: "Convolution"
  bottom: "fc7"
  top: "fc7_mbox_conf"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 24
    pad: 1
    kernel_size: 3
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "fc7_mbox_conf_perm"
  type: "Permute"
  bottom: "fc7_mbox_conf"
  top: "fc7_mbox_conf_perm"
  permute_param {
    order: 0
    order: 2
    order: 3
    order: 1
  }
}
layer {
  name: "fc7_mbox_conf_flat"
  type: "Flatten"
  bottom: "fc7_mbox_conf_perm"
  top: "fc7_mbox_conf_flat"
  flatten_param {
    axis: 1
  }
}
layer {
  name: "fc7_mbox_priorbox"
  type: "PriorBox"
  bottom: "fc7"
  bottom: "data"
  top: "fc7_mbox_priorbox"
  prior_box_param {
    min_size: 60
    max_size: 111
    aspect_ratio: 2
    aspect_ratio: 3
    flip: true
    clip: false
    variance: 0.1
    variance: 0.1
    variance: 0.2
    variance: 0.2
    step: 16
    offset: 0.5
  }
}
layer {
  name: "conv6_2_mbox_loc"
  type: "Convolution"
  bottom: "conv6_2"
  top: "conv6_2_mbox_loc"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 24
    pad: 1
    kernel_size: 3
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "conv6_2_mbox_loc_perm"
  type: "Permute"
  bottom: "conv6_2_mbox_loc"
  top: "conv6_2_mbox_loc_perm"
  permute_param {
    order: 0
    order: 2
    order: 3
    order: 1
  }
}
layer {
  name: "conv6_2_mbox_loc_flat"
  type: "Flatten"
  bottom: "conv6_2_mbox_loc_perm"
  top: "conv6_2_mbox_loc_flat"
  flatten_param {
    axis: 1
  }
}
layer {
  name: "conv6_2_mbox_conf"
  type: "Convolution"
  bottom: "conv6_2"
  top: "conv6_2_mbox_conf"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 24
    pad: 1
    kernel_size: 3
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "conv6_2_mbox_conf_perm"
  type: "Permute"
  bottom: "conv6_2_mbox_conf"
  top: "conv6_2_mbox_conf_perm"
  permute_param {
    order: 0
    order: 2
    order: 3
    order: 1
  }
}
layer {
  name: "conv6_2_mbox_conf_flat"
  type: "Flatten"
  bottom: "conv6_2_mbox_conf_perm"
  top: "conv6_2_mbox_conf_flat"
  flatten_param {
    axis: 1
  }
}
layer {
  name: "conv6_2_mbox_priorbox"
  type: "PriorBox"
  bottom: "conv6_2"
  bottom: "data"
  top: "conv6_2_mbox_priorbox"
  prior_box_param {
    min_size: 111
    max_size: 162
    aspect_ratio: 2
    aspect_ratio: 3
    flip: true
    clip: false
    variance: 0.1
    variance: 0.1
    variance: 0.2
    variance: 0.2
    step: 32
    offset: 0.5
  }
}
layer {
  name: "conv7_2_mbox_loc"
  type: "Convolution"
  bottom: "conv7_2"
  top: "conv7_2_mbox_loc"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 24
    pad: 1
    kernel_size: 3
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "conv7_2_mbox_loc_perm"
  type: "Permute"
  bottom: "conv7_2_mbox_loc"
  top: "conv7_2_mbox_loc_perm"
  permute_param {
    order: 0
    order: 2
    order: 3
    order: 1
  }
}
layer {
  name: "conv7_2_mbox_loc_flat"
  type: "Flatten"
  bottom: "conv7_2_mbox_loc_perm"
  top: "conv7_2_mbox_loc_flat"
  flatten_param {
    axis: 1
  }
}
layer {
  name: "conv7_2_mbox_conf"
  type: "Convolution"
  bottom: "conv7_2"
  top: "conv7_2_mbox_conf"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 24
    pad: 1
    kernel_size: 3
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "conv7_2_mbox_conf_perm"
  type: "Permute"
  bottom: "conv7_2_mbox_conf"
  top: "conv7_2_mbox_conf_perm"
  permute_param {
    order: 0
    order: 2
    order: 3
    order: 1
  }
}
layer {
  name: "conv7_2_mbox_conf_flat"
  type: "Flatten"
  bottom: "conv7_2_mbox_conf_perm"
  top: "conv7_2_mbox_conf_flat"
  flatten_param {
    axis: 1
  }
}
layer {
  name: "conv7_2_mbox_priorbox"
  type: "PriorBox"
  bottom: "conv7_2"
  bottom: "data"
  top: "conv7_2_mbox_priorbox"
  prior_box_param {
    min_size: 162
    max_size: 213
    aspect_ratio: 2
    aspect_ratio: 3
    flip: true
    clip: false
    variance: 0.1
    variance: 0.1
    variance: 0.2
    variance: 0.2
    step: 64
    offset: 0.5
  }
}
layer {
  name: "conv8_2_mbox_loc"
  type: "Convolution"
  bottom: "conv8_2"
  top: "conv8_2_mbox_loc"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 16
    pad: 1
    kernel_size: 3
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "conv8_2_mbox_loc_perm"
  type: "Permute"
  bottom: "conv8_2_mbox_loc"
  top: "conv8_2_mbox_loc_perm"
  permute_param {
    order: 0
    order: 2
    order: 3
    order: 1
  }
}
layer {
  name: "conv8_2_mbox_loc_flat"
  type: "Flatten"
  bottom: "conv8_2_mbox_loc_perm"
  top: "conv8_2_mbox_loc_flat"
  flatten_param {
    axis: 1
  }
}
layer {
  name: "conv8_2_mbox_conf"
  type: "Convolution"
  bottom: "conv8_2"
  top: "conv8_2_mbox_conf"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 16
    pad: 1
    kernel_size: 3
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "conv8_2_mbox_conf_perm"
  type: "Permute"
  bottom: "conv8_2_mbox_conf"
  top: "conv8_2_mbox_conf_perm"
  permute_param {
    order: 0
    order: 2
    order: 3
    order: 1
  }
}
layer {
  name: "conv8_2_mbox_conf_flat"
  type: "Flatten"
  bottom: "conv8_2_mbox_conf_perm"
  top: "conv8_2_mbox_conf_flat"
  flatten_param {
    axis: 1
  }
}
layer {
  name: "conv8_2_mbox_priorbox"
  type: "PriorBox"
  bottom: "conv8_2"
  bottom: "data"
  top: "conv8_2_mbox_priorbox"
  prior_box_param {
    min_size: 213
    max_size: 264
    aspect_ratio: 2
    flip: true
    clip: false
    variance: 0.1
    variance: 0.1
    variance: 0.2
    variance: 0.2
    step: 100
    offset: 0.5
  }
}
layer {
  name: "conv9_2_mbox_loc"
  type: "Convolution"
  bottom: "conv9_2"
  top: "conv9_2_mbox_loc"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 16
    pad: 1
    kernel_size: 3
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "conv9_2_mbox_loc_perm"
  type: "Permute"
  bottom: "conv9_2_mbox_loc"
  top: "conv9_2_mbox_loc_perm"
  permute_param {
    order: 0
    order: 2
    order: 3
    order: 1
  }
}
layer {
  name: "conv9_2_mbox_loc_flat"
  type: "Flatten"
  bottom: "conv9_2_mbox_loc_perm"
  top: "conv9_2_mbox_loc_flat"
  flatten_param {
    axis: 1
  }
}
layer {
  name: "conv9_2_mbox_conf"
  type: "Convolution"
  bottom: "conv9_2"
  top: "conv9_2_mbox_conf"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 16
    pad: 1
    kernel_size: 3
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "conv9_2_mbox_conf_perm"
  type: "Permute"
  bottom: "conv9_2_mbox_conf"
  top: "conv9_2_mbox_conf_perm"
  permute_param {
    order: 0
    order: 2
    order: 3
    order: 1
  }
}
layer {
  name: "conv9_2_mbox_conf_flat"
  type: "Flatten"
  bottom: "conv9_2_mbox_conf_perm"
  top: "conv9_2_mbox_conf_flat"
  flatten_param {
    axis: 1
  }
}
layer {
  name: "conv9_2_mbox_priorbox"
  type: "PriorBox"
  bottom: "conv9_2"
  bottom: "data"
  top: "conv9_2_mbox_priorbox"
  prior_box_param {
    min_size: 264
    max_size: 315
    aspect_ratio: 2
    flip: true
    clip: false
    variance: 0.1
    variance: 0.1
    variance: 0.2
    variance: 0.2
    step: 300
    offset: 0.5
  }
}
layer {
  name: "mbox_loc"
  type: "Concat"
  bottom: "conv4_3_norm_mbox_loc_flat"
  bottom: "fc7_mbox_loc_flat"
  bottom: "conv6_2_mbox_loc_flat"
  bottom: "conv7_2_mbox_loc_flat"
  bottom: "conv8_2_mbox_loc_flat"
  bottom: "conv9_2_mbox_loc_flat"
  top: "mbox_loc"
  concat_param {
    axis: 1
  }
}
layer {
  name: "mbox_conf"
  type: "Concat"
  bottom: "conv4_3_norm_mbox_conf_flat"
  bottom: "fc7_mbox_conf_flat"
  bottom: "conv6_2_mbox_conf_flat"
  bottom: "conv7_2_mbox_conf_flat"
  bottom: "conv8_2_mbox_conf_flat"
  bottom: "conv9_2_mbox_conf_flat"
  top: "mbox_conf"
  concat_param {
    axis: 1
  }
}
layer {
  name: "mbox_priorbox"
  type: "Concat"
  bottom: "conv4_3_norm_mbox_priorbox"
  bottom: "fc7_mbox_priorbox"
  bottom: "conv6_2_mbox_priorbox"
  bottom: "conv7_2_mbox_priorbox"
  bottom: "conv8_2_mbox_priorbox"
  bottom: "conv9_2_mbox_priorbox"
  top: "mbox_priorbox"
  concat_param {
    axis: 2
  }
}
layer {
  name: "mbox_loss"
  type: "MultiBoxLoss"
  bottom: "mbox_loc"
  bottom: "mbox_conf"
  bottom: "mbox_priorbox"
  bottom: "label"
  top: "mbox_loss"
  include {
    phase: TRAIN
  }
  propagate_down: true
  propagate_down: true
  propagate_down: false
  propagate_down: false
  loss_param {
    normalization: VALID
  }
  multibox_loss_param {
    loc_loss_type: SMOOTH_L1
    conf_loss_type: SOFTMAX
    loc_weight: 1
    num_classes: 4
    share_location: true
    match_type: PER_PREDICTION
    overlap_threshold: 0.5
    use_prior_for_matching: true
    background_label_id: 0
    use_difficult_gt: true
    neg_pos_ratio: 3
    neg_overlap: 0.5
    code_type: CENTER_SIZE
    ignore_cross_boundary_bbox: false
    mining_type: MAX_NEGATIVE
  }
}
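
That is the end of the training prototxt. One consistency check worth doing whenever you adapt SSD to a different number of classes: every *_mbox_loc layer must have num_output = num_priors × 4, and every *_mbox_conf layer num_output = num_priors × num_classes, where num_priors per feature-map cell is 2 boxes for aspect ratio 1 (one at min_size, one at sqrt(min_size·max_size)) plus 2 for each extra aspect_ratio when flip: true. The short sketch below, with the prior counts copied from the PriorBox layers above, reproduces the 16/24 values for num_classes = 4:

# Sanity-check of the mbox head widths in the prototxt above.
num_classes = 4  # background + Car + Pedestrian + Cyclist

# layer -> number of aspect ratios listed besides 1 (all use flip: true)
heads = [('conv4_3_norm', 1), ('fc7', 2), ('conv6_2', 2),
         ('conv7_2', 2), ('conv8_2', 1), ('conv9_2', 1)]

for layer, extra_ars in heads:
    num_priors = 2 + 2 * extra_ars   # 4 or 6 default boxes per cell
    print('%-12s priors=%d  loc num_output=%-2d  conf num_output=%d'
          % (layer, num_priors, num_priors * 4, num_priors * num_classes))

Because num_classes happens to be 4 here, the loc and conf widths coincide (16 and 24); with a different class count they would differ, and forgetting to update the conf layers is a classic source of shape errors. What follows is the Caffe startup log from the beginning of training.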
I0422 11:53:44.443205 19129 layer_factory.hpp:77] Creating layer data
I0422 11:53:44.443338 19129 net.cpp:100] Creating Layer data
I0422 11:53:44.443349 19129 net.cpp:408] data -> data
I0422 11:53:44.443374 19129 net.cpp:408] data -> label
I0422 11:53:44.444303 19138 db_lmdb.cpp:35] Opened lmdb examples/KITTI/KITTI_trainval_lmdb
I0422 11:53:44.460831 19129 annotated_data_layer.cpp:62] output data size: 16,3,300,300
I0422 11:53:44.484822 19129 net.cpp:150] Setting up data
I0422 11:53:44.484858 19129 net.cpp:157] Top shape: 16 3 300 300 (4320000)
I0422 11:53:44.484863 19129 net.cpp:157] Top shape: 1 1 1 8 (8)
I0422 11:53:44.484865 19129 net.cpp:165] Memory required for data: 17280032
I0422 11:53:44.484910 19129 layer_factory.hpp:77] Creating layer data_data_0_split
I0422 11:53:44.484937 19129 net.cpp:100] Creating Layer data_data_0_split
I0422 11:53:44.484944 19129 net.cpp:434] data_data_0_split <- data
I0422 11:53:44.484958 19129 net.cpp:408] data_data_0_split -> data_data_0_split_0
I0422 11:53:44.484968 19129 net.cpp:408] data_data_0_split -> data_data_0_split_1
I0422 11:53:44.484975 19129 net.cpp:408] data_data_0_split -> data_data_0_split_2
I0422 11:53:44.485005 19129 net.cpp:408] data_data_0_split -> data_data_0_split_3
I0422 11:53:44.485013 19129 net.cpp:408] data_data_0_split -> data_data_0_split_4
I0422 11:53:44.485018 19129 net.cpp:408] data_data_0_split -> data_data_0_split_5
I0422 11:53:44.485023 19129 net.cpp:408] data_data_0_split -> data_data_0_split_6
I0422 11:53:44.485126 19129 net.cpp:150] Setting up data_data_0_split
I0422 11:53:44.485147 19129 net.cpp:157] Top shape: 16 3 300 300 (4320000)
I0422 11:53:44.485150 19129 net.cpp:157] Top shape: 16 3 300 300 (4320000)
I0422 11:53:44.485153 19129 net.cpp:157] Top shape: 16 3 300 300 (4320000)
I0422 11:53:44.485174 19129 net.cpp:157] Top shape: 16 3 300 300 (4320000)
I0422 11:53:44.485177 19129 net.cpp:157] Top shape: 16 3 300 300 (4320000)
I0422 11:53:44.485180 19129 net.cpp:157] Top shape: 16 3 300 300 (4320000)
I0422 11:53:44.485183 19129 net.cpp:157] Top shape: 16 3 300 300 (4320000)
I0422 11:53:44.485185 19129 net.cpp:165] Memory required for data: 138240032
I0422 11:53:44.485189 19129 layer_factory.hpp:77] Creating layer conv1_1
I0422 11:53:44.485205 19129 net.cpp:100] Creating Layer conv1_1
I0422 11:53:44.485210 19129 net.cpp:434] conv1_1 <- data_data_0_split_0
I0422 11:53:44.485218 19129 net.cpp:408] conv1_1 -> conv1_1
I0422 11:53:45.007372 19129 net.cpp:150] Setting up conv1_1
I0422 11:53:45.007395 19129 net.cpp:157] Top shape: 16 64 300 300 (92160000)
I0422 11:53:45.007398 19129 net.cpp:165] Memory required for data: 506880032
I0422 11:53:45.007426 19129 layer_factory.hpp:77] Creating layer relu1_1
I0422 11:53:45.007442 19129 net.cpp:100] Creating Layer relu1_1
I0422 11:53:45.007447 19129 net.cpp:434] relu1_1 <- conv1_1
I0422 11:53:45.007454 19129 net.cpp:395] relu1_1 -> conv1_1 (in-place)
I0422 11:53:45.007680 19129 net.cpp:150] Setting up relu1_1
I0422 11:53:45.007688 19129 net.cpp:157] Top shape: 16 64 300 300 (92160000)
I0422 11:53:45.007709 19129 net.cpp:165] Memory required for data: 875520032
I0422 11:53:45.007711 19129 layer_factory.hpp:77] Creating layer conv1_2
I0422 11:53:45.007726 19129 net.cpp:100] Creating Layer conv1_2
I0422 11:53:45.007731 19129 net.cpp:434] conv1_2 <- conv1_1
I0422 11:53:45.007737 19129 net.cpp:408] conv1_2 -> conv1_2
I0422 11:53:45.009729 19129 net.cpp:150] Setting up conv1_2
I0422 11:53:45.009742 19129 net.cpp:157] Top shape: 16 64 300 300 (92160000)
I0422 11:53:45.009763 19129 net.cpp:165] Memory required for data: 1244160032
I0422 11:53:45.009773 19129 layer_factory.hpp:77] Creating layer relu1_2
I0422 11:53:45.009781 19129 net.cpp:100] Creating Layer relu1_2
I0422 11:53:45.009789 19129 net.cpp:434] relu1_2 <- conv1_2
I0422 11:53:45.009795 19129 net.cpp:395] relu1_2 -> conv1_2 (in-place)
I0422 11:53:45.009982 19129 net.cpp:150] Setting up relu1_2
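
(The setup log continues in the same pattern for every layer; it is truncated here.) The "Memory required for data" figures are simply the running total of all top-blob elements times 4 bytes (float32). For example, the first value reported above can be reproduced by hand:

# Reproduce "Memory required for data: 17280032" from the log above.
data_elems  = 16 * 3 * 300 * 300   # Top shape: 16 3 300 300
label_elems = 1 * 1 * 1 * 8        # Top shape: 1 1 1 8
print((data_elems + label_elems) * 4)  # 17280032 bytes, about 16.5 MB

Notice how fast this grows: by conv1_2 the running total is already about 1.24 GB, which is why a batch size of 16 on a 6 GB GTX 1060 is living dangerously (more on that below).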

I0422 12:14:00.465445 19129 solver.cpp:243] Iteration 840, loss = 4.30929
I0422 12:14:00.465477 19129 solver.cpp:259]     Train net output #0: mbox_loss = 4.25927 (* 1 = 4.25927 loss)
I0422 12:14:00.465487 19129 sgd_solver.cpp:138] Iteration 840, lr = 0.0001
I0422 12:14:14.878239 19129 solver.cpp:243] Iteration 850, loss = 4.49961
I0422 12:14:14.878271 19129 solver.cpp:259]     Train net output #0: mbox_loss = 4.58178 (* 1 = 4.58178 loss)
I0422 12:14:14.878279 19129 sgd_solver.cpp:138] Iteration 850, lr = 0.0001
I0422 12:14:29.397267 19129 solver.cpp:243] Iteration 860, loss = 4.43597
I0422 12:14:29.397416 19129 solver.cpp:259]     Train net output #0: mbox_loss = 4.70289 (* 1 = 4.70289 loss)
I0422 12:14:29.397428 19129 sgd_solver.cpp:138] Iteration 860, lr = 0.0001
I0422 12:14:43.483563 19129 solver.cpp:243] Iteration 870, loss = 4.79228
I0422 12:14:43.483608 19129 solver.cpp:259]     Train net output #0: mbox_loss = 5.82301 (* 1 = 5.82301 loss)
I0422 12:14:43.483619 19129 sgd_solver.cpp:138] Iteration 870, lr = 0.0001
I0422 12:14:58.156549 19129 solver.cpp:243] Iteration 880, loss = 4.56616
I0422 12:14:58.156586 19129 solver.cpp:259]     Train net output #0: mbox_loss = 5.05384 (* 1 = 5.05384 loss)
I0422 12:14:58.156595 19129 sgd_solver.cpp:138] Iteration 880, lr = 0.0001
I0422 12:15:13.084661 19129 solver.cpp:243] Iteration 890, loss = 4.44497
I0422 12:15:13.084812 19129 solver.cpp:259]     Train net output #0: mbox_loss = 4.02057 (* 1 = 4.02057 loss)
I0422 12:15:13.084825 19129 sgd_solver.cpp:138] Iteration 890, lr = 0.0001
I0422 12:15:27.360756 19129 solver.cpp:243] Iteration 900, loss = 4.80043
[Figure 9]
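
Rather than eyeballing the numbers scrolling by, you can plot the loss curve straight from the log. A minimal sketch — the regular expression matches the solver.cpp:243 lines shown above, and train_kitti.log is a hypothetical path, so substitute your own log file:

import re
import matplotlib.pyplot as plt

pattern = re.compile(r'Iteration (\d+), loss = ([\d.]+)')
iters, losses = [], []
with open('train_kitti.log') as f:   # hypothetical log file path
    for line in f:
        m = pattern.search(line)
        if m:
            iters.append(int(m.group(1)))
            losses.append(float(m.group(2)))

plt.plot(iters, losses)
plt.xlabel('iteration')
plt.ylabel('loss')
plt.show()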

KITTI自动驾驶所有文件.tar.gz (the archive with all the files used in this KITTI project)

In the archive above, the JPEGImages and lmdb folders were too large to upload to CSDN — my apologies. If you need them, leave me a comment.


As of 16:04 on April 22, 2018, my model is still training; once it finishes, I will post the test results below...


Training Complete

       It is now 13:14 on April 23, 2018. After 16 hours, my model has finally finished training (it was interrupted mid-way because batch_size = 16 was too large and ran out of memory). If you hit the same problem, you can halve batch_size in train.prototxt and set iter_size: 2 in solver.prototxt; Caffe then accumulates gradients over two forward passes, keeping the effective batch size at 16. The results are shown below:

[Figure 10]

[Figure 11]

The final mbox_loss = 1.70615 and detection_eval = 0.567543; detection_eval is the VOC-style mean average precision, so this is roughly 56.8% mAP. Judging from that, the accuracy is not very high.

I see a few likely causes:

1. The number of training iterations is not enough; it needs to be increased.

2. SSD's native input resolution differs quite a bit from the resolution of the KITTI images, so the network structure needs adjusting — see the quick calculation after this list.

3. Other network architectures could also be tried on KITTI.
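
On point 2 specifically: KITTI color frames are about 1242×375, very wide, while SSD_300x300 warps everything to a square, so objects get squeezed far more in width than in height. A quick back-of-the-envelope check (1242×375 is the typical KITTI image size; your exact files may vary by a few pixels):

# How much does the 300x300 warp distort a KITTI frame?
kitti_w, kitti_h = 1242, 375        # typical KITTI color image size
scale_x = 300 / float(kitti_w)      # ~0.242
scale_y = 300 / float(kitti_h)      # ~0.800
print('aspect distortion: %.2fx' % (scale_y / scale_x))  # ~3.31x horizontal squeeze

Pedestrians and cyclists, already narrow, become extremely thin after this warp, which is consistent with the per-class results shown later. Training at a wider input resolution would help, but the prior-box steps and sizes would have to change with it — exactly the "network structure" adjustment point 2 refers to.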

Testing the Model

First, here is the source code I used for testing on my own machine (a download link is at the end of the article). If you want to run the test yourself, please adjust the image paths and model paths to match your setup.

# coding: utf-8

# # Detection with SSD
# 
# In this example, we will load an SSD model and use it to detect objects.

# ### 1. Setup
# 
# * First, load the necessary libraries and set up caffe and caffe_root

# In[1]:

import cv2
import numpy as np
import matplotlib.pyplot as plt

plt.rcParams['figure.figsize'] = (10, 10)
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

# Make sure that caffe is on the python path:
caffe_root = '/home/zy/zouyu/deeping/caffe-ssd/'  # this file is expected to be in {caffe_root}/examples
import os
os.chdir(caffe_root)
import sys
sys.path.insert(0, 'python')

import caffe
caffe.set_device(0)
caffe.set_mode_gpu()


# * Load LabelMap.

# In[2]:

from google.protobuf import text_format
from caffe.proto import caffe_pb2

# load PASCAL VOC labels
labelmap_file = 'data/VOC0712/labelmap_kitti.prototxt'
labelmap = caffe_pb2.LabelMap()
with open(labelmap_file, 'r') as f:  # context manager; also avoids shadowing the 'file' builtin
    text_format.Merge(f.read(), labelmap)

def get_labelname(labelmap, labels):
    num_labels = len(labelmap.item)
    labelnames = []
    if type(labels) is not list:
        labels = [labels]
    for label in labels:
        found = False
        for i in range(0, num_labels):
            if label == labelmap.item[i].label:
                found = True
                labelnames.append(labelmap.item[i].display_name)
                break
        assert found, 'label %s not found in labelmap' % label
    return labelnames


# * Load the net in the test phase for inference, and configure input preprocessing.

# In[3]:

model_def ='models/VGGNet/KITTI/SSD_300x300/deploy.prototxt'
model_weights = 'models/VGGNet/KITTI/SSD_300x300/KITTI_SSD_300x300_iter_120000.caffemodel'

net = caffe.Net(model_def,      # defines the structure of the model
                model_weights,  # contains the trained weights
                caffe.TEST)     # use test mode (e.g., don't perform dropout)

# input preprocessing: 'data' is the name of the input blob == net.inputs[0]
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))
transformer.set_mean('data', np.array([104,117,123])) # mean pixel
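# (104, 117, 123) are the per-channel BGR means the VGG-based SSD reference models were trained with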
transformer.set_raw_scale('data', 255)  # the reference model operates on images in [0,255] range instead of [0,1]
transformer.set_channel_swap('data', (2,1,0))  # the reference model has channels in BGR order instead of RGB


# 
# 
# ### 2. SSD detection

# * Load an image.

# In[4]:

# set net to batch size of 1
image_resize = 300
net.blobs['data'].reshape(1,3,image_resize,image_resize)

# pick one test image at a time (adjust the paths to your own data)
#image = caffe.io.load_image('examples/images/000088.png')
image = caffe.io.load_image('examples/images/000095.png')
#image = caffe.io.load_image('examples/images/000098.png')
#plt.imshow(image)


# * Run the net and examine the top_k results

# In[5]:

transformed_image = transformer.preprocess('data', image)
net.blobs['data'].data[...] = transformed_image

# Forward pass.
detections = net.forward()['detection_out']
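# detections has shape (1, 1, num_detections, 7); each row is
# [image_id, label, confidence, xmin, ymin, xmax, ymax], coordinates normalized to [0, 1]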

# Parse the outputs.
det_label = detections[0,0,:,1]
det_conf = detections[0,0,:,2]
det_xmin = detections[0,0,:,3]
det_ymin = detections[0,0,:,4]
det_xmax = detections[0,0,:,5]
det_ymax = detections[0,0,:,6]

# Get detections with confidence higher than 0.25.
top_indices = [i for i, conf in enumerate(det_conf) if conf >= 0.25]

top_conf = det_conf[top_indices]
top_label_indices = det_label[top_indices].tolist()
top_labels = get_labelname(labelmap, top_label_indices)
top_xmin = det_xmin[top_indices]
top_ymin = det_ymin[top_indices]
top_xmax = det_xmax[top_indices]
top_ymax = det_ymax[top_indices]


# 
# * Plot the boxes


colors = plt.cm.hsv(np.linspace(0, 1, 21)).tolist()

plt.imshow(image)
currentAxis = plt.gca()

for i in range(top_conf.shape[0]):
    xmin = int(round(top_xmin[i] * image.shape[1]))
    ymin = int(round(top_ymin[i] * image.shape[0]))
    xmax = int(round(top_xmax[i] * image.shape[1]))
    ymax = int(round(top_ymax[i] * image.shape[0]))
    score = top_conf[i]
    label = int(top_label_indices[i])
    label_name = top_labels[i]
    display_txt = '%s: %.2f'%(label_name, score)
    coords = (xmin, ymin), xmax-xmin+1, ymax-ymin+1
    color = colors[label]
    currentAxis.add_patch(plt.Rectangle(*coords, fill=False, edgecolor=color, linewidth=2))
    currentAxis.text(xmin, ymin, display_txt, bbox={'facecolor':color, 'alpha':0.5})

plt.show()
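
To run the detector over a whole folder of images instead of editing the path each time, the same forward pass can be wrapped in a loop. A minimal sketch reusing the net, transformer, labelmap and get_labelname objects defined above (the glob pattern is an assumption — point it at your own test images):

import glob

def detect(image_path, conf_thresh=0.25):
    # Run one SSD forward pass; return (label, score, pixel box) tuples.
    image = caffe.io.load_image(image_path)
    net.blobs['data'].data[...] = transformer.preprocess('data', image)
    det = net.forward()['detection_out'][0, 0]
    results = []
    for row in det:
        if row[2] < conf_thresh:   # drop low-confidence (and padding) rows
            continue
        label = get_labelname(labelmap, int(row[1]))[0]
        w, h = image.shape[1], image.shape[0]
        box = (int(row[3] * w), int(row[4] * h), int(row[5] * w), int(row[6] * h))
        results.append((label, float(row[2]), box))
    return results

for path in sorted(glob.glob('examples/images/*.png')):  # adjust to your own folder
    print('%s %s' % (path, detect(path)))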





Next, let's look at how the trained model performs on a few test images.

[Figure 12]

[Figure 13]

[Figure 14]

[Figure 15]

From the images above, we can see that the trained model does reasonably well on the Car class, while detection of the other two classes, Pedestrian and Cyclist, still needs improvement. I will keep working on this as time allows.


Summary

       This post took KITTI as the base dataset and went from zero to one through the whole deep-learning workflow: organizing, categorizing and annotating the data, converting it to VOC format, training and testing. I ran into all kinds of problems along the way, some of them genuinely headache-inducing. The point I want to leave you with is that as long as you keep at it with the conviction that it can be done, the work will get finished and the ideal will meet reality. Finally, the model file I produced is attached below for reference. Thank you!

KITTI_SSD_300x300_iter_120000.caffemodel

ssd_detect_csdn.py

