Source: reader submission. Author: LSC
Editor: 学姐
Final score: 68.46473, second place
http://challenge.xfyun.cn/topic/info?type=helmet-wear&option=ssgy
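In this competition, participants build a computer vision model that locates safety helmets in photos.
The task covers three target classes: Helmet, Person, and Head. The training set has 4,000 images and the test set has 1,000 images. The training set annotation format is:
The competition is scored with mAP (mean Average Precision); the maximum score is 1. For the calculation, see the reference code at https://github.com/Cartucho/mAP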
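As a rough illustration of what the metric measures, the sketch below computes average precision for one class from boxes in (x1, y1, x2, y2) form. It is a simplified reading of the VOC-style procedure, not the official scoring code: the function names, the 0.5 IoU threshold, and the continuous (no +1 pixel) area convention are assumptions made here. mAP is then the mean of the per-class values.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, gt_boxes, iou_thr=0.5):
    """AP for one class. detections: list of (score, box); gt_boxes: list of boxes."""
    if not gt_boxes:
        return 0.0
    detections = sorted(detections, key=lambda d: d[0], reverse=True)
    matched = [False] * len(gt_boxes)
    tp = []
    for _, box in detections:
        # Pick the ground-truth box with the highest IoU for this detection
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(gt_boxes):
            v = box_iou(box, gt)
            if v > best_iou:
                best_iou, best_j = v, j
        hit = best_j >= 0 and best_iou >= iou_thr and not matched[best_j]
        if hit:
            matched[best_j] = True
        tp.append(1 if hit else 0)

    # Cumulative precision and recall down the ranked list
    recalls, precisions, cum_tp = [], [], 0
    for k, t in enumerate(tp, start=1):
        cum_tp += t
        recalls.append(cum_tp / len(gt_boxes))
        precisions.append(cum_tp / k)

    # Area under the monotonically decreasing precision envelope
    mrec = [0.0] + recalls + [1.0]
    mpre = [0.0] + precisions + [0.0]
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    return sum((mrec[i] - mrec[i - 1]) * mpre[i] for i in range(1, len(mrec)))
```

A false positive ranked above a true positive lowers the precision at every later recall level, which is why confidence scores matter as much as box positions.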
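Details of the result files:
(1) The label order must match the test set text files;
(2) The format is as follows:
Submit one result file per test image, named after the image with a .txt extension, e.g. hard_hat_workers4007.txt. The content format of each file is: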
I ran the baseline code on the AutoDL platform and used the PaddleDetection framework for detection. The dataset and PaddleDetection.zip must be uploaded to the platform first.
(1) First, unzip the dataset and install the required environment
!unzip autodl-nas/train_images.zip -d autodl-tmp/
!unzip autodl-nas/train_anns.zip -d autodl-tmp/
!unzip autodl-nas/test_images.zip -d autodl-tmp/
!unzip autodl-nas/PaddleDetection.zip
!pip install -r /root/PaddleDetection/requirements.txt
import os

# Count how many boxes each class has in the training annotations
anns = "/root/autodl-tmp/train_anns/"
num = {}
for t in os.listdir(anns):
    with open(anns + t, "r", encoding="utf-8") as f:
        for line in f:
            p = line.strip().split(" ")
            num[p[0]] = num.get(p[0], 0) + 1
num
!pip install opencv-python
from xml.dom.minidom import Document
import os
import cv2
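(2) Convert the label files from txt format to xml format, and then to json format. See the code in train1.ipynb; the main steps are running the makexml function and the PaddleDetection/tools/x2coco.py module.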
def makexml(picPath, txtPath, xmlPath):  # image folder, txt folder, folder where the xml files are saved
    """Convert the txt annotation files to VOC-format xml annotation files.

    Each txt line is read as "class xmin ymin xmax ymax" in absolute pixels, so the
    coordinates are copied as-is (the YOLO-style normalized conversion is not used).
    Create three sub-folders under your own annotated image folder, named picture, txt and xml.
    """
    def add_text_child(doc, parent, tag, text):
        # Append <tag>text</tag> to parent and return the new element
        child = doc.createElement(tag)
        child.appendChild(doc.createTextNode(str(text)))
        parent.appendChild(child)
        return child

    os.makedirs(xmlPath, exist_ok=True)  # make sure the output folder exists
    for name in os.listdir(txtPath):
        if not name.endswith(".txt"):
            continue  # skip anything that is not an annotation file
        xmlBuilder = Document()
        annotation = xmlBuilder.createElement("annotation")  # root annotation tag
        xmlBuilder.appendChild(annotation)
        with open(txtPath + name) as txtFile:
            txtList = txtFile.readlines()
        img = cv2.imread(picPath + name[0:-4] + ".png")
        Pheight, Pwidth, Pdepth = img.shape
        add_text_child(xmlBuilder, annotation, "folder", "driving_annotation_dataset")
        add_text_child(xmlBuilder, annotation, "filename", name[0:-4] + ".png")
        size = xmlBuilder.createElement("size")  # size tag with width, height, depth
        add_text_child(xmlBuilder, size, "width", Pwidth)
        add_text_child(xmlBuilder, size, "height", Pheight)
        add_text_child(xmlBuilder, size, "depth", Pdepth)
        annotation.appendChild(size)
        for j in txtList:
            oneline = j.strip().split(" ")
            obj = xmlBuilder.createElement("object")  # one object tag per box
            add_text_child(xmlBuilder, obj, "name", oneline[0])
            add_text_child(xmlBuilder, obj, "pose", "Unspecified")
            add_text_child(xmlBuilder, obj, "truncated", "0")
            add_text_child(xmlBuilder, obj, "difficult", "0")
            bndbox = xmlBuilder.createElement("bndbox")
            add_text_child(xmlBuilder, bndbox, "xmin", int(oneline[1]))
            add_text_child(xmlBuilder, bndbox, "ymin", int(oneline[2]))
            add_text_child(xmlBuilder, bndbox, "xmax", int(oneline[3]))
            add_text_child(xmlBuilder, bndbox, "ymax", int(oneline[4]))
            obj.appendChild(bndbox)
            annotation.appendChild(obj)
        with open(xmlPath + name[0:-4] + ".xml", "w", encoding="utf-8") as f:
            xmlBuilder.writexml(f, indent='\t', newl='\n', addindent='\t', encoding='utf-8')
picPath = "/root/autodl-tmp/train_images/"  # image folder; the trailing / is required
txtPath = "/root/autodl-tmp/train_anns/"  # txt folder; the trailing / is required
xmlPath = "/root/autodl-tmp/train_xmls/"  # folder where the xml files are saved; the trailing / is required
makexml(picPath, txtPath, xmlPath)
# List every xml file name; this file is passed to x2coco.py as --voc_anno_list
with open("/root/autodl-tmp/voc_train.txt", "w", encoding="utf-8") as f:
    for i in os.listdir("/root/autodl-tmp/train_xmls/"):
        if i.endswith("xml"):
            f.write(i + "\n")
# Convert xml to json
%cd /root/
!python PaddleDetection/tools/x2coco.py \
--dataset_type voc \
--voc_anno_dir /root/autodl-tmp/train_xmls \
--voc_anno_list /root/autodl-tmp/voc_train.txt \
--voc_label_list /root/autodl-tmp/label_list.txt \
--voc_out_name /root/autodl-tmp/train.json
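The x2coco.py call above reads /root/autodl-tmp/label_list.txt, which the earlier steps do not create. A minimal sketch for writing it follows, assuming one class name per line and the order head, helmet, person implied by the id-to-name mapping used in step (6). The function name is hypothetical, and the names must match the class strings in the annotation files exactly (the counts in `num` from step (1) show them).

```python
import os


def write_label_list(path, labels=("head", "helmet", "person")):
    """Write one class name per line; the line order defines the class order."""
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        for label in labels:
            f.write(label + "\n")
    return list(labels)


# Example call with the path used in this walkthrough (assumed, not from the article):
# write_label_list("/root/autodl-tmp/label_list.txt")
```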
(3) Modify the 'PaddleDetection/configs/ppyoloe/ppyoloe_plus_crn_x_80e_coco.yml' file in PaddleDetection and its related files: '../datasets/coco_detection.yml', '../runtime.yml', './_base_/optimizer_80e.yml', './_base_/ppyoloe_plus_crn.yml', './_base_/ppyoloe_plus_reader.yml', and so on. The key points are the file paths in coco_detection.yml and the pretrain_weights entry in ppyoloe_plus_crn_x_80e_coco.yml, which sets the pretrained model weights. This is the content of ppyoloe_plus_crn_x_80e_coco.yml after modification.
_BASE_: [
'../datasets/coco_detection.yml',
'../runtime.yml',
'./_base_/optimizer_80e.yml',
'./_base_/ppyoloe_plus_crn.yml',
'./_base_/ppyoloe_plus_reader.yml',
]
log_iter: 100
snapshot_epoch: 1
weights: /root/autodl-tmp/model_final
pretrain_weights: /root/autodl-tmp/model_out/best_model.pdparams
#https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_x_obj365_pretrained.pdparams
depth_mult: 1.33
width_mult: 1.25
This is the content of coco_detection.yml after modification:
metric: COCO
num_classes: 3
TrainDataset:
  !COCODataSet
    image_dir: train_images
    anno_path: annotations/train.json
    dataset_dir: /root/autodl-tmp/data/
    data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
EvalDataset:
  !COCODataSet
    image_dir: train_images
    anno_path: annotations/train.json
    dataset_dir: /root/autodl-tmp/data/
TestDataset:
  !ImageFolder
    anno_path: /root/autodl-tmp/label_list.txt # annotations/instances_val2017.json # also support txt (like VOC's label_list.txt)
    # dataset_dir: dataset/coco # if set, anno_path will be 'dataset_dir/anno_path'
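The dataset_dir, image_dir and anno_path values above point to /root/autodl-tmp/data/train_images and /root/autodl-tmp/data/annotations/train.json, whereas the earlier steps leave the images in /root/autodl-tmp/train_images and the annotation file at /root/autodl-tmp/train.json. The article does not show how the files were rearranged; one possible way, sketched with a hypothetical helper, is to move them into place:

```python
import os
import shutil


def arrange_dataset(src_root, dst_root):
    """Move train_images/ and train.json from src_root into the layout
    dst_root/train_images and dst_root/annotations/train.json."""
    ann_dir = os.path.join(dst_root, "annotations")
    os.makedirs(ann_dir, exist_ok=True)
    shutil.move(os.path.join(src_root, "train_images"),
                os.path.join(dst_root, "train_images"))
    shutil.move(os.path.join(src_root, "train.json"),
                os.path.join(ann_dir, "train.json"))
    return dst_root


# Example call with the paths from this walkthrough (assumed layout):
# arrange_dataset("/root/autodl-tmp", "/root/autodl-tmp/data")
```

Editing the three path fields in the config to match the existing locations would work equally well and avoids moving any data.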
The content of runtime.yml after modification:
use_gpu: true
use_xpu: false
log_iter: 20
save_dir: /root/autodl-tmp/model_out/ # *****
snapshot_epoch: 1
print_flops: false
# Exporting the model
export:
  post_process: True # Whether post-processing is included in the network when export model.
  nms: True # Whether NMS is included in the network when export model.
  benchmark: False # It is used to testing model performance, if set `True`, post-process and NMS will not be exported.
  fuse_conv_bn: False
The content of optimizer_80e.yml after modification:
epoch: 20
LearningRate:
  base_lr: 0.001
  schedulers:
    - !CosineDecay
      max_epochs: 96
    - !LinearWarmup
      start_factor: 0.
      epochs: 5
OptimizerBuilder:
  optimizer:
    momentum: 0.9
    type: Momentum
  regularizer:
    factor: 0.0005
    type: L2
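To see what these settings imply, the sketch below evaluates a linear warmup followed by cosine decay at epoch granularity. It is a simplified approximation written for illustration: PaddleDetection applies the schedule per iteration and the exact formula differs between versions, and the function name is hypothetical. With max_epochs: 96 and epoch: 20, training ends while the cosine curve is still near its start, so in this approximation the learning rate is still above 90% of base_lr at epoch 20.

```python
import math


def lr_at_epoch(epoch, base_lr=0.001, warmup_epochs=5, max_epochs=96, start_factor=0.0):
    """Approximate learning rate at the start of a given epoch."""
    if epoch < warmup_epochs:
        # Linear warmup from start_factor * base_lr up to base_lr
        alpha = epoch / warmup_epochs
        return base_lr * (start_factor * (1 - alpha) + alpha)
    # Cosine decay over the epochs that remain after warmup
    progress = (epoch - warmup_epochs) / (max_epochs - warmup_epochs)
    return base_lr * 0.5 * (1 + math.cos(math.pi * progress))


for e in (0, 5, 10, 20):
    print(e, round(lr_at_epoch(e), 6))
```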
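(4) Run the following commands to start training the model
# Not needed on Windows and Mac. %env is used because a variable set with !export does not persist beyond that one shell call
%env CUDA_VISIBLE_DEVICES=0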
%cd /root/PaddleDetection/
!python tools/train.py -c configs/ppyoloe/ppyoloe_plus_crn_x_80e_coco.yml --eval --amp
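All of the training data was used for both training and validation. The epoch count was set to 20, but by epoch 10 the results were no longer improving noticeably, so I stopped training and kept the best model.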
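Because the same images are used for training and evaluation, the validation mAP reported during training cannot reveal overfitting. As an optional alternative that the author did not use, a held-out split of the COCO-format train.json could be made along these lines; the function name, the 10% ratio, and the seed are assumptions for illustration.

```python
import random


def split_coco(coco, val_ratio=0.1, seed=0):
    """Split a COCO-style dict into (train, val) dicts by image id."""
    images = list(coco["images"])
    random.Random(seed).shuffle(images)
    n_val = max(1, int(len(images) * val_ratio))
    val_ids = {img["id"] for img in images[:n_val]}

    def subset(keep_val):
        # Copy every other top-level key (e.g. categories) unchanged
        out = {k: v for k, v in coco.items() if k not in ("images", "annotations")}
        out["images"] = [im for im in coco["images"] if (im["id"] in val_ids) == keep_val]
        out["annotations"] = [a for a in coco["annotations"]
                              if (a["image_id"] in val_ids) == keep_val]
        return out

    return subset(False), subset(True)
```

The two dicts would then be written out with json.dump and referenced from TrainDataset and EvalDataset respectively.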
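(5) Run the following commands to run inference on the test set
# Not needed on Windows and Mac. %env is used because a variable set with !export does not persist beyond that one shell call
%env CUDA_VISIBLE_DEVICES=0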
%cd /root/PaddleDetection/
!python tools/infer.py -c configs/ppyoloe/ppyoloe_plus_crn_x_80e_coco.yml \
--infer_dir=/root/autodl-tmp/test_images/ \
--output_dir=/root/autodl-tmp/infer_output2/ \
--draw_threshold=0.5 \
-o weights=/root/autodl-tmp/model_out/ppyoloe_plus_crn_x_80e_coco/best_model \
--use_vdl=True --save_results=True
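Here weights is the absolute path to the weights of the best trained model, output_dir is the absolute path for the output, and infer_dir is the absolute path to the test set images. The rendered result images and the test set inference results, bbox.json, are both stored under autodl-tmp/infer_output2/.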
(6) Read and process autodl-tmp/infer_output2/bbox.json, writing the results for each test image to a txt file with the same name, stored in the /root/detection-results directory.
import json

with open("/root/autodl-tmp/infer_output2/bbox.json", "r") as f:
    data = json.load(f)
len(data)
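Note that the generated bbox records do not say which source image each one belongs to, so the PaddleDetection code needs a small modification. After line 841 of PaddleDetection/ppdet/engine/trainer.py (the line number depends on the PaddleDetection version), add:
# Save this mapping
print("imid2path ***** ", imid2path)
import pickle
with open("/root/autodl-tmp/imid2path.pkl", "wb") as tf:
    pickle.dump(imid2path, tf)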
import os
import pickle

with open("/root/autodl-tmp/imid2path.pkl", "rb") as tf:
    imid2path = pickle.load(tf)

ans = [[] for _ in range(1000)]  # one list of result lines per test image
m = {0: 'head', 1: 'helmet', 2: 'person'}
for i in range(len(data)):
    s = ""
    d = data[i]
    index = int(d['image_id'])
    s += m[d['category_id']] + " "
    s += str(d['score']) + " "
    bbox = d['bbox']  # [x, y, width, height]
    x1, y1, w, h = int(bbox[0]), int(bbox[1]), int(bbox[2]), int(bbox[3])
    x2, y2 = x1 + w, y1 + h
    s += str(x1) + " " + str(y1) + " " + str(x2) + " " + str(y2)
    if x1 < 0 or y1 < 0 or x2 < 0 or y2 < 0:
        continue  # drop boxes with negative coordinates
    ans[index].append(s)

os.makedirs("/root/detection-results", exist_ok=True)
for i in range(1000):
    img_name = os.path.basename(imid2path[i])
    txt_name = img_name[:-4] + ".txt"
    with open("/root/detection-results/" + txt_name, "w", encoding="utf-8") as f:
        for t in ans[i]:
            f.write(t + "\n")
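For reference, the same conversion can be written without the hard-coded count of 1000 by grouping lines per file name. This is a sketch with a hypothetical function name, run here on a single made-up record rather than real competition output; it shows how one bbox.json entry becomes one "class score x1 y1 x2 y2" line.

```python
import os
from collections import defaultdict


def group_detections(data, imid2path, id2name):
    """Return {txt file name: [result lines]} from COCO-style detection records."""
    results = defaultdict(list)
    for d in data:
        x1, y1, w, h = (int(v) for v in d["bbox"])  # bbox is [x, y, width, height]
        x2, y2 = x1 + w, y1 + h
        if min(x1, y1, x2, y2) < 0:
            continue  # same filter as above: skip boxes with negative coordinates
        stem = os.path.splitext(os.path.basename(imid2path[d["image_id"]]))[0]
        results[stem + ".txt"].append(
            f"{id2name[d['category_id']]} {d['score']} {x1} {y1} {x2} {y2}")
    return dict(results)


# Made-up record for illustration only
sample = [{"image_id": 0, "category_id": 1, "bbox": [10.6, 20.2, 30.0, 40.9], "score": 0.87}]
print(group_detections(sample,
                       {0: "/data/hard_hat_workers4007.png"},
                       {0: "head", 1: "helmet", 2: "person"}))
```

Images with no surviving detections get no entry in the returned dict, so empty files for them would still need to be written if the scorer expects one file per test image.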
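(7) Finally, use the command line to compress the /root/detection-results folder and download it locally:
%cd /root/
!tar -cvzf detection-results.tar.gz detection-results
Then unpack it locally and recompress it as a zip file for submission.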
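If the intermediate tar.gz is not needed, the zip could also be created directly on the server with Python's standard library. This is a sketch, assuming the submission should contain the detection-results folder at the top level of the archive, which the competition page would need to confirm; the function name is hypothetical.

```python
import os
import shutil


def zip_results(results_dir):
    """Create <results_dir>.zip next to the folder and return the archive path."""
    results_dir = os.path.abspath(results_dir)
    parent, folder = os.path.split(results_dir)
    # root_dir/base_dir keep the folder name as the top-level entry in the zip
    return shutil.make_archive(results_dir, "zip", root_dir=parent, base_dir=folder)


# Example call with the path from this walkthrough:
# zip_results("/root/detection-results")
```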