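Contents
Task description
Experimental environment
I. Object detection task: ship detection
1. Dataset
2. Deploying the YOLO model
Modifying the cfg file
Modifying the data folder
Training
Training results
II. Flask-based web service
app.py
index1.html
detect.py (excerpt): inference
Demo
The GitHub link is at the end of this article.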
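(1) Object detection task: ship detection
Task: train a model to classify and detect ships at sea (mAP > 90%).
Data: the SeaShips dataset (7,000 images).
(2) Cloud service
Task: implement an HTTP-based AI service whose API accepts an image URL and returns the class and bounding box of each ship in the image.
(3) Web demo
Task: implement a web page where an image URL can be entered in a text box. The page calls the HTTP service from step (2), draws the detected ship classes and bounding boxes on the image, and displays the result.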
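I rented a server from AutoDL for this project; its specifications are as follows.
This version of the dataset contains 7,000 images, all at a resolution of 1920×1080, covering six classes of ships. Most are photos of vessels in inland waterways.
I downloaded the dataset locally from the link; its basic structure is shown above.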
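1. Preprocessing the dataset
The dataset has three folders. The Annotations folder holds the labels, with one XML file per image. These XML files need to be converted to txt files, one per image, with the same base name as the image: for example, the label for 00xxxx.jpg is 00xxxx.xml, which has to become 00xxxx.txt.
Each XML file gives the class of every object and the top-left and bottom-right coordinates of its bounding box, in pixels at the 1920×1080 resolution.
The txt format is therefore: one txt file per image, one line per ship, five numbers per line. The first number is the class (0 to 5 for the six ship classes). The other four are the x and y coordinates of the bounding-box center, followed by the box width and height. All values are normalized: x and width are divided by 1920, y and height by 1080.
Following these requirements, I wrote a small Python script to do the conversion.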
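The conversion described above can be sketched roughly as follows. This is not the author's script, only a minimal illustration. It assumes the XML files follow the usual Pascal VOC layout (`object`, `name`, `bndbox`, `xmin`, `ymin`, `xmax`, `ymax`) and that class ids follow the order of the `class_names` list in app.py; both are assumptions, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed class order: the list index becomes the class id (0-5).
CLASS_NAMES = ['ore carrier', 'general cargo ship', 'bulk cargo carrier',
               'container ship', 'fishing boat', 'passenger ship']


def voc_xml_to_yolo_lines(xml_text, img_w=1920, img_h=1080, class_names=CLASS_NAMES):
    """Convert one annotation (assumed Pascal VOC layout) into YOLO label lines."""
    root = ET.fromstring(xml_text)
    lines = []
    for obj in root.iter('object'):
        cls_id = class_names.index(obj.find('name').text.strip())
        box = obj.find('bndbox')
        xmin = float(box.find('xmin').text)
        ymin = float(box.find('ymin').text)
        xmax = float(box.find('xmax').text)
        ymax = float(box.find('ymax').text)
        # Box center and size, normalized by the image width and height
        x_center = (xmin + xmax) / 2 / img_w
        y_center = (ymin + ymax) / 2 / img_h
        width = (xmax - xmin) / img_w
        height = (ymax - ymin) / img_h
        lines.append(f"{cls_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}")
    return lines
```

Each returned string is one line of the txt file for that image; writing them to 00xxxx.txt is then a plain file write.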
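One thing to note, however:
This project deploys YOLOv5, which expects a COCO-like dataset layout. All images go in a JPEGImages folder, all labels go in a labels folder, and the two folders sit side by side in the same parent folder.
PS: I did not pay attention to the layout at first, which led to quite a few strange bugs when deploying the YOLO model.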
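Because layout mistakes are easy to make, a quick sanity check before training can help. The sketch below lists images that have no matching label file. The folder names come from the text above; the function name and the `.jpg` filter are assumptions for illustration.

```python
from pathlib import Path


def find_unlabeled_images(dataset_root, image_dir='JPEGImages', label_dir='labels'):
    """Return the names of .jpg images that have no same-named .txt label file."""
    root = Path(dataset_root)
    label_stems = {p.stem for p in (root / label_dir).glob('*.txt')}
    return sorted(p.name for p in (root / image_dir).glob('*.jpg')
                  if p.stem not in label_stems)
```

An empty result means every image found has a label with the same base name.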
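2. Annotating your own dataset
Images can be labeled with LabelImg. Labeling means marking each target object in the image: if the targets are cats and dogs, draw a box around each cat or dog and give it the label cat or dog. When labeling is finished, LabelImg produces an .xml file with the same name as the image. (I am still learning this part.)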
Repository: ultralytics/yolov5 on GitHub (YOLOv5 in PyTorch > ONNX > CoreML > TFLite)
git clone https://github.com/ultralytics/yolov5 # clone
cd yolov5
pip install -r requirements.txt # install
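The cfg file defines the network architecture; the code builds the model from it. The cfg files in the cfg folder correspond to different architectures (yolov5, yolov5-spp, yolov5-tiny), in decreasing order of complexity and accuracy. The original YOLO head has 255 outputs, [4 box coordinates + 1 object confidence + 80 class confidences] × 3, where the factor of 3 is the number of anchor boxes predicted at each of the three detection scales. Since there are only six ship classes here, search for filters=255 and change 255 to 33 = (4 + 1 + 6) × 3. In the YAML model files shipped with the YOLOv5 repository, such as the models/yolov5s.yaml passed to train.py, the class count is set through the nc field rather than through filters.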
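The arithmetic behind the 255 and 33 figures can be written as a small helper. The function name is hypothetical; the formula is the one given above.

```python
def detection_head_filters(num_classes, anchors_per_scale=3):
    """Output channels of one detection layer: (4 box + 1 objectness + classes) per anchor."""
    return (4 + 1 + num_classes) * anchors_per_scale


print(detection_head_filters(80))  # COCO, 80 classes
print(detection_head_filters(6))   # SeaShips, 6 classes
```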
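Opening the data folder shows three types of files: .data, .names, and .txt. The .data file defines the number of classes in its classes entry; change it to 6. Its second line, train=, should point to a txt file that lists the path of every training image, one path per line.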
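A list file of that kind can be generated rather than written by hand. The following is a minimal sketch; the function name and the set of accepted suffixes are assumptions.

```python
from pathlib import Path


def write_image_list(image_dir, list_file, suffixes=('.jpg', '.jpeg', '.png')):
    """Write the absolute path of every image in image_dir to list_file, one per line."""
    paths = sorted(p.resolve() for p in Path(image_dir).iterdir()
                   if p.suffix.lower() in suffixes)
    Path(list_file).write_text(''.join(f'{p}\n' for p in paths))
    return len(paths)  # number of images listed
```

The returned count is a convenient check that the expected number of training images was found.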
python train.py --epochs 100 --cfg models/yolov5s.yaml --data data/myship.yaml --weights yolov5s.pt --batch-size 32
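This command trains on the custom dataset, specifying the number of epochs, the model configuration, the dataset file, the pretrained weights, and the batch size. No image size is given, so train.py uses its default (640).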
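The contents of data/myship.yaml are not shown in this article. As a rough sketch, a YOLOv5 dataset file usually carries `train`, `val`, `nc`, and `names` entries, which could be generated as below. The function name and the example paths are hypothetical, and the class order is assumed to match the label ids.

```python
def build_data_yaml(train_list, val_list, class_names):
    """Build the text of a YOLOv5-style dataset YAML file (illustrative only)."""
    names = ', '.join(f"'{n}'" for n in class_names)
    return (f"train: {train_list}\n"
            f"val: {val_list}\n"
            f"nc: {len(class_names)}\n"
            f"names: [{names}]\n")


# Hypothetical paths; replace them with the real locations of the image lists.
yaml_text = build_data_yaml(
    'data/train.txt', 'data/val.txt',
    ['ore carrier', 'general cargo ship', 'bulk cargo carrier',
     'container ship', 'fishing boat', 'passenger ship'])
print(yaml_text)
```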
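Flask is a lightweight web framework written in Python, often described as a microframework. It is flexible and efficient, and in deep learning it is often used to deploy models behind a browser/server (B/S) architecture. In this project I use Flask to implement the web service API, with the weights trained on SeaShips (best.pt).
In the project layout:
static, templates, and app.py are the files that belong to the Flask application.
In addition, I modified detect.py myself and added a DetectAPI class to make image detection easier to call.
The main files are as follows.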
import cv2
import time
from flask import Flask, request, Response, render_template
import json
import detect
import os

os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"

app = Flask(__name__)
basedir = os.path.abspath(os.path.dirname(__file__))  # e.g. D:\\yolov5\\yolov5
class_names = ['ore carrier', 'general cargo ship', 'bulk cargo carrier', 'container ship', 'fishing boat', 'passenger ship']
file_name = ['jpg', 'jpeg', 'png']  # accepted image extensions


@app.route('/images', methods=['POST'])
def get_image():
    image = request.files["images"]
    path = os.path.join(basedir, 'data', 'images')  # folder that DetectAPI reads from
    image_name = image.filename
    # No separator here, so the upload is stored beside data/images
    # (as data/images<name>) and the detection folder only holds test.jpg
    file_path = path + image_name
    image.save(file_path)
    if image_name.split(".")[-1] not in file_name:
        # Unsupported file type: show the upload page again
        return render_template('index1.html')
    try:
        detect_api = detect.DetectAPI(exist_ok=True)
        # Re-save the upload under a fixed name so the result path is known
        img = cv2.imread(file_path)
        cv2.imwrite(os.path.join(path, 'test.jpg'), img)
        label = detect_api.run()  # list of "class confidence" strings
        with open("runs/detect/myexp/test.jpg", 'rb') as f:
            image = f.read()
        # To return the labels instead of the annotated image:
        # return Response(response=label, status=200, content_type='text/html;charset=utf-8')
        return Response(image, mimetype='image/jpeg')
    except Exception:
        # Detection failed: fall back to the upload page
        return render_template('index1.html')


@app.route('/')
def upload_file():
    return render_template('index1.html')


if __name__ == '__main__':
    app.run(debug=True, host='127.0.0.1', port=5000)
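The extension check in app.py compares the raw suffix, so a file named ship.JPG would be rejected. A case-insensitive variant could look like the sketch below. `allowed_file` and `ALLOWED_EXTENSIONS` are hypothetical names and are not part of the project.

```python
import os

ALLOWED_EXTENSIONS = {'jpg', 'jpeg', 'png'}  # same extensions as file_name in app.py


def allowed_file(filename):
    """Return True if filename ends in one of the accepted image extensions."""
    ext = os.path.splitext(filename)[1].lower().lstrip('.')
    return ext in ALLOWED_EXTENSIONS
```

In get_image, `allowed_file(image_name)` could stand in for the `split(".")` test.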
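(The markup of index1.html is not preserved here; the only text surviving from the page is "yolo deepsort" and "SeaShip Detection Platform".)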
# Excerpt: the imports (torch, cv2, Path and the YOLOv5 utilities) are not shown in this excerpt.
class DetectAPI:
    def __init__(self, weights='weights/best.pt', data='data/myship.yaml', imgsz=None, conf_thres=0.25,
                 iou_thres=0.45, max_det=1000, device='0', view_img=False, save_txt=False,
                 save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False,
                 visualize=False, update=False, project='runs/detect', name='myexp', exist_ok=False, line_thickness=3,
                 hide_labels=False, hide_conf=False, half=False, dnn=False):
        self.weights = weights
        self.data = data
        self.source = 'data/images'  # folder that app.py writes test.jpg into
        # Use the caller's size if given, otherwise 640x640
        self.imgsz = [640, 640] if imgsz is None else imgsz
        self.conf_thres = conf_thres
        self.iou_thres = iou_thres
        self.max_det = max_det
        self.device = device
        self.view_img = view_img
        self.save_txt = save_txt
        self.save_conf = save_conf
        self.save_crop = save_crop
        self.nosave = nosave
        self.classes = classes
        self.agnostic_nms = agnostic_nms
        self.augment = augment
        self.visualize = visualize
        self.update = update
        self.project = project
        self.name = name
        self.exist_ok = exist_ok
        self.line_thickness = line_thickness
        self.hide_labels = hide_labels
        self.hide_conf = hide_conf
        self.half = half
        self.dnn = dnn

    def run(self):
        source = str(self.source)
        save_img = not self.nosave and not source.endswith('.txt')  # save inference images
        is_file = Path(source).suffix[1:] in (IMG_FORMATS + VID_FORMATS)
        is_url = source.lower().startswith(('rtsp://', 'rtmp://', 'http://', 'https://'))
        webcam = source.isnumeric() or source.endswith('.txt') or (is_url and not is_file)
        if is_url and is_file:
            source = check_file(source)  # download

        # Directories
        save_dir = increment_path(Path(self.project) / self.name, exist_ok=self.exist_ok)  # increment run
        (save_dir / 'labels' if self.save_txt else save_dir).mkdir(parents=True, exist_ok=True)  # make dir

        # Load model
        device = select_device(self.device)
        model = DetectMultiBackend(self.weights, device=device, dnn=self.dnn, data=self.data)
        stride, names, pt, jit, onnx, engine = model.stride, model.names, model.pt, model.jit, model.onnx, model.engine
        imgsz = check_img_size(self.imgsz, s=stride)  # check image size

        # Half
        self.half &= (
            pt or jit or onnx or engine) and device.type != 'cpu'  # FP16 supported on limited backends with CUDA
        if pt or jit:
            model.model.half() if self.half else model.model.float()

        # Dataloader
        if webcam:
            self.view_img = check_imshow()
            cudnn.benchmark = True  # set True to speed up constant image size inference
            dataset = LoadStreams(source, img_size=imgsz, stride=stride, auto=pt)
            bs = len(dataset)  # batch_size
        else:
            dataset = LoadImages(source, img_size=imgsz, stride=stride, auto=pt)
            bs = 1  # batch_size
        vid_path, vid_writer = [None] * bs, [None] * bs

        # Run inference
        model.warmup(imgsz=(1, 3, *imgsz))  # warmup
        dt, seen = [0.0, 0.0, 0.0], 0
        mylabel = []  # collected "class confidence" strings; stays empty if nothing is detected
        for path, im, im0s, vid_cap, s in dataset:
            t1 = time_sync()
            im = torch.from_numpy(im).to(device)
            im = im.half() if self.half else im.float()  # uint8 to fp16/32
            im /= 255  # 0 - 255 to 0.0 - 1.0
            if len(im.shape) == 3:
                im = im[None]  # expand for batch dim
            t2 = time_sync()
            dt[0] += t2 - t1

            # Inference
            visualize = increment_path(save_dir / Path(path).stem, mkdir=True) if self.visualize else False
            pred = model(im, augment=self.augment, visualize=visualize)
            t3 = time_sync()
            dt[1] += t3 - t2

            # NMS
            pred = non_max_suppression(pred, self.conf_thres, self.iou_thres, self.classes, self.agnostic_nms,
                                       max_det=self.max_det)
            dt[2] += time_sync() - t3

            # Second-stage classifier (optional)
            # pred = utils.general.apply_classifier(pred, classifier_model, im, im0s)

            # Process predictions
            for i, det in enumerate(pred):  # per image
                seen += 1
                if webcam:  # batch_size >= 1
                    p, im0, frame = path[i], im0s[i].copy(), dataset.count
                    s += f'{i}: '
                else:
                    p, im0, frame = path, im0s.copy(), getattr(dataset, 'frame', 0)

                p = Path(p)  # to Path
                save_path = str(save_dir / p.name)  # im.jpg
                txt_path = str(save_dir / 'labels' / p.stem) + (
                    '' if dataset.mode == 'image' else f'_{frame}')  # im.txt
                s += '%gx%g ' % im.shape[2:]  # print string
                gn = torch.tensor(im0.shape)[[1, 0, 1, 0]]  # normalization gain whwh
                imc = im0.copy() if self.save_crop else im0  # for save_crop
                annotator = Annotator(im0, line_width=self.line_thickness, example=str(names))
                if len(det):
                    # Rescale boxes from img_size to im0 size
                    det[:, :4] = scale_coords(im.shape[2:], det[:, :4], im0.shape).round()

                    # Print results
                    for c in det[:, -1].unique():
                        n = (det[:, -1] == c).sum()  # detections per class
                        s += f"{n} {names[int(c)]}{'s' * (n > 1)}, "  # add to string

                    # Write results
                    for *xyxy, conf, cls in reversed(det):
                        if self.save_txt:  # Write to file
                            xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist()  # normalized xywh
                            line = (cls, *xywh, conf) if self.save_conf else (cls, *xywh)  # label format
                            with open(txt_path + '.txt', 'a') as f:
                                f.write(('%g ' * len(line)).rstrip() % line + '\n')

                        if save_img or self.save_crop or self.view_img:  # Add bbox to image
                            c = int(cls)  # integer class
                            label = None if self.hide_labels else (
                                names[c] if self.hide_conf else f'{names[c]} {conf:.2f}')
                            mylabel.append(str(label))
                            annotator.box_label(xyxy, label, color=colors(c, True))
                            if self.save_crop:
                                save_one_box(xyxy, imc, file=save_dir / 'crops' / names[c] / f'{p.stem}.jpg', BGR=True)

                # Print time (inference-only)
                LOGGER.info(f'{s}Done. ({t3 - t2:.3f}s)')

                # Stream results
                im0 = annotator.result()
                if self.view_img:
                    cv2.imshow(str(p), im0)
                    cv2.waitKey(1)  # 1 millisecond

                # Save results (image with detections)
                if save_img:
                    if dataset.mode == 'image':
                        cv2.imwrite(save_path, im0)
                    else:  # 'video' or 'stream'
                        if vid_path[i] != save_path:  # new video
                            vid_path[i] = save_path
                            if isinstance(vid_writer[i], cv2.VideoWriter):
                                vid_writer[i].release()  # release previous video writer
                            if vid_cap:  # video
                                fps = vid_cap.get(cv2.CAP_PROP_FPS)
                                w = int(vid_cap.get(cv2.CAP_PROP_FRAME_WIDTH))
                                h = int(vid_cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
                            else:  # stream
                                fps, w, h = 30, im0.shape[1], im0.shape[0]
                                save_path += '.mp4'
                            vid_writer[i] = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))
                        vid_writer[i].write(im0)

        # Print results
        t = tuple(x / seen * 1E3 for x in dt)  # speeds per image
        LOGGER.info(f'Speed: %.1fms pre-process, %.1fms inference, %.1fms NMS per image at shape {(1, 3, *imgsz)}' % t)
        if self.save_txt or save_img:
            s = f"\n{len(list(save_dir.glob('labels/*.txt')))} labels saved to {save_dir / 'labels'}" if self.save_txt \
                else ''
            LOGGER.info(f"Results saved to {colorstr('bold', save_dir)}{s}")
        if self.update:
            strip_optimizer(self.weights)  # update model (to fix SourceChangeWarning)
        return mylabel
The mylabel list returned at the end holds the class and confidence of each detection.
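With the default settings (hide_labels and hide_conf both False), each entry of that list is a string such as "container ship 0.91". If a JSON-style response is wanted, the strings could be split as in the sketch below. `parse_label` is a hypothetical helper and assumes that default format.

```python
def parse_label(label):
    """Split a 'class confidence' string into a dict, e.g. for a JSON response."""
    name, _, conf = label.rpartition(' ')  # the class name may itself contain spaces
    return {'class': name, 'confidence': float(conf)}


detections = [parse_label(s) for s in ['container ship 0.91', 'fishing boat 0.78']]
print(detections)
```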
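1. Run app.py
Wait for it to print the URL.
Open that URL in a browser.
Click Analyze to run detection and get the result.
More detailed information is also printed in the terminal:
Finally, here is the GitHub repository for this project (discussion and suggestions are welcome):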
https://github.com/wjjweb1/yolov5_flask_seaship