References:
https://blog.csdn.net/qq_44876051/article/details/107665310?utm_medium=distribute.pc_relevant.none-task-blog-baidujs_baidulandingword-2&spm=1001.2101.3001.4242
https://www.cnblogs.com/wujianming-110117/p/13845974.html
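PyTorch implementation: https://github.com/bubbliiiing/yolov4-pytorch
Main modifications:
1. Reimplement the upsample operator using interpolate (modify yolo4.py)
2. Change how the model weights are loaded so that the upsample weights are skipped; otherwise loading raises an error (modify train.py)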
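The two modifications can be sketched roughly as follows. This is a minimal illustration, not the actual yolo4.py or train.py code: the class name `InterpolateUpsample`, the helper `filter_pretrained`, and the 1x1 convolution layout are assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class InterpolateUpsample(nn.Module):
    """Hypothetical upsample block: 1x1 conv, then F.interpolate instead of nn.Upsample."""

    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, x):
        x = self.conv(x)
        # Double the spatial size with nearest-neighbour interpolation.
        return F.interpolate(x, scale_factor=2, mode="nearest")


def filter_pretrained(pretrained_dict, model_dict):
    """Keep weights whose key is not an upsample key and whose shape matches the model."""
    return {
        k: v
        for k, v in pretrained_dict.items()
        if "upsample" not in k
        and k in model_dict
        and tuple(v.shape) == tuple(model_dict[k].shape)
    }


block = InterpolateUpsample(8, 4)
out = block(torch.zeros(1, 8, 13, 13))
print(out.shape)  # the spatial size doubles from 13x13 to 26x26
```

Filtering by key name and by shape means any layer that was renamed or resized simply keeps its fresh initialisation instead of raising an error during loading.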
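Model training and inference steps:
1. Download the code and the pretrained model, and prepare the data
2. Data preprocessing: process your own annotated data with json2xml.py, kmeans_for_anchors.py, voc2yolo4.py, and voc_annotation.py
3. Train and test the model (modify train.py and predict.py)
4. Compute mAP (modify get_dr_txt.py, get_gt_txt.py, and get_map.py; pay attention to the dataset format)
5. Convert the PyTorch model to an ONNX model (modify pt2onnx.py)
6. Test the pth model and the ONNX model (modify test_pth.py and test_onnx.py)
7. Convert the ONNX model to an om model (the atc command is given below)
8. Test the om model (modify the pyACL code; do not use aspect-preserving scaling)
9. Compare the results (take the om model's outputs and test them in test_om.py / test_om2.py to compare the local model with the Atlas model)
Note: test_om.py takes as input the om model's input data after DVPP + AIPP; test_om2.py takes the three feature maps.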
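Step 8 drops aspect-preserving (letterbox) scaling, which changes how predicted boxes are mapped from the 416x416 network input back to the original image. The sketch below contrasts the two mappings; the function names are hypothetical, and the arithmetic follows the usual direct-resize and letterbox conventions rather than any specific pyACL code.

```python
def map_box_direct(box, image_size, net_size=(416, 416)):
    """Map (xmin, ymin, xmax, ymax) from a directly resized input back to the original image."""
    iw, ih = image_size
    w, h = net_size
    xmin, ymin, xmax, ymax = box
    return (xmin / w * iw, ymin / h * ih, xmax / w * iw, ymax / h * ih)


def map_box_letterbox(box, image_size, net_size=(416, 416)):
    """Same mapping when the input was letterboxed (scaled uniformly, then padded)."""
    iw, ih = image_size
    w, h = net_size
    scale = min(w / iw, h / ih)
    nw, nh = int(iw * scale), int(ih * scale)
    pad_x, pad_y = (w - nw) // 2, (h - nh) // 2
    xmin, ymin, xmax, ymax = box
    return ((xmin - pad_x) / scale, (ymin - pad_y) / scale,
            (xmax - pad_x) / scale, (ymax - pad_y) / scale)


# An 832x416 image: direct resize squeezes the width, letterboxing pads top and bottom.
direct = map_box_direct((104, 104, 208, 208), (832, 416))
letterboxed = map_box_letterbox((104, 156, 208, 208), (832, 416))
print(direct, letterboxed)
```

Both calls describe the same region of the original image, so whichever preprocessing is used on the device, the post-processing has to apply the matching inverse.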
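Conclusions from the comparison:
1. The PyTorch and ONNX models were compared locally, and their results agree.
2. Capturing the om model's input and feeding it to the pth or ONNX model gives results close to the om model's actual output, which shows that the om conversion succeeded.
3. The om model's results are close to those of the local pth or ONNX model, which shows that the model migration succeeded.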
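"Close" can be made concrete by reporting the largest element-wise difference between two outputs. The helper below is a hedged sketch: `compare_outputs` is a hypothetical name, the tolerances mirror the `rtol=1e-03, atol=1e-05` used in the ONNX check later in this article, and the arrays are stand-in data rather than real model outputs.

```python
import numpy as np


def compare_outputs(reference, candidate, rtol=1e-3, atol=1e-5):
    """Summarise how far `candidate` is from `reference` (arrays of equal shape)."""
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    abs_diff = np.abs(reference - candidate)
    return {
        "max_abs_diff": float(abs_diff.max()),
        "mean_abs_diff": float(abs_diff.mean()),
        "within_tolerance": bool(np.allclose(candidate, reference, rtol=rtol, atol=atol)),
    }


# Stand-in data: in practice these would be one feature map from each model.
ref = np.linspace(-1.0, 1.0, 13 * 13).reshape(13, 13)
report = compare_outputs(ref, ref + 1e-6)
print(report)
```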
Data annotation -> YOLO format conversion -> anchor computation
For the detailed code steps, see the previous article: https://blog.csdn.net/gm_Ergou/article/details/118570318
train.txt
data/dataset2/coins/P00524-151911.jpg 1660,402,2145,894,0 2546,714,3025,1205,0 2929,1121,3408,1612,0 2175,1205,2642,1666,0 1840,2091,2307,2552,0 1444,870,1929,1349,1 2642,1923,3037,2331,1 941,1810,1301,2175,2 1762,1301,2139,1660,1
data/dataset2/coins/P00524-151918.jpg 923,534,1301,894,0 2786,343,3151,726,0 1882,798,2235,1157,0 1888,1666,2265,2043,0 2666,1780,3025,2139,0 2331,2001,2666,2355,0 1013,1450,1325,1768,1 989,2432,1277,2696,2 2738,965,3025,1253,1
data/dataset2/coins/P00524-151929.jpg 2678,295,3073,714,0 2211,558,2618,965,0 2450,1109,2822,1492,0 3265,1444,3648,1804,0 989,965,1373,1361,0 774,1552,1157,1911,0 1085,1995,1420,2313,0 1666,2121,2007,2450,0 1935,349,2247,678,1 1253,2355,1540,2618,1 1756,1540,2043,1792,2
data/dataset2/coins/P00524-151944.jpg 1444,355,1828,762,0 2402,343,2786,756,0 690,678,1085,1061,0 941,1193,1283,1540,0 1504,1265,1852,1588,0 1947,1241,2307,1564,0 2103,1660,2450,1959,0 989,2115,1283,2402,0 1756,2546,1995,2768,1 1001,2391,1277,2600,1 3049,917,3325,1193,2 1666,2091,1911,2307,2
data/dataset2/coins/P00524-151953.jpg 822,810,1205,1211,0 1283,642,1684,1037,0 1636,1049,2019,1397,0 1349,1349,1732,1660,0 1205,1911,1516,2211,0 1947,2097,2259,2379,0 2594,1067,2953,1420,0 2558,1738,2905,2049,0 3145,1349,3456,1660,1 774,2319,1049,2546,1 2450,630,2750,929,2 2211,1426,2474,1684,2
data/dataset2/coins/P00524-152004.jpg 1696,223,2145,702,0 2810,684,3289,1133,0 1301,1444,1684,1834,0 834,1522,1253,1899,0 1349,2391,1684,2714,0 2127,1013,2474,1349,1 630,1205,953,1540,1 2127,1468,2414,1744,2 1043,1888,1337,2139,2
data/dataset2/coins/P00524-152030.jpg 1211,732,1618,1073,0 2858,588,3229,917,0 2498,852,2888,1181,0 2187,1187,2618,1540,0 1756,1205,2175,1564,0 923,1540,1385,1971,0 666,2271,1085,2720,1 2043,798,2379,1061,1 1684,1995,2031,2367,2 3169,1492,3504,1792,2
data/dataset2/coins/P00524-152038.jpg 1780,355,2151,702,0 2714,726,3085,1085,0 1738,1229,2115,1612,0 2379,1876,2768,2283,0 1420,1876,1804,2259,0 1373,1397,1732,1762,0 1109,1235,1420,1540,1 2502,2299,2806,2635,1 2630,1229,2911,1504,2 2187,2385,2474,2690,2
data/dataset2/coins/P00524-152052.jpg 1977,582,2438,1025,0 2744,564,3193,1013,0 3241,1043,3708,1516,0 2570,989,3037,1432,0 2840,1738,3349,2235,0 2546,2241,3073,2768,0 1095,1307,1478,1684,1 1636,1289,2007,1684,1 1385,2067,1768,2474,2 2355,1756,2702,2097,2
data/dataset2/coins/P00524-152108.jpg 2235,229,2642,612,0 2534,702,2983,1109,0 2630,1193,3097,1636,0 2067,1426,2534,1876,0 2031,810,2474,1229,0 1929,2211,2402,2696,0 1522,612,1864,941,1 1636,1516,1995,1876,1 1636,1121,1947,1432,2 2474,1732,2810,2055,2
data/dataset2/coins/P00524-152122.jpg 708,714,1205,1091,0 1043,995,1540,1397,0 2786,852,3289,1259,0 1402,2163,2043,2822,0 2289,1732,2870,2283,0 2786,2067,3408,2690,0 2091,1277,2522,1636,1 756,1756,1229,2199,1 1756,738,2091,995,2 1612,1492,2019,1852,2
data/dataset2/coins/P00524-152144.jpg 438,1349,786,1732,0 1277,1109,1672,1468,0 1450,612,1828,947,0 1708,1061,2091,1444,0 1097,1804,1540,2235,0 2582,1636,3055,2091,0 2163,1013,2510,1307,1 941,1456,1259,1780,1 1684,1588,1983,1899,2 1947,1852,2265,2187,2
data/dataset2/coins/P00524-152155.jpg 1528,738,1953,1115,0 2870,941,3325,1337,0 2762,1426,3253,1876,0 2426,2295,2953,2840,0 1714,1379,2163,1804,0 1385,1732,1840,2187,0 810,1552,1163,1899,1 2295,1420,2666,1768,1 1474,253,1780,498,2 2163,307,2474,558,2
data/dataset2/coins/P00524-152208.jpg 822,498,1253,798,0 1253,684,1732,989,0 2450,965,2941,1325,0 989,1145,1516,1540,0 1756,1474,2313,1953,0 2019,1947,2666,2546,0 1157,1762,1684,2211,1 2834,1792,3337,2265,1 1720,606,2043,798,2 2073,714,2414,965,2
data/dataset2/coins/P00524-152213.jpg 678,2139,1325,2726,0 2379,1738,2929,2247,0 2894,1379,3397,1810,0 2169,1139,2648,1516,0 2534,702,2953,995,0 1504,618,1923,894,0 1337,1995,1828,2450,1 2642,1085,3001,1349,1 2283,822,2594,1061,2 1522,438,1804,606,2
data/dataset2/coins/P00524-152222.jpg 2365,544,2748,874,0 1025,1636,1420,2007,0 1660,1642,2043,2043,0 2283,1444,2690,1834,0 2570,1816,2977,2235,0 1690,2343,2115,2786,0 1876,1253,2187,1564,1 965,2313,1307,2666,1 1397,391,1684,630,2 1019,870,1277,1115,2
data/dataset2/coins/P00524-152244.jpg 1426,870,1852,1307,0 2426,564,2840,995,0 2690,1115,3121,1540,0 1402,1624,1804,2043,0 1905,1971,2331,2402,0 1814,1546,2149,1882,1 1331,2187,1684,2546,1 2498,1444,2816,1756,2 1133,1492,1450,1804,2
data/dataset2/coins/P00524-152253.jpg 1660,235,2031,630,0 2450,929,2810,1307,0 750,1516,1133,1888,0 1408,1397,1804,1780,0 2600,1301,2983,1666,0 2570,2385,2935,2786,0 1971,1642,2283,1947,1 2726,1828,3037,2139,1 1636,1145,1911,1432,2 2283,2067,2558,2343,2
data/dataset2/coins/P00524-152300.jpg 2331,379,2696,750,0 2199,756,2570,1109,0 1852,929,2211,1289,0 1540,1163,1899,1516,0 1109,1468,1498,1834,0 1899,1923,2265,2289,0 2846,1277,3157,1564,1 1911,1379,2211,1660,1 3121,1852,3385,2139,2 1283,1816,1564,2091,2
data/dataset2/coins/P00524-152329.jpg 1792,211,2223,654,0 1115,989,1570,1444,0 971,1684,1432,2175,0 1804,1612,2259,2073,0 2762,1552,3217,1977,0 2271,2402,2690,2840,0 2067,1109,2426,1468,1 1355,1397,1732,1756,1 2582,1253,2911,1588,2 2624,2019,2989,2343,2
data/dataset2/coins/P00524-152335.jpg 1720,247,2151,678,0 1133,1253,1540,1708,0 1780,1301,2187,1732,0 2259,1696,2666,2139,0 1037,1899,1450,2355,0 1876,2247,2307,2678,0 2139,1025,2480,1349,1 1516,1995,1882,2331,1 534,1349,846,1684,2 2917,1708,3265,2031,2
data/dataset2/coins/P00524-152343.jpg 1738,349,2139,750,0 810,654,1253,1085,0 582,1474,1061,1923,0 1840,1408,2271,1858,0 1642,1995,2115,2474,0 3289,1492,3702,1905,0 2163,905,2480,1229,1 1241,1061,1612,1408,1 953,1929,1325,2283,2 2576,870,2894,1205,2
data/dataset2/coins/P00524-152351.jpg 1259,343,1672,798,0 2642,582,3025,1013,0 3181,1504,3576,1882,0 2283,1576,2666,1971,0 1301,1313,1714,1744,0 397,2283,822,2690,0 1073,899,1444,1253,1 2025,1061,2367,1397,1 3325,1181,3624,1468,2 1720,1546,2019,1852,2
data/dataset2/coins/P00524-152401.jpg 1253,486,1684,923,0 2103,462,2546,894,0 1450,894,1876,1325,0 1947,1133,2367,1570,0 1905,1600,2337,2019,0 965,1905,1397,2355,0 1037,882,1397,1235,1 1037,1528,1402,1876,1 3181,630,3492,965,2 1540,2199,1858,2546,2
data/dataset2/coins/P00524-152409.jpg 1163,564,1660,1061,0 1923,103,2355,570,0 2313,349,2696,822,0 2666,804,3013,1241,0 965,1756,1474,2283,0 2534,1564,2905,2007,0 582,510,1037,965,1 1504,1444,1876,1828,1 2259,2211,2570,2546,2 3001,582,3265,894,2
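Each line of train.txt holds an image path followed by space-separated boxes. The sketch below parses one line, assuming each box is `xmin,ymin,xmax,ymax,class_id`; the function name `parse_annotation_line` is hypothetical, and the sample line is a shortened copy of the first entry above.

```python
def parse_annotation_line(line):
    """Split one train.txt line into the image path and a list of boxes."""
    parts = line.split()
    image_path = parts[0]
    boxes = []
    for field in parts[1:]:
        xmin, ymin, xmax, ymax, class_id = (int(v) for v in field.split(","))
        boxes.append((xmin, ymin, xmax, ymax, class_id))
    return image_path, boxes


line = "data/dataset2/coins/P00524-151911.jpg 1660,402,2145,894,0 1444,870,1929,1349,1"
path, boxes = parse_annotation_line(line)
print(path, boxes)
```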
coco_anchors.names
12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146, 142, 110, 192, 243, 459, 401
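The anchor line is read as nine (width, height) pairs, grouped three per detection head and reversed so that the largest anchors come first. The pure-Python sketch below reproduces what `np.array(anchors).reshape([-1,3,2])[::-1,:,:]` in `get_anchors` yields; `group_anchors` is a hypothetical name, and pairing group 0 with the coarsest feature map is an assumption about the repository's convention.

```python
def group_anchors(text):
    """Parse a comma-separated anchor line into three groups of (width, height), largest first."""
    values = [float(v) for v in text.split(",")]
    pairs = list(zip(values[0::2], values[1::2]))
    groups = [pairs[i:i + 3] for i in range(0, len(pairs), 3)]
    return groups[::-1]


anchor_line = "12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146, 142, 110, 192, 243, 459, 401"
groups = group_anchors(anchor_line)
for g in groups:
    print(g)
```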
coins.names
1yuan
5jiao
1jiao
Test images: taken from the dataset
# -*- coding: utf-8 -*-
#-------------------------------------#
#   Train on the dataset
#-------------------------------------#
import os
import time
import numpy as np
import torch
import torch.backends.cudnn as cudnn
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.autograd import Variable
from torch.utils.data import DataLoader
from tqdm import tqdm
from yolo4 import YoloBody
from yolo_training import Generator, YOLOLoss
from dataloader import YoloDataset, yolo_dataset_collate

#---------------------------------------------------#
#   Load the class names and the anchor boxes
#---------------------------------------------------#
def get_classes(classes_path):
    '''loads the classes'''
    with open(classes_path) as f:
        class_names = f.readlines()
    class_names = [c.strip() for c in class_names]
    return class_names

def get_anchors(anchors_path):
    '''loads the anchors from a file'''
    with open(anchors_path) as f:
        anchors = f.readline()
    anchors = [float(x) for x in anchors.split(',')]
    return np.array(anchors).reshape([-1,3,2])[::-1,:,:]

def get_lr(optimizer):
    for param_group in optimizer.param_groups:
        return param_group['lr']
# Note: fit_one_epoch relies on the module-level `optimizer` and `model` created in the __main__ block.
def fit_one_epoch(net,yolo_losses,epoch,epoch_size,epoch_size_val,gen,genval,Epoch,cuda):
    total_loss = 0
    val_loss = 0
    net.train()
    with tqdm(total=epoch_size,desc=f'Epoch {epoch + 1}/{Epoch}',postfix=dict,mininterval=0.3) as pbar:
        for iteration, batch in enumerate(gen):
            if iteration >= epoch_size:
                break
            images, targets = batch[0], batch[1]
            with torch.no_grad():
                if cuda:
                    images = Variable(torch.from_numpy(images).type(torch.FloatTensor)).cuda()
                    targets = [Variable(torch.from_numpy(ann).type(torch.FloatTensor)) for ann in targets]
                else:
                    images = Variable(torch.from_numpy(images).type(torch.FloatTensor))
                    targets = [Variable(torch.from_numpy(ann).type(torch.FloatTensor)) for ann in targets]
            #----------------------#
            #   Zero the gradients
            #----------------------#
            optimizer.zero_grad()
            #----------------------#
            #   Forward pass
            #----------------------#
            outputs = net(images)
            losses = []
            num_pos_all = 0
            #----------------------#
            #   Compute the loss
            #----------------------#
            for i in range(3):
                loss_item, num_pos = yolo_losses[i](outputs[i], targets)
                losses.append(loss_item)
                num_pos_all += num_pos
            loss = sum(losses) / num_pos_all
            #----------------------#
            #   Backward pass
            #----------------------#
            loss.backward()
            optimizer.step()
            total_loss += loss.item()
            pbar.set_postfix(**{'total_loss': total_loss / (iteration + 1), 'lr' : get_lr(optimizer)})
            pbar.update(1)
    net.eval()
    print('Start Validation')
    with tqdm(total=epoch_size_val, desc=f'Epoch {epoch + 1}/{Epoch}',postfix=dict,mininterval=0.3) as pbar:
        for iteration, batch in enumerate(genval):
            if iteration >= epoch_size_val:
                break
            images_val, targets_val = batch[0], batch[1]
            # The whole validation step runs without gradient tracking
            with torch.no_grad():
                if cuda:
                    images_val = Variable(torch.from_numpy(images_val).type(torch.FloatTensor)).cuda()
                    targets_val = [Variable(torch.from_numpy(ann).type(torch.FloatTensor)) for ann in targets_val]
                else:
                    images_val = Variable(torch.from_numpy(images_val).type(torch.FloatTensor))
                    targets_val = [Variable(torch.from_numpy(ann).type(torch.FloatTensor)) for ann in targets_val]
                optimizer.zero_grad()
                outputs = net(images_val)
                losses = []
                num_pos_all = 0
                for i in range(3):
                    loss_item, num_pos = yolo_losses[i](outputs[i], targets_val)
                    losses.append(loss_item)
                    num_pos_all += num_pos
                loss = sum(losses) / num_pos_all
                val_loss += loss.item()
            pbar.set_postfix(**{'total_loss': val_loss / (iteration + 1)})
            pbar.update(1)
    print('Finish Validation')
    print('Epoch:'+ str(epoch+1) + '/' + str(Epoch))
    print('Total Loss: %.4f || Val Loss: %.4f ' % (total_loss/(epoch_size+1),val_loss/(epoch_size_val+1)))
    # Save the whole model (not just the state dict) every 20 epochs
    if (epoch+1)%20==0:
        print('Saving state, iter:', str(epoch+1))
        # torch.save(model.state_dict(), 'data/model3/Epoch%d-Total_Loss%.4f-Val_Loss%.4f.pth'%((epoch+1),total_loss/(epoch_size+1),val_loss/(epoch_size_val+1)))
        torch.save(model, 'data/model1/Epoch%d-Total_Loss%.4f-Val_Loss%.4f.pth'%((epoch+1),total_loss/(epoch_size+1),val_loss/(epoch_size_val+1)))
if __name__ == "__main__":
    Cuda = False
    # Whether to use the DataLoader
    Use_Data_Loader = True
    normalize = False
    input_shape = (416,416)
    anchors_path = 'data/dataset2/coco_anchors.names'
    classes_path = 'data/dataset2/coins.names'
    # Load the classes and the anchors
    class_names = get_classes(classes_path)
    anchors = get_anchors(anchors_path)
    num_classes = len(class_names)
    print("class_num", num_classes)
    #------------------------------------------------------#
    mosaic = False          # mosaic data augmentation; it proved unstable in practice, so it defaults to False
    Cosine_lr = False       # cosine-annealing learning-rate schedule, True or False
    smoooth_label = 0       # label smoothing; usually 0.01 or lower, e.g. 0.01 or 0.005
    #------------------------------------------------------#
    model_path = "data/model/yolo4_weights.pth"
    print('Loading weights into state dict...')
    model = YoloBody(len(anchors[0]), num_classes)
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model_dict = model.state_dict()
    pretrained_dict = torch.load(model_path, map_location=device)
    # upsample was reimplemented, so its pretrained keys no longer match the model; skip them when loading
    pretrained_dict = {k: v for k, v in pretrained_dict.items()
                       if 'upsample' not in k and k in model_dict and np.shape(model_dict[k]) == np.shape(v)}
    # # Original weight-loading method
    # pretrained_dict = {k: v for k, v in pretrained_dict.items() if np.shape(model_dict[k]) == np.shape(v)}
    model_dict.update(pretrained_dict)
    model.load_state_dict(model_dict)
    # print(model)
    net = model.train()
    if Cuda:
        net = torch.nn.DataParallel(model)
        cudnn.benchmark = True
        net = net.cuda()
    # Build the loss functions
    yolo_losses = []
    for i in range(3):
        yolo_losses.append(YOLOLoss(np.reshape(anchors,[-1,2]),num_classes, \
                                    (input_shape[1], input_shape[0]), smoooth_label, Cuda, normalize))
    #-----------------dataset------------------------#
    annotation_path = 'data/dataset2/coins/train.txt'
    val_split = 0.1
    with open(annotation_path) as f:
        lines = f.readlines()
    np.random.seed(10101)
    np.random.shuffle(lines)
    np.random.seed(None)
    num_val = int(len(lines)*val_split)
    num_train = len(lines) - num_val
    #------------------------------------------------------#
    #   The backbone features are generic, so freezing the backbone speeds up training
    #   and also protects the weights early in training.
    #   Init_Epoch is the starting epoch
    #   Freeze_Epoch is the last epoch of this training stage
    #   Epoch is the total number of training epochs
    #   If you hit OOM or run out of GPU memory, reduce Batch_size
    #------------------------------------------------------#
    if True:
        lr = 1e-3
        Batch_size = 2
        Init_Epoch = 0
        Freeze_Epoch = 200
        optimizer = optim.Adam(net.parameters(),lr)
        if Cosine_lr:
            lr_scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=5, eta_min=1e-5)
        else:
            lr_scheduler = optim.lr_scheduler.StepLR(optimizer,step_size=1,gamma=0.92)
        if Use_Data_Loader:
            train_dataset = YoloDataset(lines[:num_train], (input_shape[0], input_shape[1]), mosaic=mosaic, is_train=True)
            val_dataset = YoloDataset(lines[num_train:], (input_shape[0], input_shape[1]), mosaic=False, is_train=False)
            gen = DataLoader(train_dataset, shuffle=True, batch_size=Batch_size, num_workers=4, pin_memory=True,
                             drop_last=True, collate_fn=yolo_dataset_collate)
            gen_val = DataLoader(val_dataset, shuffle=True, batch_size=Batch_size, num_workers=4,pin_memory=True,
                                 drop_last=True, collate_fn=yolo_dataset_collate)
        else:
            gen = Generator(Batch_size, lines[:num_train],
                            (input_shape[0], input_shape[1])).generate(train=True, mosaic = mosaic)
            gen_val = Generator(Batch_size, lines[num_train:],
                                (input_shape[0], input_shape[1])).generate(train=False, mosaic = mosaic)
        #------------------------------------#
        #   Backbone freezing switch: requires_grad = True keeps the backbone trainable;
        #   set it to False to actually freeze the backbone
        #------------------------------------#
        for param in model.backbone.parameters():
            param.requires_grad = True
        epoch_size = max(1, num_train//Batch_size)
        epoch_size_val = num_val//Batch_size
        for epoch in range(Init_Epoch,Freeze_Epoch):
            fit_one_epoch(net,yolo_losses,epoch,epoch_size,epoch_size_val,gen,gen_val,Freeze_Epoch,Cuda)
            lr_scheduler.step()
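With `StepLR(step_size=1, gamma=0.92)` the learning rate is multiplied by 0.92 after every epoch, so it follows the closed form lr * 0.92^epoch. The short sketch below evaluates that formula for the settings used in the script (lr = 1e-3, 200 epochs); `step_lr` is a hypothetical helper, not part of PyTorch.

```python
def step_lr(initial_lr, gamma, epoch):
    """Learning rate after `epoch` calls to StepLR.step() with step_size=1."""
    return initial_lr * gamma ** epoch


for epoch in (0, 1, 10, 50, 200):
    print(epoch, step_lr(1e-3, 0.92, epoch))
```

By epoch 50 the rate is already on the order of 1e-5, and by epoch 200 it is far below 1e-9, so with these settings the later epochs barely change the weights.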
from yolo4 import YoloBody  # needed so torch.load can unpickle the saved model object
import torch
from torchvision import transforms
import cv2
import numpy as np
from utils import (DecodeBox, bbox_iou, letterbox_image,non_max_suppression, yolo_correct_boxes)
import os
import colorsys
from PIL import Image, ImageDraw, ImageFont

def get_class(classes_path):
    classes_path = os.path.expanduser(classes_path)
    with open(classes_path) as f:
        class_names = f.readlines()
    class_names = [c.strip() for c in class_names]
    return class_names

def get_anchors(anchors_path):
    anchors_path = os.path.expanduser(anchors_path)
    with open(anchors_path) as f:
        anchors = f.readline()
    anchors = [float(x) for x in anchors.split(',')]
    return np.array(anchors).reshape([-1, 3, 2])[::-1,:,:]

def get_letterbox_image(image, size):
    iw, ih = image.size
    w, h = size
    scale = min(w/iw, h/ih)
    nw = int(iw*scale)
    nh = int(ih*scale)
    image = image.resize((nw,nh), Image.BICUBIC)
    new_image = Image.new('RGB', size, (128,128,128))
    new_image.paste(image, ((w-nw)//2, (h-nh)//2))
    return new_image

confidence=0.5
# Note: this flag rebinds the name letterbox_image imported from utils;
# the script uses get_letterbox_image for the actual letterboxing.
letterbox_image=False
anchors_path='data/dataset2/coco_anchors.names'
classes_path='data/dataset2/coins.names'
model_path="data/model1/test.pth"
class_names = get_class(classes_path)
# Use a different box colour for each class
hsv_tuples = [(x / len(class_names), 1., 1.)
              for x in range(len(class_names))]
colors = list(map(lambda x: colorsys.hsv_to_rgb(*x), hsv_tuples))
colors = list(map(lambda x: (int(x[0] * 255), int(x[1] * 255), int(x[2] * 255)),colors))
def result(outputs, image):
    # Model post-processing: decode each of the three heads, then run NMS
    output_list = []
    for i in range(3):
        print(type(outputs[i]),outputs[i].shape)
        decodeBox=DecodeBox(get_anchors(anchors_path)[i], len(class_names), (416, 416))
        decoded = decodeBox(outputs[i])
        output_list.append(decoded)
        print(outputs[i].size(),decoded.shape)
    output = torch.cat(output_list, 1)
    batch_detections = non_max_suppression(output, len(class_names), conf_thres=confidence, nms_thres=0.3)
    print(output.shape,batch_detections)
    try:
        batch_detections = batch_detections[0].cpu().numpy()
    except Exception:
        # Nothing was detected: return the image unchanged
        return image
    # Process the detection boxes
    top_index = batch_detections[:,4] * batch_detections[:,5] > confidence
    top_conf = batch_detections[top_index,4]*batch_detections[top_index,5]
    top_label = np.array(batch_detections[top_index,-1],np.int32)
    top_bboxes = np.array(batch_detections[top_index,:4])
    top_xmin, top_ymin, top_xmax, top_ymax = np.expand_dims(top_bboxes[:,0],-1),np.expand_dims(top_bboxes[:,1],-1),np.expand_dims(top_bboxes[:,2],-1),np.expand_dims(top_bboxes[:,3],-1)
    #-----------------------------------------------------------------#
    #   With letterboxing, grey bars are added around the image before it
    #   is fed to the network, so top_bboxes are relative to the padded image
    #   and the grey-bar offset has to be removed.
    #   Without letterboxing, the boxes are simply rescaled to the original size.
    #-----------------------------------------------------------------#
    image_shape = np.array(np.shape(image)[0:2])
    if letterbox_image:
        boxes = yolo_correct_boxes(top_ymin,top_xmin,top_ymax,top_xmax,np.array([416,416]),image_shape)
    else:
        top_xmin = top_xmin / 416 * image_shape[1]
        top_ymin = top_ymin / 416 * image_shape[0]
        top_xmax = top_xmax / 416 * image_shape[1]
        top_ymax = top_ymax / 416 * image_shape[0]
        boxes = np.concatenate([top_ymin,top_xmin,top_ymax,top_xmax], axis=-1)
    # font = ImageFont.truetype(font='/usr/share/fonts/truetype/lyx/cmr10.ttf',size=np.floor(3e-2 * np.shape(image)[1] + 0.5).astype('int32'))
    font = ImageFont.truetype(font='data/simhei.ttf',size=np.floor(3e-2 * np.shape(image)[1] + 0.5).astype('int32'))
    thickness = max((np.shape(image)[0] + np.shape(image)[1]) // 416, 1)
    for i, c in enumerate(top_label):
        predicted_class = class_names[c]
        score = top_conf[i]
        top, left, bottom, right = boxes[i]
        top = top - 5
        left = left - 5
        bottom = bottom + 5
        right = right + 5
        top = max(0, np.floor(top + 0.5).astype('int32'))
        left = max(0, np.floor(left + 0.5).astype('int32'))
        bottom = min(np.shape(image)[0], np.floor(bottom + 0.5).astype('int32'))
        right = min(np.shape(image)[1], np.floor(right + 0.5).astype('int32'))
        # Draw the box and its label
        label = '{} {:.2f}'.format(predicted_class, score)
        draw = ImageDraw.Draw(image)
        # textsize was removed in Pillow 10; newer versions provide textbbox instead
        label_size = draw.textsize(label, font)
        label = label.encode('utf-8')
        print(label, top, left, bottom, right)
        if top - label_size[1] >= 0:
            text_origin = np.array([left, top - label_size[1]])
        else:
            text_origin = np.array([left, top + 1])
        # Draw nested rectangles to get the desired line thickness
        for t in range(thickness):
            draw.rectangle(
                [left + t, top + t, right - t, bottom - t],
                outline=colors[class_names.index(predicted_class)])
        draw.rectangle(
            [tuple(text_origin), tuple(text_origin + label_size)],
            fill=colors[class_names.index(predicted_class)])
        draw.text(text_origin, str(label,'UTF-8'), fill=(0, 0, 0), font=font)
    # image.show()
    return image
def prediect(img):
    # Load the model
    device = torch.device('cpu')
    model=torch.load(model_path, map_location=device)
    model=model.to(device)
    model.eval()
    # Run inference without tracking gradients
    # img = torch.from_numpy(img)
    img = torch.tensor(img, dtype=torch.float32)
    with torch.no_grad():
        outputs = model(img)
    return outputs

def get_imgges(image):
    # Preprocess the image
    image_shape = np.array(np.shape(image)[0:2])
    print(type(image))
    if letterbox_image:
        crop_img = np.array(get_letterbox_image(image, (416,416)))
    else:
        crop_img = image.convert('RGB')
        crop_img = crop_img.resize((416,416), Image.BICUBIC)
    photo = np.array(crop_img,dtype = np.float32) / 255.0
    photo = np.transpose(photo, (2, 0, 1))
    img = [photo]
    img=np.asarray(img)
    return img

if __name__ == '__main__':
    img_path="data/img/test1.jpg"
    image = Image.open(img_path)
    img=get_imgges(image)
    outputs=prediect(img)
    result(outputs, image)
import os
from PIL import Image, ImageDraw, ImageFont
from post_process import *
img_path="data/img/test1.jpg"
image = Image.open(img_path)
img=get_imgges(image)
outputs=prediect(img)
result(outputs, image)
print(outputs[0].shape, outputs[1].shape, outputs[2].shape)
import torch
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Load the PyTorch model onto the same device as the dummy input
model = torch.load("data/model1/test.pth", map_location=device)
model.eval()
input_shape=list(map(int, "1,3,416,416".split(",")))
x = torch.randn(input_shape)  # dummy input tensor
x = x.to(device)
export_onnx_file = "data/model1/test.onnx"  # output ONNX file name
#torch.onnx.export(model, x, export_onnx_file, verbose=True)
torch.onnx.export(model, x, export_onnx_file, verbose=True, export_params=True, do_constant_folding=True, opset_version=11)
import cv2
import numpy as np
import onnxruntime as rt
import torch
from PIL import Image
from post_process import *

def to_numpy(tensor):
    return tensor.detach().cpu().numpy() if tensor.requires_grad else tensor.cpu().numpy()

def onnx_runtime(img):
    sess = rt.InferenceSession("data/model1/test.onnx")
    inputs = {sess.get_inputs()[0].name: img}
    output = sess.run(None, inputs)
    outputs=[]
    for i in range(3):
        outputs.append(torch.from_numpy(output[i]))
    outputs=tuple(outputs)
    return outputs

img_path="data/img/test1.jpg"
image = Image.open(img_path)
img=get_imgges(image)
print(img.shape)
outputs=onnx_runtime(img)
result(outputs, image)
print(type(outputs), outputs[0].shape, outputs[1].shape, outputs[2].shape)
import numpy as np
import torch
import onnx
import onnxruntime as rt
import pickle
# Test data
x = torch.randn(1,3,416,416, requires_grad=False)
# Check the ONNX model with the ONNX API
onnx_model = onnx.load("data/model1/test.onnx")
onnx.checker.check_model(onnx_model)
# Run the ONNX model
sess = rt.InferenceSession("data/model1/test.onnx")

def to_numpy(tensor):
    return tensor.detach().cpu().numpy() if tensor.requires_grad else tensor.cpu().numpy()

# Collect the outputs
ort_inputs = {sess.get_inputs()[0].name: to_numpy(x)}
ort_outs = sess.run(None, ort_inputs)
print(x.shape, ort_outs[0].shape)
# Run the PyTorch model
model=torch.load("data/model1/test.pth",map_location='cpu')
model.eval()
torch_out = model(x)
print(x.shape, torch_out[0].shape)
# Compare the ONNX and PyTorch results (first output head only)
np.testing.assert_allclose(to_numpy(torch_out[0]), ort_outs[0], rtol=1e-03, atol=1e-05)
print("No significant difference between the models!")
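Here the YOLOv3-style post-processing is used instead: the inference result is cut off at the three output heads (large, medium, and small), and these outputs are then fed to the decode and NMS stages. See: https://blog.csdn.net/gm_Ergou/article/details/118573834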
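For orientation, the decode stage turns each raw prediction into a box in input-image pixels. The sketch below shows the usual YOLOv3/YOLOv4 decoding for a single grid cell, which is what the visible part of `DecodeBox2` in the following script appears to set up (sigmoid on x and y, raw w and h, per-cell grid offsets, anchors). The function `decode_cell` and its argument names are hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function for a single float."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_cell(tx, ty, tw, th, col, row, stride, anchor):
    """Turn one raw prediction into (cx, cy, w, h) in input-image pixels."""
    anchor_w, anchor_h = anchor
    cx = (sigmoid(tx) + col) * stride   # centre x: cell offset plus sigmoid shift
    cy = (sigmoid(ty) + row) * stride   # centre y
    w = anchor_w * math.exp(tw)         # width scales the anchor exponentially
    h = anchor_h * math.exp(th)         # height likewise
    return cx, cy, w, h


# Centre cell of the 13x13 head (stride 416 / 13 = 32) with the 142x110 anchor.
box = decode_cell(0.0, 0.0, 0.0, 0.0, col=6, row=6, stride=32, anchor=(142, 110))
print(box)
```

With all raw values at zero, the box sits at the centre of its cell and has exactly the anchor's size, which makes a convenient sanity check when porting the decode step to another runtime.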
import cv2
import numpy as np
import os
import colorsys
from PIL import Image, ImageDraw, ImageFont
def get_class(classes_path):
classes_path = os.path.expanduser(classes_path)
with open(classes_path) as f:
class_names = f.readlines()
class_names = [c.strip() for c in class_names]
return class_names
def get_anchors(anchors_path):
anchors_path = os.path.expanduser(anchors_path)
with open(anchors_path) as f:
anchors = f.readline()
anchors = [float(x) for x in anchors.split(',')]
return np.array(anchors).reshape([-1, 3, 2])[::-1,:,:]
def sigmoid(x):
x_ravel = x.ravel() # 将numpy数组展平
length = len(x_ravel)
y = []
for index in range(length):
if x_ravel[index] >= 0:
y.append(1.0 / (1 + np.exp(-x_ravel[index])))
else:
y.append(np.exp(x_ravel[index]) / (np.exp(x_ravel[index]) + 1))
return np.array(y).reshape(x.shape)
def letterbox_image(image, size):
    iw, ih = image.size
    w, h = size
    scale = min(w/iw, h/ih)
    nw = int(iw*scale)
    nh = int(ih*scale)
    image = image.resize((nw,nh), Image.BICUBIC)
    new_image = Image.new('RGB', size, (128,128,128))
    new_image.paste(image, ((w-nw)//2, (h-nh)//2))
    return new_image

# Data preprocessing
def get_imgges(image, letterbox):
    if letterbox:
        # Keep the aspect ratio and pad with gray
        crop_img = np.array(letterbox_image(image, (416,416)))
    else:
        # Plain resize without keeping the aspect ratio
        crop_img = image.convert('RGB')
        crop_img = crop_img.resize((416,416), Image.BICUBIC)
    photo = np.array(crop_img, dtype=np.float32) / 255.0
    photo = np.transpose(photo, (2, 0, 1))  # HWC -> CHW
    img = [photo]
    img = np.asarray(img)  # add the batch dimension
    return img
def DecodeBox2(anchors, num_classes, img_size, input):
    num_anchors = len(anchors)
    bbox_attrs = 5 + num_classes
    batch_size = input.shape[0]
    input_height = input.shape[2]
    input_width = input.shape[3]
    # print(batch_size, input_height, input_width, input.shape)
    stride_h = img_size[1] / input_height
    stride_w = img_size[0] / input_width
    # Express the anchors in feature-map (grid cell) units
    scaled_anchors = [(anchor_width / stride_w, anchor_height / stride_h) for anchor_width, anchor_height in anchors]
    # PyTorch original: prediction = input.view(batch_size, num_anchors, bbox_attrs, input_height, input_width).permute(0, 1, 3, 4, 2).contiguous()
    a = input.reshape(batch_size, num_anchors, bbox_attrs, input_height, input_width).transpose(0, 1, 3, 4, 2)
    prediction = np.copy(a)
    # print(prediction, prediction.shape)
    # Adjustment parameters for the anchor box center
    x = sigmoid(prediction[..., 0])
    y = sigmoid(prediction[..., 1])
    # Adjustment parameters for the anchor box width and height
    w = prediction[..., 2]
    h = prediction[..., 3]
    # Objectness confidence (whether an object is present)
    conf = sigmoid(prediction[..., 4])
    # Per-class confidence
    pred_cls = sigmoid(prediction[..., 5:])
    # Build the grid; anchor centers start at the top-left corner of each grid cell
    grid_x = np.linspace(0, input_width - 1, input_width)
    grid_x = np.tile(np.tile(grid_x, (input_height, 1)), (batch_size * num_anchors, 1, 1))
    grid_x = grid_x.reshape(x.shape).astype(np.float16)
    grid_y = np.linspace(0, input_height - 1, input_height)
    grid_y = np.tile(np.tile(grid_y, (input_width, 1)).T, (batch_size * num_anchors, 1, 1))
    grid_y = grid_y.reshape(y.shape).astype(np.float16)
    # print(grid_y, grid_y.shape)
    # Broadcast the anchor widths and heights to the grid layout
    anchor_w = np.array(scaled_anchors).astype(np.float16)[:,0].reshape(len(scaled_anchors),1) # len(scaled_anchors)=3
    anchor_h = np.array(scaled_anchors).astype(np.float16)[:,1].reshape(len(scaled_anchors),1)
    anchor_w = np.tile(np.tile(anchor_w, (batch_size, 1)), (1, 1, input_height * input_width)).reshape(w.shape)
    anchor_h = np.tile(np.tile(anchor_h, (batch_size, 1)), (1, 1, input_height * input_width)).reshape(h.shape)
    # print(anchor_w,anchor_h)
    # print(anchor_w.shape, anchor_h.shape)
    #----------------------------------------------------------#
    # Adjust the anchor boxes with the predictions:
    # first shift the center from the grid cell's top-left corner
    # towards the bottom right, then rescale the width and height.
    #----------------------------------------------------------#
    pred_boxes = np.zeros(shape=prediction[..., :4].shape)
    # The arrays are plain NumPy here, so the torch-style ".data" is not needed
    pred_boxes[..., 0] = x + grid_x
    pred_boxes[..., 1] = y + grid_y
    pred_boxes[..., 2] = np.exp(w) * anchor_w
    pred_boxes[..., 3] = np.exp(h) * anchor_h
    # print(pred_boxes)
    #----------------------------------------------------------#
    # Scale the boxes back to the size of the network input
    #----------------------------------------------------------#
    _scale = np.array([stride_w, stride_h] * 2).astype(np.float16)
    output = np.concatenate((pred_boxes.reshape(batch_size, -1, 4) * _scale,
                             conf.reshape(batch_size, -1, 1), pred_cls.reshape(batch_size, -1, num_classes)), -1)
    return output
def yolo_correct_boxes(top, left, bottom, right, input_shape, image_shape):
    new_shape = image_shape*np.min(input_shape/image_shape)
    offset = (input_shape-new_shape)/2./input_shape
    scale = input_shape/new_shape
    box_yx = np.concatenate(((top+bottom)/2,(left+right)/2),axis=-1)/input_shape
    box_hw = np.concatenate((bottom-top,right-left),axis=-1)/input_shape
    box_yx = (box_yx - offset) * scale
    box_hw *= scale
    box_mins = box_yx - (box_hw / 2.)
    box_maxes = box_yx + (box_hw / 2.)
    boxes = np.concatenate([
        box_mins[:, 0:1],
        box_mins[:, 1:2],
        box_maxes[:, 0:1],
        box_maxes[:, 1:2]
    ],axis=-1)
    boxes *= np.concatenate([image_shape, image_shape],axis=-1)
    return boxes
def bbox_iou2(box1, box2, x1y1x2y2=True):
    """
    Compute the IoU between box1 and every box in box2.
    """
    if not x1y1x2y2:
        # Convert (cx, cy, w, h) to corner coordinates
        b1_x1, b1_x2 = box1[:, 0] - box1[:, 2] / 2, box1[:, 0] + box1[:, 2] / 2
        b1_y1, b1_y2 = box1[:, 1] - box1[:, 3] / 2, box1[:, 1] + box1[:, 3] / 2
        b2_x1, b2_x2 = box2[:, 0] - box2[:, 2] / 2, box2[:, 0] + box2[:, 2] / 2
        b2_y1, b2_y2 = box2[:, 1] - box2[:, 3] / 2, box2[:, 1] + box2[:, 3] / 2
    else:
        b1_x1, b1_y1, b1_x2, b1_y2 = box1[:, 0], box1[:, 1], box1[:, 2], box1[:, 3]
        b2_x1, b2_y1, b2_x2, b2_y2 = box2[:, 0], box2[:, 1], box2[:, 2], box2[:, 3]
    inter_rect_x1 = np.maximum(b1_x1, b2_x1)
    inter_rect_y1 = np.maximum(b1_y1, b2_y1)
    inter_rect_x2 = np.minimum(b1_x2, b2_x2)
    inter_rect_y2 = np.minimum(b1_y2, b2_y2)
    data1 = inter_rect_x2 - inter_rect_x1 + 1
    data2 = inter_rect_y2 - inter_rect_y1 + 1
    # Clamp negative extents to 0. Using max(data) as the upper bound breaks when
    # every extent is negative: two negative values would multiply to a positive area.
    inter_area = np.maximum(data1, 0) * np.maximum(data2, 0)
    b1_area = (b1_x2 - b1_x1 + 1) * (b1_y2 - b1_y1 + 1)
    b2_area = (b2_x2 - b2_x1 + 1) * (b2_y2 - b2_y1 + 1)
    iou = inter_area / (b1_area + b2_area - inter_area + 1e-16)
    return iou
def non_max_suppression2(prediction, num_classes, conf_thres=0.5, nms_thres=0.3):
    # Convert (cx, cy, w, h) to corners (x1, y1, x2, y2); prediction is modified in place
    box_corner = np.zeros(shape=prediction.shape)
    box_corner[:, :, 0] = prediction[:, :, 0] - prediction[:, :, 2] / 2
    box_corner[:, :, 1] = prediction[:, :, 1] - prediction[:, :, 3] / 2
    box_corner[:, :, 2] = prediction[:, :, 0] + prediction[:, :, 2] / 2
    box_corner[:, :, 3] = prediction[:, :, 1] + prediction[:, :, 3] / 2
    prediction[:, :, :4] = box_corner[:, :, :4]
    output = [None for _ in range(len(prediction))]
    for image_i, image_pred in enumerate(prediction):
        data = image_pred[:, 5:5 + num_classes]
        class_conf = np.max(data, axis=1).reshape(len(data),1)
        class_pred = data.argmax(axis=1).reshape(len(data),1)
        #----------------------------------------------------------#
        # First round of filtering: objectness * class confidence
        #----------------------------------------------------------#
        conf_mask = image_pred[:, 4] * class_conf[:, 0] >= conf_thres
        #----------------------------------------------------------#
        # Keep only the predictions that pass the threshold
        #----------------------------------------------------------#
        image_pred = image_pred[conf_mask]
        class_conf = class_conf[conf_mask]
        class_pred = class_pred[conf_mask]
        if len(image_pred) <= 0:
            continue
        # detections has shape [num_boxes, 7]: x1, y1, x2, y2, obj_conf, class_conf, class_pred
        detections = np.concatenate((image_pred[:, :5], class_conf.astype(np.float16), class_pred.astype(np.float16)), 1)
        # All classes that appear in the predictions
        unique_labels = np.unique(detections[:, -1])
        for c in unique_labels:
            detections_class = detections[detections[:, -1] == c]
            # Sort by objectness * class confidence, highest first
            conf_sort_index = np.argsort(-(detections_class[:, 4]*detections_class[:, 5]), axis=0)
            detections_class = detections_class[conf_sort_index]
            # Non-maximum suppression
            max_detections = []
            while detections_class.shape[0] > 0:
                # Take the highest-scoring box of this class, then drop every
                # remaining box whose overlap with it is nms_thres or more
                max_detections.append(np.expand_dims(detections_class[0],axis=0))
                if len(detections_class) == 1:
                    break
                ious = bbox_iou2(max_detections[-1], detections_class[1:])
                detections_class = detections_class[1:][ious < nms_thres]
            # Stack the kept boxes
            max_detections = np.concatenate(max_detections)
            # Add max detections to outputs
            output[image_i] = max_detections if output[image_i] is None else np.concatenate((output[image_i], max_detections))
    return output
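def prediect(img, model_path="data/model4/test.pth"):
    # torch is imported here because this module does not import it at the top.
    # The default model_path is the one used in the commented-out call of the test script.
    import torch
    # Load the model
    device = torch.device('cpu')
    model = torch.load(model_path, map_location=device)
    model = model.to(device)
    model.eval()
    # Run inference
    # img = torch.from_numpy(img)
    img = torch.tensor(img, dtype=torch.float32)
    # torch.no_grad() only takes effect as a context manager
    with torch.no_grad():
        outputs = model(img)
    return outputs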
def Regression(batch_detections, confidence, image, letterbox):
    # Filter the detection boxes by score
    top_index = batch_detections[:,4] * batch_detections[:,5] > confidence
    top_conf = batch_detections[top_index,4]*batch_detections[top_index,5]
    top_label = np.array(batch_detections[top_index,-1],np.int32)
    top_bboxes = np.array(batch_detections[top_index,:4])
    top_xmin, top_ymin, top_xmax, top_ymax = np.expand_dims(top_bboxes[:,0],-1),np.expand_dims(top_bboxes[:,1],-1),np.expand_dims(top_bboxes[:,2],-1),np.expand_dims(top_bboxes[:,3],-1)
    #-----------------------------------------------------------------#
    image_shape = np.array(np.shape(image)[0:2])
    if letterbox:
        # Remove the gray padding added by letterbox_image
        boxes = yolo_correct_boxes(top_ymin,top_xmin,top_ymax,top_xmax,np.array([416,416]),image_shape)
    else:
        # Plain resize: scale each axis back independently
        top_xmin = top_xmin / 416 * image_shape[1]
        top_ymin = top_ymin / 416 * image_shape[0]
        top_xmax = top_xmax / 416 * image_shape[1]
        top_ymax = top_ymax / 416 * image_shape[0]
        boxes = np.concatenate([top_ymin,top_xmin,top_ymax,top_xmax], axis=-1)
    # np.int was removed in NumPy 1.24; use an explicit integer dtype
    return boxes.astype(np.int32), top_conf, top_label
def draw_box(boxes, top_conf, top_label, class_names, image):
    font = ImageFont.truetype(font='model_data/simhei.ttf', size=int(np.floor(3e-2 * np.shape(image)[1] + 0.5)))
    thickness = max((np.shape(image)[0] + np.shape(image)[1]) // 416, 1)
    # Use a different box color for each class
    hsv_tuples = [(x / len(class_names), 1., 1.)
                  for x in range(len(class_names))]
    colors = list(map(lambda x: colorsys.hsv_to_rgb(*x), hsv_tuples))
    colors = list(map(lambda x: (int(x[0] * 255), int(x[1] * 255), int(x[2] * 255)),colors))
    for i, c in enumerate(top_label):
        predicted_class = class_names[c]
        score = top_conf[i]
        top, left, bottom, right = boxes[i]
        top = top - 5
        left = left - 5
        bottom = bottom + 5
        right = right + 5
        top = max(0, np.floor(top + 0.5).astype('int32'))
        left = max(0, np.floor(left + 0.5).astype('int32'))
        bottom = min(np.shape(image)[0], np.floor(bottom + 0.5).astype('int32'))
        right = min(np.shape(image)[1], np.floor(right + 0.5).astype('int32'))
        # Draw the box
        label = '{} {:.2f}'.format(predicted_class, score)
        draw = ImageDraw.Draw(image)
        # draw.textsize was removed in Pillow 10; fall back to it only on old versions
        if hasattr(draw, 'textbbox'):
            l, t, r, b = draw.textbbox((0, 0), label, font=font)
            label_size = (r - l, b - t)
        else:
            label_size = draw.textsize(label, font)
        print(label, top, left, bottom, right)
        if top - label_size[1] >= 0:
            text_origin = np.array([left, top - label_size[1]])
        else:
            text_origin = np.array([left, top + 1])
        for t in range(thickness):
            draw.rectangle(
                [left + t, top + t, right - t, bottom - t],
                outline=colors[class_names.index(predicted_class)])
        draw.rectangle(
            [tuple(text_origin), tuple(text_origin + label_size)],
            fill=colors[class_names.index(predicted_class)])
        draw.text(tuple(text_origin), label, fill=(0, 0, 0), font=font)
    # image.show()
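For checking the NMS step in isolation, a compact greedy NMS over corner-format boxes can be handy. The block below is only a sketch under my own assumptions: `greedy_nms` is a hypothetical name, it works on a single class, and it uses continuous coordinates (no `+ 1` on widths and heights), so its IoU values differ slightly from the post-processing code above.

```python
import numpy as np

def greedy_nms(boxes, scores, iou_thres=0.3):
    """Return indices of the boxes kept by greedy NMS (hypothetical helper).

    boxes: (N, 4) array of x1, y1, x2, y2; scores: (N,) array.
    """
    boxes = np.asarray(boxes, dtype=np.float64)
    order = np.argsort(-np.asarray(scores, dtype=np.float64))
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        # Intersection of the best box with every remaining box
        x1 = np.maximum(boxes[best, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[best, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[best, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[best, 3], boxes[rest, 3])
        inter = np.maximum(x2 - x1, 0) * np.maximum(y2 - y1, 0)
        area_best = (boxes[best, 2] - boxes[best, 0]) * (boxes[best, 3] - boxes[best, 1])
        area_rest = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_best + area_rest - inter + 1e-16)
        # Keep only the boxes that overlap less than the threshold
        order = rest[iou < iou_thres]
    return keep

if __name__ == "__main__":
    boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]]
    scores = [0.9, 0.8, 0.7]
    print(greedy_nms(boxes, scores))
```

Note that the intersection extents are clamped at 0 before multiplying, so two boxes that are disjoint on both axes never produce a positive intersection area.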
import sys
import onnx
import os
import argparse
import numpy as np
import cv2
import onnxruntime
import torch
import colorsys
from PIL import Image, ImageDraw, ImageFont
import post_process2 as post_process
def letterbox_image2(image, size, letterbox):
    # INTER_NEAREST: nearest neighbor; INTER_LINEAR: bilinear; INTER_CUBIC: bicubic over a 4x4 pixel neighborhood; INTER_LANCZOS4: Lanczos over an 8x8 pixel neighborhood
    if letterbox:
        ih, iw = image.shape[0:2]
        w, h = size
        scale = min(w/iw, h/ih)
        nw = int(iw*scale)
        nh = int(ih*scale)
        image = cv2.resize(image, (nw,nh), interpolation=cv2.INTER_LINEAR)
        # Gray canvas in (height, width, channels) order
        img = np.full((h, w, 3), 128, dtype=np.uint8)
        img[(h-nh)//2:(h-nh)//2+nh, (w-nw)//2:(w-nw)//2+nw] = image
    else:
        img = cv2.resize(image, size, interpolation=cv2.INTER_LINEAR)
    # cv2.imshow('img',img)
    # cv2.waitKey(0)
    return img

def letterbox_image(image, size):
    iw, ih = image.size
    w, h = size
    scale = min(w/iw, h/ih)
    nw = int(iw*scale)
    nh = int(ih*scale)
    image = image.resize((nw,nh), Image.BICUBIC)
    new_image = Image.new('RGB', size, (128,128,128))
    new_image.paste(image, ((w-nw)//2, (h-nh)//2))
    # new_image.show()
    return new_image
if __name__ == '__main__':
    # Parameters
    conf_thres=0.5
    nms_thres=0.3
    anchors_path='data/dataset2/coco_anchors.names'
    classes_path='data/dataset2/coins.names'
    image_path="data/img/test1.jpg"
    weight_file='data/model1/test.pth'
    onnx_file_name = 'data/model1/test.onnx'
    # Note: img1 is the project's preprocessing, img2 is my own, img3 is the input data of the Atlas om model
    # With letterbox=True, img1 = img2 != img3; with letterbox=False, img2 = img3 != img1 (cv2 and PIL resize differently, so there is a small error)
    # img1: preprocessing from the original code
    letterbox=True
    image_src = cv2.imread(image_path)
    img1 = cv2.cvtColor(image_src, cv2.COLOR_BGR2RGB)
    img1 = letterbox_image2(img1, (416,416), letterbox)
    img1 = np.transpose(img1, (2, 0, 1)).astype(np.float32) / 255.0
    img1 = np.expand_dims(img1, axis=0)
    print(img1.shape)
    # img2: my own preprocessing, based on the YOLOv3 one
    image_src2 = Image.open(image_path)
    if letterbox:
        crop_img = np.array(letterbox_image(image_src2, (416,416)))
    else:
        crop_img = image_src2.convert('RGB')
        crop_img = crop_img.resize((416,416), Image.BILINEAR) # NEAREST: lowest quality; BILINEAR: bilinear; BICUBIC: bicubic; ANTIALIAS: highest quality
    photo = np.array(crop_img,dtype = np.float32) / 255.0
    photo = np.transpose(photo, (2, 0, 1))
    img2 = np.expand_dims(photo, axis=0)
    print(img2.shape)
    # img3: the input data of the om model, obtained by truncating at the input layer during atc conversion
    img3=np.load("data/data2/input.npy")
    print(img3.shape)
    # Compute
    session = onnxruntime.InferenceSession(onnx_file_name)
    input_name = session.get_inputs()[0].name
    outputs = session.run(None, {input_name: img3})
    # print(len(outputs))
    conv_sbbox=outputs[0]
    conv_mbbox=outputs[1]
    conv_lbbox=outputs[2]
    input_size=(416, 416)
    class_names = post_process.get_class(classes_path)
    decode_sbbox=post_process.DecodeBox2(post_process.get_anchors(anchors_path)[0], len(class_names), input_size, conv_sbbox)
    decode_mbbox=post_process.DecodeBox2(post_process.get_anchors(anchors_path)[1], len(class_names), input_size, conv_mbbox)
    decode_lbbox=post_process.DecodeBox2(post_process.get_anchors(anchors_path)[2], len(class_names), input_size, conv_lbbox)
    output = np.concatenate([decode_sbbox, decode_mbbox, decode_lbbox], 1)
    print(decode_sbbox.shape, decode_mbbox.shape, decode_lbbox.shape, output.shape)
    batch_detections = post_process.non_max_suppression2(output, len(class_names), conf_thres=conf_thres, nms_thres=nms_thres)
    # non_max_suppression2 leaves None for an image without any detection
    if batch_detections[0] is None:
        print("No detections!")
        sys.exit()
    batch_detections = np.array(batch_detections[0])
    bbox_nums = batch_detections.shape[0]  # number of boxes kept after NMS
    image = Image.open(image_path)
    boxes, top_conf, top_label=post_process.Regression(batch_detections, conf_thres, image, letterbox)
    post_process.draw_box(boxes, top_conf, top_label, class_names, image)
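The `letterbox` flag has to match between preprocessing and box regression, because it decides how a box in the 416x416 input maps back to the source image. The sketch below shows both mappings for a single box; `unletterbox_box` and `unstretch_box` are hypothetical helpers written for illustration, and they take sizes as `(width, height)` and boxes as `(x1, y1, x2, y2)`, which is my own convention rather than the one used in `post_process2`.

```python
def unletterbox_box(box, input_size, image_size):
    """Map a box from the letterboxed network input back to the source image."""
    in_w, in_h = input_size
    img_w, img_h = image_size
    # Same uniform scale that the letterbox resize applied
    scale = min(in_w / img_w, in_h / img_h)
    # Gray padding on each side of the resized image
    pad_x = (in_w - img_w * scale) / 2.0
    pad_y = (in_h - img_h * scale) / 2.0
    x1, y1, x2, y2 = box
    return ((x1 - pad_x) / scale, (y1 - pad_y) / scale,
            (x2 - pad_x) / scale, (y2 - pad_y) / scale)

def unstretch_box(box, input_size, image_size):
    """Map a box back when the image was resized without keeping its aspect ratio."""
    in_w, in_h = input_size
    img_w, img_h = image_size
    x1, y1, x2, y2 = box
    return (x1 / in_w * img_w, y1 / in_h * img_h,
            x2 / in_w * img_w, y2 / in_h * img_h)

if __name__ == "__main__":
    box = (104, 156, 312, 260)
    print(unletterbox_box(box, (416, 416), (800, 400)))
    print(unstretch_box(box, (416, 416), (800, 400)))
```

For a non-square image the two mappings give different boxes, which is why feeding letterboxed input and then regressing with `letterbox=False` (or the other way round) shifts the drawn boxes.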
import cv2
import numpy as np
import os
import colorsys
from PIL import Image, ImageDraw, ImageFont
import post_process2 as post_process
conf_thres=0.5
nms_thres=0.3
letterbox=False
anchors_path='data/dataset2/coco_anchors.names'
classes_path='data/dataset2/coins.names'
if __name__ == '__main__':
    img_path="data/img/test1.jpg"
    image = Image.open(img_path)
    img=post_process.get_imgges(image, letterbox)
    # model_path="data/model4/test.pth"
    # outputs=prediect(img)
    # conv_sbbox=outputs[0].detach().numpy()
    # conv_mbbox=outputs[1].detach().numpy()
    # conv_lbbox=outputs[2].detach().numpy()
    # np.save("data/test/conv_sbbox.npy", conv_sbbox)
    # np.save("data/test/conv_mbbox.npy", conv_mbbox)
    # np.save("data/test/conv_lbbox.npy", conv_lbbox)
    # Load the three saved feature maps
    conv_sbbox=np.load("data/data2/conv_sbbox.npy")
    conv_mbbox=np.load("data/data2/conv_mbbox.npy")
    conv_lbbox=np.load("data/data2/conv_lbbox.npy")
    print(conv_sbbox.shape, conv_mbbox.shape, conv_lbbox.shape)
    input_size=(416, 416)
    class_names = post_process.get_class(classes_path)
    decode_sbbox=post_process.DecodeBox2(post_process.get_anchors(anchors_path)[0], len(class_names), input_size, conv_sbbox)
    decode_mbbox=post_process.DecodeBox2(post_process.get_anchors(anchors_path)[1], len(class_names), input_size, conv_mbbox)
    decode_lbbox=post_process.DecodeBox2(post_process.get_anchors(anchors_path)[2], len(class_names), input_size, conv_lbbox)
    output = np.concatenate([decode_sbbox, decode_mbbox, decode_lbbox], 1)
    print(decode_sbbox.shape, decode_mbbox.shape, decode_lbbox.shape, output.shape)
    batch_detections = post_process.non_max_suppression2(output, len(class_names), conf_thres=conf_thres, nms_thres=nms_thres)
    print(batch_detections)
    # non_max_suppression2 leaves None for an image without any detection
    if batch_detections[0] is None:
        print("No detections!")
        raise SystemExit
    batch_detections = np.array(batch_detections[0])
    bbox_nums = batch_detections.shape[0]  # number of boxes kept after NMS
    boxes, top_conf, top_label=post_process.Regression(batch_detections, conf_thres, image, letterbox)
    post_process.draw_box(boxes, top_conf, top_label, class_names, image)
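To put a number on "the results are close" when comparing the local model with the Atlas model, one option is to match the final detections of the two runs by class and IoU. The sketch below assumes both arrays use the 7-column layout produced by the NMS step (x1, y1, x2, y2, obj_conf, class_conf, class_pred); `best_iou_per_box` is a hypothetical helper, not part of the original code.

```python
import numpy as np

def best_iou_per_box(dets_a, dets_b):
    """For each detection in dets_a, return the best IoU with a same-class box in dets_b."""
    dets_a = np.asarray(dets_a, dtype=np.float64).reshape(-1, 7)
    dets_b = np.asarray(dets_b, dtype=np.float64).reshape(-1, 7)
    result = np.zeros(len(dets_a))
    for i, a in enumerate(dets_a):
        same = dets_b[dets_b[:, 6] == a[6]]
        if len(same) == 0:
            continue  # no box of this class in the other run, IoU stays 0
        w = np.maximum(np.minimum(a[2], same[:, 2]) - np.maximum(a[0], same[:, 0]), 0)
        h = np.maximum(np.minimum(a[3], same[:, 3]) - np.maximum(a[1], same[:, 1]), 0)
        inter = w * h
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (same[:, 2] - same[:, 0]) * (same[:, 3] - same[:, 1])
        result[i] = np.max(inter / (area_a + area_b - inter + 1e-16))
    return result

if __name__ == "__main__":
    local = [[0, 0, 10, 10, 0.9, 0.9, 0], [20, 20, 30, 30, 0.8, 0.8, 1]]
    atlas = [[0, 0, 10, 10, 0.9, 0.9, 0], [20, 20, 30, 30, 0.8, 0.8, 2]]
    print(best_iou_per_box(local, atlas))
```

A box whose best IoU is near 1 was reproduced by the other model; a value of 0 means the other run has no box of that class overlapping it.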