尼古拉斯·two_dog

YOLOv4 in PyTorch, v1 (data processing + training/testing + model conversion)

Reference links:

https://blog.csdn.net/qq_44876051/article/details/107665310?utm_medium=distribute.pc_relevant.none-task-blog-baidujs_baidulandingword-2&spm=1001.2101.3001.4242

https://www.cnblogs.com/wujianming-110117/p/13845974.html

PyTorch implementation: https://github.com/bubbliiiing/yolov4-pytorch

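Main changes:
1. Reimplemented the upsample operator with interpolate (edit yolo4.py).
2. Changed the weight-loading code to skip the upsample weights; loading them raises an error otherwise (edit train.py).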
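
As a rough illustration of change 1, the sketch below shows one way an upsample block can be written around `torch.nn.functional.interpolate`. The class name `InterpolateUpsample`, the 1x1 convolution in front of the interpolation, and the channel counts are assumptions made for illustration; the actual edit to yolo4.py is not shown in this post.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class InterpolateUpsample(nn.Module):
    """Hypothetical upsample block: 1x1 conv, then 2x nearest-neighbour interpolation."""

    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False)

    def forward(self, x):
        x = self.conv(x)
        # F.interpolate is a plain function call, so it adds no parameters or buffers
        return F.interpolate(x, scale_factor=2, mode="nearest")


block = InterpolateUpsample(8, 4)
with torch.no_grad():
    y = block(torch.zeros(1, 8, 13, 13))
print(tuple(y.shape))                   # spatial size doubles: 13 -> 26
print(list(block.state_dict().keys()))  # only the conv weight is stored
```

If a block is restructured this way, its parameter names will generally no longer match the pretrained checkpoint, which is why the matching checkpoint entries have to be skipped at load time (change 2).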

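Training and inference steps:
1. Download the code and the pretrained model, and prepare the data.
2. Preprocess the data: run json2xml.py, kmeans_for_anchors.py, voc2yolo4.py and voc_annotation.py on your own labeled data.
3. Train and test the model (edit train.py and predict.py).
4. Compute mAP (edit get_dr_txt.py, get_gt_txt.py and get_map.py; pay attention to the dataset format).
5. Convert the PyTorch model to ONNX (edit pt2onnx.py).
6. Test the pth model and the ONNX model (edit test_pth.py and test_onnx.py).
7. Convert the ONNX model to an om model (the atc command is given below).
8. Test the om model (edit the pyacl code so that it does not use aspect-ratio-preserving scaling).
9. Compare results: take the om model's outputs and run them through test_om.py / test_om2.py to compare the local model with the Atlas model.
Note: test_om.py takes as input the data the om model receives after dvpp+aipp; test_om2.py takes the three feature maps.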

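Conclusions from the comparison:
1. Locally, the PyTorch and ONNX models give the same results.
2. Feeding the captured om-model input to the pth or ONNX model gives results close to the om model's actual output, which shows that the om conversion succeeded.
3. The om model's results are close to those of the local pth or ONNX model, which shows that the model was migrated successfully.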
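
One way to put a number on "close" is to compare the raw outputs of two backends element by element. The sketch below is a minimal pure-Python version; the two metrics (maximum absolute difference and cosine similarity) and the sample values are assumptions for illustration, not the checks used in the original tests.

```python
import math


def flatten(values):
    """Flatten arbitrarily nested lists/tuples of numbers into one flat list."""
    flat = []
    for v in values:
        if isinstance(v, (list, tuple)):
            flat.extend(flatten(v))
        else:
            flat.append(float(v))
    return flat


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two outputs."""
    fa, fb = flatten(a), flatten(b)
    if len(fa) != len(fb):
        raise ValueError("outputs have different sizes")
    return max(abs(x - y) for x, y in zip(fa, fb))


def cosine_similarity(a, b):
    """Cosine of the angle between the two flattened outputs (1.0 means same direction)."""
    fa, fb = flatten(a), flatten(b)
    dot = sum(x * y for x, y in zip(fa, fb))
    norm = math.sqrt(sum(x * x for x in fa)) * math.sqrt(sum(y * y for y in fb))
    return dot / norm


# Made-up numbers standing in for one feature map from two backends
pth_out = [[0.10, 0.20], [0.30, 0.40]]
onnx_out = [[0.10, 0.20], [0.30, 0.41]]
print(max_abs_diff(pth_out, onnx_out))
print(cosine_similarity(pth_out, onnx_out))
```

With real tensors, the same comparison can be run after converting each output to a nested list (for example with `.tolist()`).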

I. Data processing

1. Data processing steps:

Label the data -> convert to YOLO format -> compute anchors

For the detailed code steps, see the previous article: https://blog.csdn.net/gm_Ergou/article/details/118570318
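
The last step of that pipeline, computing anchors, is commonly done with k-means over the labeled box sizes, using IoU as the similarity measure. The sketch below is a minimal pure-Python version under that assumption; it is not the code of kmeans_for_anchors.py, and the function names and sample boxes are made up for illustration.

```python
import random


def wh_iou(box, anchor):
    """IoU of two boxes described only by (width, height), both placed at the origin."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchors; the closest anchor has the highest IoU."""
    rng = random.Random(seed)
    centers = rng.sample(boxes, k)
    assignment = None
    for _ in range(iters):
        new_assignment = [
            max(range(k), key=lambda j: wh_iou(b, centers[j])) for b in boxes
        ]
        if new_assignment == assignment:
            break  # assignments stopped changing, so the clustering has converged
        assignment = new_assignment
        for j in range(k):
            members = [b for b, a in zip(boxes, assignment) if a == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = (
                    sum(w for w, _ in members) / len(members),
                    sum(h for _, h in members) / len(members),
                )
    # Sort by area so the smallest anchors come first, as in the anchors file
    return sorted(centers, key=lambda c: c[0] * c[1])


# Hypothetical box sizes: three small boxes and three large ones
sizes = [(10, 10), (12, 11), (11, 12), (100, 100), (105, 98), (98, 102)]
anchors = kmeans_anchors(sizes, k=2)
print(anchors)
```

For the three-head model used here, k would be 9, and the box sizes would first be scaled to the 416x416 network input.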

2. Data provided (a 25-image coin dataset):

train.txt

data/dataset2/coins/P00524-151911.jpg 1660,402,2145,894,0 2546,714,3025,1205,0 2929,1121,3408,1612,0 2175,1205,2642,1666,0 1840,2091,2307,2552,0 1444,870,1929,1349,1 2642,1923,3037,2331,1 941,1810,1301,2175,2 1762,1301,2139,1660,1
data/dataset2/coins/P00524-151918.jpg 923,534,1301,894,0 2786,343,3151,726,0 1882,798,2235,1157,0 1888,1666,2265,2043,0 2666,1780,3025,2139,0 2331,2001,2666,2355,0 1013,1450,1325,1768,1 989,2432,1277,2696,2 2738,965,3025,1253,1
data/dataset2/coins/P00524-151929.jpg 2678,295,3073,714,0 2211,558,2618,965,0 2450,1109,2822,1492,0 3265,1444,3648,1804,0 989,965,1373,1361,0 774,1552,1157,1911,0 1085,1995,1420,2313,0 1666,2121,2007,2450,0 1935,349,2247,678,1 1253,2355,1540,2618,1 1756,1540,2043,1792,2
data/dataset2/coins/P00524-151944.jpg 1444,355,1828,762,0 2402,343,2786,756,0 690,678,1085,1061,0 941,1193,1283,1540,0 1504,1265,1852,1588,0 1947,1241,2307,1564,0 2103,1660,2450,1959,0 989,2115,1283,2402,0 1756,2546,1995,2768,1 1001,2391,1277,2600,1 3049,917,3325,1193,2 1666,2091,1911,2307,2
data/dataset2/coins/P00524-151953.jpg 822,810,1205,1211,0 1283,642,1684,1037,0 1636,1049,2019,1397,0 1349,1349,1732,1660,0 1205,1911,1516,2211,0 1947,2097,2259,2379,0 2594,1067,2953,1420,0 2558,1738,2905,2049,0 3145,1349,3456,1660,1 774,2319,1049,2546,1 2450,630,2750,929,2 2211,1426,2474,1684,2
data/dataset2/coins/P00524-152004.jpg 1696,223,2145,702,0 2810,684,3289,1133,0 1301,1444,1684,1834,0 834,1522,1253,1899,0 1349,2391,1684,2714,0 2127,1013,2474,1349,1 630,1205,953,1540,1 2127,1468,2414,1744,2 1043,1888,1337,2139,2
data/dataset2/coins/P00524-152030.jpg 1211,732,1618,1073,0 2858,588,3229,917,0 2498,852,2888,1181,0 2187,1187,2618,1540,0 1756,1205,2175,1564,0 923,1540,1385,1971,0 666,2271,1085,2720,1 2043,798,2379,1061,1 1684,1995,2031,2367,2 3169,1492,3504,1792,2
data/dataset2/coins/P00524-152038.jpg 1780,355,2151,702,0 2714,726,3085,1085,0 1738,1229,2115,1612,0 2379,1876,2768,2283,0 1420,1876,1804,2259,0 1373,1397,1732,1762,0 1109,1235,1420,1540,1 2502,2299,2806,2635,1 2630,1229,2911,1504,2 2187,2385,2474,2690,2
data/dataset2/coins/P00524-152052.jpg 1977,582,2438,1025,0 2744,564,3193,1013,0 3241,1043,3708,1516,0 2570,989,3037,1432,0 2840,1738,3349,2235,0 2546,2241,3073,2768,0 1095,1307,1478,1684,1 1636,1289,2007,1684,1 1385,2067,1768,2474,2 2355,1756,2702,2097,2
data/dataset2/coins/P00524-152108.jpg 2235,229,2642,612,0 2534,702,2983,1109,0 2630,1193,3097,1636,0 2067,1426,2534,1876,0 2031,810,2474,1229,0 1929,2211,2402,2696,0 1522,612,1864,941,1 1636,1516,1995,1876,1 1636,1121,1947,1432,2 2474,1732,2810,2055,2
data/dataset2/coins/P00524-152122.jpg 708,714,1205,1091,0 1043,995,1540,1397,0 2786,852,3289,1259,0 1402,2163,2043,2822,0 2289,1732,2870,2283,0 2786,2067,3408,2690,0 2091,1277,2522,1636,1 756,1756,1229,2199,1 1756,738,2091,995,2 1612,1492,2019,1852,2
data/dataset2/coins/P00524-152144.jpg 438,1349,786,1732,0 1277,1109,1672,1468,0 1450,612,1828,947,0 1708,1061,2091,1444,0 1097,1804,1540,2235,0 2582,1636,3055,2091,0 2163,1013,2510,1307,1 941,1456,1259,1780,1 1684,1588,1983,1899,2 1947,1852,2265,2187,2
data/dataset2/coins/P00524-152155.jpg 1528,738,1953,1115,0 2870,941,3325,1337,0 2762,1426,3253,1876,0 2426,2295,2953,2840,0 1714,1379,2163,1804,0 1385,1732,1840,2187,0 810,1552,1163,1899,1 2295,1420,2666,1768,1 1474,253,1780,498,2 2163,307,2474,558,2
data/dataset2/coins/P00524-152208.jpg 822,498,1253,798,0 1253,684,1732,989,0 2450,965,2941,1325,0 989,1145,1516,1540,0 1756,1474,2313,1953,0 2019,1947,2666,2546,0 1157,1762,1684,2211,1 2834,1792,3337,2265,1 1720,606,2043,798,2 2073,714,2414,965,2
data/dataset2/coins/P00524-152213.jpg 678,2139,1325,2726,0 2379,1738,2929,2247,0 2894,1379,3397,1810,0 2169,1139,2648,1516,0 2534,702,2953,995,0 1504,618,1923,894,0 1337,1995,1828,2450,1 2642,1085,3001,1349,1 2283,822,2594,1061,2 1522,438,1804,606,2
data/dataset2/coins/P00524-152222.jpg 2365,544,2748,874,0 1025,1636,1420,2007,0 1660,1642,2043,2043,0 2283,1444,2690,1834,0 2570,1816,2977,2235,0 1690,2343,2115,2786,0 1876,1253,2187,1564,1 965,2313,1307,2666,1 1397,391,1684,630,2 1019,870,1277,1115,2
data/dataset2/coins/P00524-152244.jpg 1426,870,1852,1307,0 2426,564,2840,995,0 2690,1115,3121,1540,0 1402,1624,1804,2043,0 1905,1971,2331,2402,0 1814,1546,2149,1882,1 1331,2187,1684,2546,1 2498,1444,2816,1756,2 1133,1492,1450,1804,2
data/dataset2/coins/P00524-152253.jpg 1660,235,2031,630,0 2450,929,2810,1307,0 750,1516,1133,1888,0 1408,1397,1804,1780,0 2600,1301,2983,1666,0 2570,2385,2935,2786,0 1971,1642,2283,1947,1 2726,1828,3037,2139,1 1636,1145,1911,1432,2 2283,2067,2558,2343,2
data/dataset2/coins/P00524-152300.jpg 2331,379,2696,750,0 2199,756,2570,1109,0 1852,929,2211,1289,0 1540,1163,1899,1516,0 1109,1468,1498,1834,0 1899,1923,2265,2289,0 2846,1277,3157,1564,1 1911,1379,2211,1660,1 3121,1852,3385,2139,2 1283,1816,1564,2091,2
data/dataset2/coins/P00524-152329.jpg 1792,211,2223,654,0 1115,989,1570,1444,0 971,1684,1432,2175,0 1804,1612,2259,2073,0 2762,1552,3217,1977,0 2271,2402,2690,2840,0 2067,1109,2426,1468,1 1355,1397,1732,1756,1 2582,1253,2911,1588,2 2624,2019,2989,2343,2
data/dataset2/coins/P00524-152335.jpg 1720,247,2151,678,0 1133,1253,1540,1708,0 1780,1301,2187,1732,0 2259,1696,2666,2139,0 1037,1899,1450,2355,0 1876,2247,2307,2678,0 2139,1025,2480,1349,1 1516,1995,1882,2331,1 534,1349,846,1684,2 2917,1708,3265,2031,2
data/dataset2/coins/P00524-152343.jpg 1738,349,2139,750,0 810,654,1253,1085,0 582,1474,1061,1923,0 1840,1408,2271,1858,0 1642,1995,2115,2474,0 3289,1492,3702,1905,0 2163,905,2480,1229,1 1241,1061,1612,1408,1 953,1929,1325,2283,2 2576,870,2894,1205,2
data/dataset2/coins/P00524-152351.jpg 1259,343,1672,798,0 2642,582,3025,1013,0 3181,1504,3576,1882,0 2283,1576,2666,1971,0 1301,1313,1714,1744,0 397,2283,822,2690,0 1073,899,1444,1253,1 2025,1061,2367,1397,1 3325,1181,3624,1468,2 1720,1546,2019,1852,2
data/dataset2/coins/P00524-152401.jpg 1253,486,1684,923,0 2103,462,2546,894,0 1450,894,1876,1325,0 1947,1133,2367,1570,0 1905,1600,2337,2019,0 965,1905,1397,2355,0 1037,882,1397,1235,1 1037,1528,1402,1876,1 3181,630,3492,965,2 1540,2199,1858,2546,2
data/dataset2/coins/P00524-152409.jpg 1163,564,1660,1061,0 1923,103,2355,570,0 2313,349,2696,822,0 2666,804,3013,1241,0 965,1756,1474,2283,0 2534,1564,2905,2007,0 582,510,1037,965,1 1504,1444,1876,1828,1 2259,2211,2570,2546,2 3001,582,3265,894,2
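
Each line of train.txt holds an image path followed by space-separated boxes, each written as five comma-separated integers. The sketch below parses one such line, assuming the field order x_min, y_min, x_max, y_max, class_id (the order voc_annotation.py writes in the upstream repository); the function name is made up for illustration.

```python
def parse_annotation_line(line):
    """Split one train.txt line into the image path and a list of box tuples."""
    parts = line.split()
    image_path = parts[0]
    boxes = []
    for item in parts[1:]:
        # Assumed field order: x_min, y_min, x_max, y_max, class_id
        x_min, y_min, x_max, y_max, class_id = (int(v) for v in item.split(","))
        boxes.append((x_min, y_min, x_max, y_max, class_id))
    return image_path, boxes


sample = "data/dataset2/coins/P00524-151911.jpg 1660,402,2145,894,0 941,1810,1301,2175,2"
path, boxes = parse_annotation_line(sample)
print(path)
for box in boxes:
    print(box)
```

The class id is the zero-based index of the class in coins.names.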

coco_anchors.names

12, 16,  19, 36,  40, 28,  36, 75,  76, 55,  72, 146,  142, 110,  192, 243,  459, 401
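
The get_anchors function in the scripts below reshapes these 18 numbers into three groups of three (width, height) pairs and reverses the group order, so group 0 holds the largest anchors. The sketch below reproduces that grouping in plain Python without NumPy; the function name `load_anchor_groups` is made up for illustration.

```python
ANCHOR_LINE = "12, 16,  19, 36,  40, 28,  36, 75,  76, 55,  72, 146,  142, 110,  192, 243,  459, 401"


def load_anchor_groups(line):
    """Mirror np.array(anchors).reshape([-1, 3, 2])[::-1] using plain lists."""
    values = [float(x) for x in line.split(",")]
    pairs = [(values[i], values[i + 1]) for i in range(0, len(values), 2)]
    groups = [pairs[i:i + 3] for i in range(0, len(pairs), 3)]
    return groups[::-1]  # reversed: largest anchors first


groups = load_anchor_groups(ANCHOR_LINE)
for index, group in enumerate(groups):
    print(index, group)
```

In post_process.py, group i is paired with outputs[i], so the largest anchors go with the first head of the model.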

coins.names

1yuan
5jiao
1jiao

Test images: taken from the dataset.

II. Model training

1.train.py

# -*- coding: utf-8 -*-  
#-------------------------------------#
#       Train on the dataset
#-------------------------------------#
import os
import time

import numpy as np
import torch
import torch.backends.cudnn as cudnn
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.autograd import Variable
from torch.utils.data import DataLoader
from tqdm import tqdm

from yolo4 import YoloBody
from yolo_training import Generator, YOLOLoss
from dataloader import YoloDataset, yolo_dataset_collate


#---------------------------------------------------#
#   Load the class names and anchor (prior) boxes
#---------------------------------------------------#
def get_classes(classes_path):
    '''loads the classes'''
    with open(classes_path) as f:
        class_names = f.readlines()
    class_names = [c.strip() for c in class_names]
    return class_names

def get_anchors(anchors_path):
    '''loads the anchors from a file'''
    with open(anchors_path) as f:
        anchors = f.readline()
    anchors = [float(x) for x in anchors.split(',')]
    return np.array(anchors).reshape([-1,3,2])[::-1,:,:]

def get_lr(optimizer):
    for param_group in optimizer.param_groups:
        return param_group['lr']

        
def fit_one_epoch(net,yolo_losses,epoch,epoch_size,epoch_size_val,gen,genval,Epoch,cuda):
    total_loss = 0
    val_loss = 0

    net.train()
    with tqdm(total=epoch_size,desc=f'Epoch {epoch + 1}/{Epoch}',postfix=dict,mininterval=0.3) as pbar:
        for iteration, batch in enumerate(gen):
            if iteration >= epoch_size:
                break
            images, targets = batch[0], batch[1]
            with torch.no_grad():
                if cuda:
                    images = Variable(torch.from_numpy(images).type(torch.FloatTensor)).cuda()
                    targets = [Variable(torch.from_numpy(ann).type(torch.FloatTensor)) for ann in targets]
                else:
                    images = Variable(torch.from_numpy(images).type(torch.FloatTensor))
                    targets = [Variable(torch.from_numpy(ann).type(torch.FloatTensor)) for ann in targets]

            #----------------------#
            #   Zero the gradients
            #----------------------#
            optimizer.zero_grad()
            #----------------------#
            #   Forward pass
            #----------------------#
            outputs = net(images)
            losses = []
            num_pos_all = 0
            #----------------------#
            #   Compute the loss
            #----------------------#
            for i in range(3):
                loss_item, num_pos = yolo_losses[i](outputs[i], targets)
                losses.append(loss_item)
                num_pos_all += num_pos

            loss = sum(losses) / num_pos_all
            #----------------------#
            #   Backward pass
            #----------------------#
            loss.backward()
            optimizer.step()

            total_loss += loss.item()
            pbar.set_postfix(**{'total_loss': total_loss / (iteration + 1), 'lr'        : get_lr(optimizer)})
            pbar.update(1)


    net.eval()
    print('Start Validation')
    with tqdm(total=epoch_size_val, desc=f'Epoch {epoch + 1}/{Epoch}',postfix=dict,mininterval=0.3) as pbar:
        for iteration, batch in enumerate(genval):
            if iteration >= epoch_size_val:
                break
            images_val, targets_val = batch[0], batch[1]

            with torch.no_grad():
                if cuda:
                    images_val = Variable(torch.from_numpy(images_val).type(torch.FloatTensor)).cuda()
                    targets_val = [Variable(torch.from_numpy(ann).type(torch.FloatTensor)) for ann in targets_val]
                else:
                    images_val = Variable(torch.from_numpy(images_val).type(torch.FloatTensor))
                    targets_val = [Variable(torch.from_numpy(ann).type(torch.FloatTensor)) for ann in targets_val]
                optimizer.zero_grad()
                outputs = net(images_val)
                losses = []
                num_pos_all = 0
                for i in range(3):
                    loss_item, num_pos = yolo_losses[i](outputs[i], targets_val)
                    losses.append(loss_item)
                    num_pos_all += num_pos
                loss = sum(losses) / num_pos_all
                val_loss += loss.item()
            pbar.set_postfix(**{'total_loss': val_loss / (iteration + 1)})
            pbar.update(1)
    print('Finish Validation')
    print('Epoch:'+ str(epoch+1) + '/' + str(Epoch))
    print('Total Loss: %.4f || Val Loss: %.4f ' % (total_loss/(epoch_size+1),val_loss/(epoch_size_val+1)))

    if (epoch+1)%20==0:
        print('Saving state, iter:', str(epoch+1))
        # torch.save(model.state_dict(), 'data/model3/Epoch%d-Total_Loss%.4f-Val_Loss%.4f.pth'%((epoch+1),total_loss/(epoch_size+1),val_loss/(epoch_size_val+1)))
        torch.save(model, 'data/model1/Epoch%d-Total_Loss%.4f-Val_Loss%.4f.pth'%((epoch+1),total_loss/(epoch_size+1),val_loss/(epoch_size_val+1)))



if __name__ == "__main__":
    Cuda = False
    #   Whether to use the DataLoader
    Use_Data_Loader = True
    normalize = False
    input_shape = (416,416)
    anchors_path = 'data/dataset2/coco_anchors.names'
    classes_path = 'data/dataset2/coins.names'

    #   Load the class names and anchors
    class_names = get_classes(classes_path)
    anchors = get_anchors(anchors_path)
    num_classes = len(class_names)
    print("class_num", num_classes)
    
    #------------------------------------------------------#
    mosaic = False  # mosaic data augmentation; it proved unstable in practice, so it defaults to False
    Cosine_lr = False  # cosine-annealing learning-rate schedule, True or False
    smoooth_label = 0  # label smoothing; usually 0.01 or less, e.g. 0.01 or 0.005

    #------------------------------------------------------#
    model_path = "data/model/yolo4_weights.pth"
    print('Loading weights into state dict...')

    model = YoloBody(len(anchors[0]), num_classes)
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model_dict = model.state_dict()
    pretrained_dict = torch.load(model_path, map_location=device)
    # The upsample layer was replaced, so its nodes can no longer be found when loading weights;
    # every pretrained key containing 'upsample' is therefore skipped
    pretrained_dict = {k: v for k, v in pretrained_dict.items() if k.find('upsample')==-1 if np.shape(model_dict[k]) ==  np.shape(v)}
    # # Original weight-loading method
    # pretrained_dict = {k: v for k, v in pretrained_dict.items() if np.shape(model_dict[k]) ==  np.shape(v)}
    model_dict.update(pretrained_dict)
    model.load_state_dict(model_dict)
    # print(model)

    net = model.train()
    if Cuda:
        net = torch.nn.DataParallel(model)
        cudnn.benchmark = True
        net = net.cuda()

    # Build the loss functions
    yolo_losses = []
    for i in range(3):
        yolo_losses.append(YOLOLoss(np.reshape(anchors,[-1,2]),num_classes, \
                                (input_shape[1], input_shape[0]), smoooth_label, Cuda, normalize))

    #-----------------dataset------------------------#
    annotation_path = 'data/dataset2/coins/train.txt'
    val_split = 0.1
    with open(annotation_path) as f:
        lines = f.readlines()
    np.random.seed(10101)
    np.random.shuffle(lines)
    np.random.seed(None)
    num_val = int(len(lines)*val_split)
    num_train = len(lines) - num_val
    

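    #------------------------------------------------------#
    #   Backbone features are generic, so freezing the backbone speeds up training
    #   and also protects its weights from being wrecked early in training.
    #   Init_Epoch is the starting epoch
    #   Freeze_Epoch is the epoch at which frozen training ends
    #   Epoch is the total number of training epochs
    #   If you hit OOM or run out of GPU memory, reduce Batch_size
    #------------------------------------------------------#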
    if True:
        lr = 1e-3
        Batch_size = 2
        Init_Epoch = 0
        Freeze_Epoch = 200
        
        optimizer = optim.Adam(net.parameters(),lr)
        if Cosine_lr:
            lr_scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=5, eta_min=1e-5)
        else:
            lr_scheduler = optim.lr_scheduler.StepLR(optimizer,step_size=1,gamma=0.92)

        if Use_Data_Loader:
            train_dataset = YoloDataset(lines[:num_train], (input_shape[0], input_shape[1]), mosaic=mosaic, is_train=True)
            val_dataset = YoloDataset(lines[num_train:], (input_shape[0], input_shape[1]), mosaic=False, is_train=False)

            gen = DataLoader(train_dataset, shuffle=True, batch_size=Batch_size, num_workers=4, pin_memory=True,
                                    drop_last=True, collate_fn=yolo_dataset_collate)
            gen_val = DataLoader(val_dataset, shuffle=True, batch_size=Batch_size, num_workers=4,pin_memory=True, 
                                    drop_last=True, collate_fn=yolo_dataset_collate)
        else:
            gen = Generator(Batch_size, lines[:num_train],
                            (input_shape[0], input_shape[1])).generate(train=True, mosaic = mosaic)
            gen_val = Generator(Batch_size, lines[num_train:],
                            (input_shape[0], input_shape[1])).generate(train=False, mosaic = mosaic)

        
        #------------------------------------#
        #   Backbone freezing switch: True keeps the backbone
        #   trainable; set it to False to freeze it
        #------------------------------------#
        for param in model.backbone.parameters():
            param.requires_grad = True

        epoch_size = max(1, num_train//Batch_size)
        epoch_size_val = num_val//Batch_size
        for epoch in range(Init_Epoch,Freeze_Epoch):
            fit_one_epoch(net,yolo_losses,epoch,epoch_size,epoch_size_val,gen,gen_val,Freeze_Epoch,Cuda)
            lr_scheduler.step()
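
The dataset split and the learning-rate schedule in train.py can be checked by hand. The sketch below redoes the arithmetic for the 25-line train.txt with val_split = 0.1 and Batch_size = 2, and evaluates the StepLR decay (step_size=1, gamma=0.92) in closed form. The helper names are made up for illustration.

```python
def split_counts(num_lines, val_split, batch_size):
    """Same arithmetic as train.py: sample counts and batches per epoch."""
    num_val = int(num_lines * val_split)
    num_train = num_lines - num_val
    epoch_size = max(1, num_train // batch_size)
    epoch_size_val = num_val // batch_size
    return num_train, num_val, epoch_size, epoch_size_val


def step_lr(base_lr, gamma, epoch):
    """Learning rate after `epoch` scheduler steps when step_size is 1."""
    return base_lr * gamma ** epoch


print(split_counts(25, 0.1, 2))   # (num_train, num_val, epoch_size, epoch_size_val)
for epoch in (0, 50, 111, 199):
    print(epoch, step_lr(1e-3, 0.92, epoch))
```

With these settings only 2 images are used for validation, and the learning rate falls below 1e-7 a little past epoch 110, so the later epochs of the 200-epoch run change the weights very little.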

2.predict.py

The listing posted under this heading is a line-for-line copy of train.py above, so it is not repeated here.

3.post_process.py

from yolo4 import YoloBody
import torch
from PIL import Image
from torchvision import transforms
import cv2
import numpy as np
from utils import (DecodeBox, bbox_iou, letterbox_image,non_max_suppression, yolo_correct_boxes)
import os
import colorsys
from PIL import Image, ImageDraw, ImageFont


def get_class(classes_path):
    classes_path = os.path.expanduser(classes_path)
    with open(classes_path) as f:
        class_names = f.readlines()
    class_names = [c.strip() for c in class_names]
    return class_names

def get_anchors(anchors_path):
    anchors_path = os.path.expanduser(anchors_path)
    with open(anchors_path) as f:
        anchors = f.readline()
    anchors = [float(x) for x in anchors.split(',')]
    return np.array(anchors).reshape([-1, 3, 2])[::-1,:,:]

def get_letterbox_image(image, size):
    iw, ih = image.size
    w, h = size
    scale = min(w/iw, h/ih)
    nw = int(iw*scale)
    nh = int(ih*scale)

    image = image.resize((nw,nh), Image.BICUBIC)
    new_image = Image.new('RGB', size, (128,128,128))
    new_image.paste(image, ((w-nw)//2, (h-nh)//2))
    return new_image


confidence=0.5
letterbox_image=False
anchors_path='data/dataset2/coco_anchors.names'
classes_path='data/dataset2/coins.names'
model_path="data/model1/test.pth"

class_names = get_class(classes_path)
# Assign a different box color to each class
hsv_tuples = [(x / len(class_names), 1., 1.)
              for x in range(len(class_names))]
colors = list(map(lambda x: colorsys.hsv_to_rgb(*x), hsv_tuples))
colors = list(map(lambda x: (int(x[0] * 255), int(x[1] * 255), int(x[2] * 255)),colors))


def result(outputs, image):
    # Model post-processing
    output_list = []
    for i in range(3):
        print(type(outputs[i]),outputs[i].shape)
        decodeBox=DecodeBox(get_anchors(anchors_path)[i], len(class_names),  (416, 416))
        output_list.append(decodeBox(outputs[i]))
        print(outputs[i].size(),decodeBox(outputs[i]).shape)


    output = torch.cat(output_list, 1)
    batch_detections = non_max_suppression(output, len(class_names), conf_thres=confidence, nms_thres=0.3)
    print(output.shape,batch_detections)
    try:
        batch_detections = batch_detections[0].cpu().numpy()
    except:
        return image

    # Filter and unpack the detection boxes
    top_index = batch_detections[:,4] * batch_detections[:,5] > confidence
    top_conf = batch_detections[top_index,4]*batch_detections[top_index,5]
    top_label = np.array(batch_detections[top_index,-1],np.int32)
    top_bboxes = np.array(batch_detections[top_index,:4])
    top_xmin, top_ymin, top_xmax, top_ymax = np.expand_dims(top_bboxes[:,0],-1),np.expand_dims(top_bboxes[:,1],-1),np.expand_dims(top_bboxes[:,2],-1),np.expand_dims(top_bboxes[:,3],-1)

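    #-----------------------------------------------------------------#
    #   Before an image is passed to the network, letterbox_image may pad it with gray bars,
    #   so the resulting top_bboxes are relative to the padded image.
    #   They have to be corrected to remove the gray-bar offset.
    #-----------------------------------------------------------------#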
    
    image_shape = np.array(np.shape(image)[0:2])
    if letterbox_image:
        boxes = yolo_correct_boxes(top_ymin,top_xmin,top_ymax,top_xmax,np.array([416,416]),image_shape)
    else:
        top_xmin = top_xmin / 416 * image_shape[1]
        top_ymin = top_ymin / 416 * image_shape[0]
        top_xmax = top_xmax / 416 * image_shape[1]
        top_ymax = top_ymax / 416 * image_shape[0]
        boxes = np.concatenate([top_ymin,top_xmin,top_ymax,top_xmax], axis=-1)
        
    # font = ImageFont.truetype(font='/usr/share/fonts/truetype/lyx/cmr10.ttf',size=np.floor(3e-2 * np.shape(image)[1] + 0.5).astype('int32'))
    font = ImageFont.truetype(font='data/simhei.ttf',size=np.floor(3e-2 * np.shape(image)[1] + 0.5).astype('int32'))

    thickness = max((np.shape(image)[0] + np.shape(image)[1]) // 416, 1)

    for i, c in enumerate(top_label):
        predicted_class = class_names[c]
        score = top_conf[i]

        top, left, bottom, right = boxes[i]
        top = top - 5
        left = left - 5
        bottom = bottom + 5
        right = right + 5

        top = max(0, np.floor(top + 0.5).astype('int32'))
        left = max(0, np.floor(left + 0.5).astype('int32'))
        bottom = min(np.shape(image)[0], np.floor(bottom + 0.5).astype('int32'))
        right = min(np.shape(image)[1], np.floor(right + 0.5).astype('int32'))

        # Draw the box and its label
        label = '{} {:.2f}'.format(predicted_class, score)
        draw = ImageDraw.Draw(image)
        label_size = draw.textsize(label, font)
        label = label.encode('utf-8')
        print(label, top, left, bottom, right)
        
        if top - label_size[1] >= 0:
            text_origin = np.array([left, top - label_size[1]])
        else:
            text_origin = np.array([left, top + 1])

        for i in range(thickness):
            draw.rectangle(
                [left + i, top + i, right - i, bottom - i],
                outline=colors[class_names.index(predicted_class)])
        draw.rectangle(
            [tuple(text_origin), tuple(text_origin + label_size)],
            fill=colors[class_names.index(predicted_class)])
        draw.text(text_origin, str(label,'UTF-8'), fill=(0, 0, 0), font=font)
    # image.show()

def prediect(img):
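    # Load the model (the whole pickled module, as saved by train.py) onto the CPU
    device = torch.device('cpu')
    model=torch.load(model_path, map_location=device)
    model=model.to(device)
    model.eval()

    # Run inference
    # img = torch.from_numpy(img)
    img = torch.tensor(img, dtype=torch.float32)
    # torch.no_grad() only takes effect when used as a context manager
    with torch.no_grad():
        outputs = model(img)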

    return outputs

def get_imgges(image):
    # Preprocess the input image
    image_shape = np.array(np.shape(image)[0:2])
    print(type(image))

    if letterbox_image:
        crop_img = np.array(get_letterbox_image(image, (416,416)))
    else:
        crop_img = image.convert('RGB')
        crop_img = crop_img.resize((416,416), Image.BICUBIC)
    photo = np.array(crop_img,dtype = np.float32) / 255.0
    photo = np.transpose(photo, (2, 0, 1))
    img = [photo]
    img=np.asarray(img)

    return img

if __name__ == '__main__':
    img_path="data/img/test1.jpg"
    image = Image.open(img_path)
    img=get_imgges(image)
    
    outputs=prediect(img)
    result(outputs, image)
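
As a quick sanity check of the layout that `get_imgges` hands to the model, the sketch below repeats the same scale, transpose and batch steps on a synthetic array. It assumes only NumPy; the helper name `to_model_input` is hypothetical, and the random array stands in for an RGB image already resized to 416x416.

```python
import numpy as np

def to_model_input(rgb_hwc):
    """Convert an HxWx3 uint8 RGB array to a 1x3xHxW float32 batch in [0, 1]."""
    photo = np.asarray(rgb_hwc, dtype=np.float32) / 255.0   # scale to [0, 1]
    photo = np.transpose(photo, (2, 0, 1))                  # HWC -> CHW
    return np.expand_dims(photo, axis=0)                    # add the batch dimension

# Synthetic stand-in for an image already resized to 416x416
fake_image = np.random.randint(0, 256, size=(416, 416, 3), dtype=np.uint8)
batch = to_model_input(fake_image)
print(batch.shape, batch.dtype)
```

If the printed shape is not `(1, 3, 416, 416)`, the ONNX and om models exported with that input shape will most likely reject the array.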

4.test_pth.py

import os
from PIL import Image, ImageDraw, ImageFont
from post_process import *

img_path="data/img/test1.jpg"
image = Image.open(img_path)
img=get_imgges(image)

outputs=prediect(img)
result(outputs, image)

print(outputs[0].shape, outputs[1].shape, outputs[2].shape)

5.pth3onnx.py

import torch


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the pickled PyTorch model onto the same device as the dummy input
model = torch.load("data/model1/test.pth", map_location=device)
model.eval()

input_shape=list(map(int, "1,3,416,416".split(",")))
x = torch.randn(input_shape)   # dummy input tensor used to trace the graph
x = x.to(device)

export_onnx_file = "data/model1/test.onnx"      # output ONNX file name
#torch.onnx.export(model, x, export_onnx_file, verbose=True)
torch.onnx.export(model, x, export_onnx_file, verbose=True, export_params=True, do_constant_folding=True, opset_version=11)

6.test_onnx_v1.py

import cv2
import numpy as np
import onnxruntime as rt
import torch
from PIL import Image
from post_process import *


def to_numpy(tensor):
    return tensor.detach().cpu().numpy() if tensor.requires_grad else tensor.cpu().numpy()

def onnx_runtime(img):
    sess = rt.InferenceSession("data/model1/test.onnx")
    inputs = {sess.get_inputs()[0].name: img}
    output = sess.run(None, inputs)

    outputs=[]
    for i in range(3):
        outputs.append(torch.from_numpy(output[i]))

    outputs=tuple(outputs)
    return outputs

img_path="data/img/test1.jpg"
image = Image.open(img_path)
img=get_imgges(image)
print(img.shape)

outputs=onnx_runtime(img)
result(outputs, image)
print(type(outputs), outputs[0].shape, outputs[1].shape, outputs[2].shape)

7.test_onnx_v2.py

import numpy as np
import torch
import onnx
import onnxruntime as rt
import pickle

# Test data
x = torch.randn(1,3,416,416, requires_grad=False)

# Validate the ONNX model with the ONNX checker API
onnx_model = onnx.load("data/model1/test.onnx")
onnx.checker.check_model(onnx_model)

# Run the ONNX model
sess = rt.InferenceSession("data/model1/test.onnx")
def to_numpy(tensor):
    return tensor.detach().cpu().numpy() if tensor.requires_grad else tensor.cpu().numpy()

# ONNX Runtime outputs
ort_inputs = {sess.get_inputs()[0].name: to_numpy(x)}
ort_outs = sess.run(None, ort_inputs)
print(x.shape, ort_outs[0].shape)

# Run the PyTorch model
model=torch.load("data/model1/test.pth",map_location='cpu')
model.eval()
with torch.no_grad():
    torch_out = model(x)
print(x.shape, torch_out[0].shape)

# Compare the ONNX and PyTorch results (first output only)
np.testing.assert_allclose(to_numpy(torch_out[0]), ort_outs[0], rtol=1e-03, atol=1e-05)
print("No significant difference between the models!")
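
The check above compares only the first of the three outputs. A minimal sketch of extending it to every output head follows, assuming both runtimes return their outputs in the same order. `compare_outputs` is a hypothetical helper, and the arrays are synthetic stand-ins for real model outputs.

```python
import numpy as np

def compare_outputs(ref_outs, test_outs, rtol=1e-3, atol=1e-5):
    """Return the max absolute difference per output; raise if any pair differs."""
    if len(ref_outs) != len(test_outs):
        raise ValueError("output counts differ")
    max_diffs = []
    for idx, (ref, test) in enumerate(zip(ref_outs, test_outs)):
        ref = np.asarray(ref, dtype=np.float64)
        test = np.asarray(test, dtype=np.float64)
        np.testing.assert_allclose(test, ref, rtol=rtol, atol=atol,
                                   err_msg="output %d differs" % idx)
        max_diffs.append(float(np.max(np.abs(ref - test))))
    return max_diffs

# Synthetic stand-ins: three feature maps plus a tiny perturbation
rng = np.random.default_rng(0)
ref_outs = [rng.standard_normal((1, 24, s, s)).astype(np.float32) for s in (13, 26, 52)]
test_outs = [o + 1e-6 for o in ref_outs]
max_diffs = compare_outputs(ref_outs, test_outs)
print(max_diffs)
```

With the real models, the two lists would be `[to_numpy(t) for t in torch_out]` and `ort_outs`.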

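III. Modified post-processing (using the YOLOv3 post-processing)

This part switches to the YOLOv3-style post-processing: the network is cut off at its three raw outputs (large, medium and small scale), and those outputs are then fed into a decode step and an NMS step. Reference: https://blog.csdn.net/gm_Ergou/article/details/118573834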
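
To make the data flow concrete, the sketch below traces only the tensor shapes through that pipeline for a 416x416 input. It assumes 3 anchors per scale and 3 classes (the coin labels 0, 1, 2 in train.txt); `decoded_shape` is a hypothetical helper, not part of the project code.

```python
import numpy as np

NUM_ANCHORS = 3           # anchors per output scale (assumption)
NUM_CLASSES = 3           # assumption: coin classes 0, 1, 2
ATTRS = 5 + NUM_CLASSES   # x, y, w, h, objectness + class scores

def decoded_shape(feature_map):
    """Reshape a raw head (B, A*ATTRS, H, W) to (B, A*H*W, ATTRS), as the decode step does."""
    b, _, h, w = feature_map.shape
    t = feature_map.reshape(b, NUM_ANCHORS, ATTRS, h, w).transpose(0, 1, 3, 4, 2)
    return t.reshape(b, -1, ATTRS)

# Zero-filled stand-ins for the three raw outputs at strides 32, 16 and 8
heads = [np.zeros((1, NUM_ANCHORS * ATTRS, s, s), dtype=np.float32) for s in (13, 26, 52)]
decoded = [decoded_shape(h) for h in heads]
merged = np.concatenate(decoded, axis=1)   # this is what the NMS step receives
print([d.shape for d in decoded], merged.shape)
```

The merged tensor holds 507 + 2028 + 8112 = 10647 candidate boxes per image, each with 8 attributes under these assumptions.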

1.post_process2.py

import cv2
import numpy as np
import os
import colorsys
from PIL import Image, ImageDraw, ImageFont



def get_class(classes_path):
    classes_path = os.path.expanduser(classes_path)
    with open(classes_path) as f:
        class_names = f.readlines()
    class_names = [c.strip() for c in class_names]
    return class_names

def get_anchors(anchors_path):
    anchors_path = os.path.expanduser(anchors_path)
    with open(anchors_path) as f:
        anchors = f.readline()
    anchors = [float(x) for x in anchors.split(',')]
    return np.array(anchors).reshape([-1, 3, 2])[::-1,:,:]

def sigmoid(x):
    x_ravel = x.ravel()  # flatten the numpy array
    length = len(x_ravel)
    y = []
    for index in range(length):
        if x_ravel[index] >= 0:
            y.append(1.0 / (1 + np.exp(-x_ravel[index])))
        else:
            y.append(np.exp(x_ravel[index]) / (np.exp(x_ravel[index]) + 1))
    return np.array(y).reshape(x.shape)

def letterbox_image(image, size):
    iw, ih = image.size
    w, h = size
    scale = min(w/iw, h/ih)
    nw = int(iw*scale)
    nh = int(ih*scale)

    image = image.resize((nw,nh), Image.BICUBIC)
    new_image = Image.new('RGB', size, (128,128,128))
    new_image.paste(image, ((w-nw)//2, (h-nh)//2))
    return new_image

# Preprocess the input image
def get_imgges(image, letterbox):
    if letterbox:
        crop_img = np.array(letterbox_image(image, (416,416)))
    else:
        crop_img = image.convert('RGB')
        crop_img = crop_img.resize((416,416), Image.BICUBIC)

    photo = np.array(crop_img,dtype = np.float32) / 255.0
    photo = np.transpose(photo, (2, 0, 1))
    img = [photo]
    img=np.asarray(img)

    return img


def DecodeBox2(anchors, num_classes, img_size, input):
    num_anchors = len(anchors)
    bbox_attrs = 5 + num_classes

    batch_size = input.shape[0]
    input_height = input.shape[2]
    input_width = input.shape[3]
    # print(batch_size, input_height, input_width, input.shape)

    stride_h = img_size[1] / input_height
    stride_w = img_size[0] / input_width

    scaled_anchors = [(anchor_width / stride_w, anchor_height / stride_h) for anchor_width, anchor_height in anchors]

    # prediction = input.view(batch_size, num_anchors, bbox_attrs, input_height, input_width).permute(0, 1, 3, 4, 2).contiguous()
    a = input.reshape(batch_size, num_anchors, bbox_attrs, input_height, input_width).transpose(0, 1, 3, 4, 2)
    prediction = np.copy(a)
    # print(prediction, prediction.shape)

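    # Offsets of the box centre relative to its grid cell
    x = sigmoid(prediction[..., 0])
    y = sigmoid(prediction[..., 1])
    # Width/height adjustment parameters of the anchor boxes
    w = prediction[..., 2]
    h = prediction[..., 3]
    # Objectness confidence (whether an object is present)
    conf = sigmoid(prediction[..., 4])
    # Per-class confidence
    pred_cls = sigmoid(prediction[..., 5:])

    # Build the grid: the top-left corner of each cell is the origin for the box centre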
    grid_x = np.linspace(0, input_width - 1, input_width)
    grid_x = np.tile(np.tile(grid_x, (input_height, 1)), (batch_size * num_anchors, 1, 1))
    grid_x = grid_x.reshape(x.shape).astype(np.float16)

    grid_y = np.linspace(0, input_height - 1, input_height)
    grid_y = np.tile(np.tile(grid_y, (input_width, 1)).T, (batch_size * num_anchors, 1, 1))
    grid_y = grid_y.reshape(y.shape).astype(np.float16)
    # print(grid_y, grid_y.shape)

    # Broadcast the anchor widths and heights to the grid layout
    anchor_w = np.array(scaled_anchors).astype(np.float16)[:,0].reshape(len(scaled_anchors),1)  # len(scaled_anchors)=3
    anchor_h = np.array(scaled_anchors).astype(np.float16)[:,1].reshape(len(scaled_anchors),1)
    anchor_w = np.tile(np.tile(anchor_w, (batch_size, 1)), (1, 1, input_height * input_width)).reshape(w.shape)
    anchor_h = np.tile(np.tile(anchor_h, (batch_size, 1)), (1, 1, input_height * input_width)).reshape(h.shape)
    # print(anchor_w,anchor_h)
    # print(anchor_w.shape, anchor_h.shape)

    #----------------------------------------------------------#
    #   Adjust the anchor boxes with the predictions:
    #   first shift the centre from the cell's top-left corner
    #   towards the bottom-right, then rescale width and height.
    #   (The inputs are numpy arrays, so no torch-style .data is needed.)
    #----------------------------------------------------------#
    pred_boxes = np.zeros(shape=prediction[..., :4].shape)
    pred_boxes[..., 0] = x + grid_x
    pred_boxes[..., 1] = y + grid_y
    pred_boxes[..., 2] = np.exp(w) * anchor_w
    pred_boxes[..., 3] = np.exp(h) * anchor_h
    # print(pred_boxes)

    #----------------------------------------------------------#
    #   Rescale the boxes to the network input size
    #----------------------------------------------------------#
    _scale=np.array([stride_w, stride_h] * 2).astype(np.float16)
    output = np.concatenate((pred_boxes.reshape(batch_size, -1, 4) * _scale,
        conf.reshape(batch_size, -1, 1), pred_cls.reshape(batch_size, -1, num_classes)), -1)
    
    return output      

def yolo_correct_boxes(top, left, bottom, right, input_shape, image_shape):
    new_shape = image_shape*np.min(input_shape/image_shape)

    offset = (input_shape-new_shape)/2./input_shape
    scale = input_shape/new_shape

    box_yx = np.concatenate(((top+bottom)/2,(left+right)/2),axis=-1)/input_shape
    box_hw = np.concatenate((bottom-top,right-left),axis=-1)/input_shape

    box_yx = (box_yx - offset) * scale
    box_hw *= scale

    box_mins = box_yx - (box_hw / 2.)
    box_maxes = box_yx + (box_hw / 2.)
    boxes =  np.concatenate([
        box_mins[:, 0:1],
        box_mins[:, 1:2],
        box_maxes[:, 0:1],
        box_maxes[:, 1:2]
    ],axis=-1)
    boxes *= np.concatenate([image_shape, image_shape],axis=-1)
    return boxes


def bbox_iou2(box1, box2, x1y1x2y2=True):
    """
        Compute the IoU between two sets of boxes
    """
    if not x1y1x2y2:
        b1_x1, b1_x2 = box1[:, 0] - box1[:, 2] / 2, box1[:, 0] + box1[:, 2] / 2
        b1_y1, b1_y2 = box1[:, 1] - box1[:, 3] / 2, box1[:, 1] + box1[:, 3] / 2
        b2_x1, b2_x2 = box2[:, 0] - box2[:, 2] / 2, box2[:, 0] + box2[:, 2] / 2
        b2_y1, b2_y2 = box2[:, 1] - box2[:, 3] / 2, box2[:, 1] + box2[:, 3] / 2
    else:
        b1_x1, b1_y1, b1_x2, b1_y2 = box1[:, 0], box1[:, 1], box1[:, 2], box1[:, 3]
        b2_x1, b2_y1, b2_x2, b2_y2 = box2[:, 0], box2[:, 1], box2[:, 2], box2[:, 3]

    inter_rect_x1 = np.maximum(b1_x1, b2_x1)
    inter_rect_y1 = np.maximum(b1_y1, b2_y1)
    inter_rect_x2 = np.minimum(b1_x2, b2_x2)
    inter_rect_y2 = np.minimum(b1_y2, b2_y2)

    # Clamp negative extents to zero so that disjoint boxes get zero intersection
    inter_w = np.maximum(inter_rect_x2 - inter_rect_x1 + 1, 0)
    inter_h = np.maximum(inter_rect_y2 - inter_rect_y1 + 1, 0)
    inter_area = inter_w * inter_h
    
    b1_area = (b1_x2 - b1_x1 + 1) * (b1_y2 - b1_y1 + 1)
    b2_area = (b2_x2 - b2_x1 + 1) * (b2_y2 - b2_y1 + 1)

    iou = inter_area / (b1_area + b2_area - inter_area + 1e-16)

    return iou
     
def non_max_suppression2(prediction, num_classes, conf_thres=0.5, nms_thres=0.3):
    box_corner = np.zeros(shape=prediction.shape)
    box_corner[:, :, 0] = prediction[:, :, 0] - prediction[:, :, 2] / 2
    box_corner[:, :, 1] = prediction[:, :, 1] - prediction[:, :, 3] / 2
    box_corner[:, :, 2] = prediction[:, :, 0] + prediction[:, :, 2] / 2
    box_corner[:, :, 3] = prediction[:, :, 1] + prediction[:, :, 3] / 2
    prediction[:, :, :4] = box_corner[:, :, :4]

    output = [None for _ in range(len(prediction))]
    for image_i, image_pred in enumerate(prediction):
        data=image_pred[:, 5:5 + num_classes]
        class_conf=np.max(data, axis=1).reshape(len(data),1)
        class_pred=data.argmax(axis=1).reshape(len(data),1)

        #----------------------------------------------------------#
        #   First filtering pass: objectness * class confidence
        #----------------------------------------------------------#
        conf_mask = image_pred[:, 4] * class_conf[:, 0] >= conf_thres

        #----------------------------------------------------------#
        #   Keep only the predictions that pass the confidence threshold
        #----------------------------------------------------------#
        image_pred = image_pred[conf_mask]
        class_conf = class_conf[conf_mask]
        class_pred = class_pred[conf_mask]

        if len(image_pred)<=0:
            continue

        # detections: [num_anchors, 7], columns are x1, y1, x2, y2, obj_conf, class_conf, class_pred
        detections = np.concatenate((image_pred[:, :5], class_conf.astype(np.float16), class_pred.astype(np.float16)), 1)

        # Collect every class present in the predictions
        unique_labels = np.unique(detections[:, -1])

        for c in unique_labels:
            detections_class = detections[detections[:, -1] == c]
            
            # Sort by objectness * class confidence, highest first
            conf_sort_index = np.argsort(-(detections_class[:, 4]*detections_class[:, 5]), axis=0)
            detections_class = detections_class[conf_sort_index]

            # Non-maximum suppression
            max_detections = []
            while detections_class.shape[0]>0:
                # Take the highest-confidence box of this class, then drop every remaining box whose overlap with it is at least nms_thres
                max_detections.append(np.expand_dims(detections_class[0],axis=0))
                if len(detections_class) == 1:
                    break
                
                ious = bbox_iou2(max_detections[-1], detections_class[1:])
                detections_class = detections_class[1:][ious < nms_thres]
                
            # Stack the kept boxes
            max_detections = np.concatenate(max_detections)

            # Add max detections to outputs
            output[image_i] = max_detections if output[image_i] is None else np.concatenate((output[image_i], max_detections))

    return output

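def prediect(img, model_path):
    # Load the model (torch is imported here because this module does not import it at the top)
    import torch
    device = torch.device('cpu')
    model = torch.load(model_path, map_location=device)
    model = model.to(device)
    model.eval()

    # Run inference without tracking gradients
    img = torch.tensor(img, dtype=torch.float32)
    with torch.no_grad():
        outputs = model(img)

    return outputs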

def Regression(batch_detections, confidence, image, letterbox):
    # Filter the detections and split them into boxes, scores and labels
    top_index = batch_detections[:,4] * batch_detections[:,5] > confidence
    top_conf = batch_detections[top_index,4]*batch_detections[top_index,5]
    top_label = np.array(batch_detections[top_index,-1],np.int32)
    top_bboxes = np.array(batch_detections[top_index,:4])
    top_xmin, top_ymin, top_xmax, top_ymax = np.expand_dims(top_bboxes[:,0],-1),np.expand_dims(top_bboxes[:,1],-1),np.expand_dims(top_bboxes[:,2],-1),np.expand_dims(top_bboxes[:,3],-1)

    #-----------------------------------------------------------------#
    image_shape = np.array(np.shape(image)[0:2])
    if letterbox:
        boxes = yolo_correct_boxes(top_ymin,top_xmin,top_ymax,top_xmax,np.array([416,416]),image_shape)
    else:
        top_xmin = top_xmin / 416 * image_shape[1]
        top_ymin = top_ymin / 416 * image_shape[0]
        top_xmax = top_xmax / 416 * image_shape[1]
        top_ymax = top_ymax / 416 * image_shape[0]
        boxes = np.concatenate([top_ymin,top_xmin,top_ymax,top_xmax], axis=-1)

    # np.int was removed in NumPy 1.24, so use an explicit integer type
    return boxes.astype(np.int32), top_conf, top_label

def draw_box(boxes, top_conf, top_label, class_names, image):
    font = ImageFont.truetype(font='model_data/simhei.ttf',size=np.floor(3e-2 * np.shape(image)[1] + 0.5).astype('int32'))
    thickness = max((np.shape(image)[0] + np.shape(image)[1]) // 416, 1)

    # Assign a distinct colour to each class
    hsv_tuples = [(x / len(class_names), 1., 1.)
                  for x in range(len(class_names))]
    colors = list(map(lambda x: colorsys.hsv_to_rgb(*x), hsv_tuples))
    colors = list(map(lambda x: (int(x[0] * 255), int(x[1] * 255), int(x[2] * 255)),colors))

    for i, c in enumerate(top_label):
        predicted_class = class_names[c]
        score = top_conf[i]

        top, left, bottom, right = boxes[i]
        top = top - 5
        left = left - 5
        bottom = bottom + 5
        right = right + 5

        top = max(0, np.floor(top + 0.5).astype('int32'))
        left = max(0, np.floor(left + 0.5).astype('int32'))
        bottom = min(np.shape(image)[0], np.floor(bottom + 0.5).astype('int32'))
        right = min(np.shape(image)[1], np.floor(right + 0.5).astype('int32'))

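        # Draw the bounding box
        label = '{} {:.2f}'.format(predicted_class, score)
        draw = ImageDraw.Draw(image)
        # Note: ImageDraw.textsize was removed in Pillow 10; on newer versions,
        # use draw.textbbox((0, 0), label, font=font) and derive width/height from the box.
        label_size = draw.textsize(label, font)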
        label = label.encode('utf-8')
        print(label, top, left, bottom, right)
        
        if top - label_size[1] >= 0:
            text_origin = np.array([left, top - label_size[1]])
        else:
            text_origin = np.array([left, top + 1])

        for i in range(thickness):
            draw.rectangle(
                [left + i, top + i, right - i, bottom - i],
                outline=colors[class_names.index(predicted_class)])
        draw.rectangle(
            [tuple(text_origin), tuple(text_origin + label_size)],
            fill=colors[class_names.index(predicted_class)])
        draw.text(text_origin, str(label,'UTF-8'), fill=(0, 0, 0), font=font)
    # image.show()
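
The `sigmoid` helper in post_process2.py loops over every element in Python, which is slow on the 52x52 head. A vectorized variant with the same numerically stable branching might look like the sketch below; `sigmoid_vec` is a hypothetical name, and this is not the project's implementation.

```python
import numpy as np

def sigmoid_vec(x):
    """Numerically stable sigmoid applied to a whole array at once."""
    x = np.asarray(x, dtype=np.float64)
    e = np.exp(-np.abs(x))            # always in [0, 1], so it never overflows
    # x >= 0: 1 / (1 + exp(-x));  x < 0: exp(x) / (1 + exp(x))
    return np.where(x >= 0, 1.0 / (1.0 + e), e / (1.0 + e))

values = np.array([[-1000.0, -1.0, 0.0], [1.0, 20.0, 1000.0]])
probs = sigmoid_vec(values)
print(probs)
```

Because the exponent is always non-positive, extreme logits such as -1000 or 1000 produce 0 and 1 rather than an overflow warning.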

2.test_om.py

import sys
import onnx
import os
import argparse
import numpy as np
import cv2
import onnxruntime
import torch

import colorsys
from PIL import Image, ImageDraw, ImageFont
import post_process2 as post_process


def letterbox_image2(image, size, letterbox):
    # INTER_NEAREST: nearest-neighbour, INTER_LINEAR: bilinear, INTER_CUBIC: bicubic over a 4x4 pixel neighbourhood, INTER_LANCZOS4: Lanczos over an 8x8 pixel neighbourhood
    if letterbox:
        ih, iw = image.shape[0:2]
        w, h = size
        scale = min(w/iw, h/ih)
        nw = int(iw*scale)
        nh = int(ih*scale)

        image = cv2.resize(image, (nw,nh), interpolation=cv2.INTER_LINEAR)
        img = np.full((h, w, 3), 128, dtype=np.uint8)  # grey canvas, shape is (height, width, channels)
        img[(h-nh)//2:(h-nh)//2+nh, (w-nw)//2:(w-nw)//2+nw]=image
    else:
        img = cv2.resize(image, size, interpolation=cv2.INTER_LINEAR) 

    # cv2.imshow('img',img)
    # cv2.waitKey(0) 
    return img

def letterbox_image(image, size):
    iw, ih = image.size
    w, h = size
    scale = min(w/iw, h/ih)
    nw = int(iw*scale)
    nh = int(ih*scale)

    image = image.resize((nw,nh), Image.BICUBIC)
    new_image = Image.new('RGB', size, (128,128,128))
    new_image.paste(image, ((w-nw)//2, (h-nh)//2))
    # new_image.show()

    return new_image


if __name__ == '__main__':
    # Parameters
    conf_thres=0.5
    nms_thres=0.3
    anchors_path='data/dataset2/coco_anchors.names'
    classes_path='data/dataset2/coins.names'

    image_path="data/img/test1.jpg"
    weight_file='data/model1/test.pth'
    onnx_file_name = 'data/model1/test.onnx'
    

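    # Note: img1 uses the project's preprocessing, img2 is my own implementation, img3 is the input data of the Atlas om model
    # With letterbox=True, img1=img2!=img3; with letterbox=False, img2=img3!=img1 (cv2 and PIL resize differently, so there are small errors)
    # img1: preprocessing from the original code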
    letterbox=True
    image_src = cv2.imread(image_path)
    img1 = cv2.cvtColor(image_src, cv2.COLOR_BGR2RGB)
    img1 = letterbox_image2(img1, (416,416), letterbox)
    img1 = np.transpose(img1, (2, 0, 1)).astype(np.float32) / 255.0
    img1 = np.expand_dims(img1, axis=0)
    print(img1.shape)

    # img2: my own preprocessing, modelled on the YOLOv3 one
    image_src2 = Image.open(image_path)
    if letterbox:
        crop_img = np.array(letterbox_image(image_src2, (416,416)))
    else:
        crop_img = image_src2.convert('RGB')
        crop_img = crop_img.resize((416,416), Image.BILINEAR)  # NEAREST: lowest quality, BILINEAR: bilinear, BICUBIC: bicubic interpolation, ANTIALIAS: highest quality
    
    photo = np.array(crop_img,dtype = np.float32) / 255.0
    photo = np.transpose(photo, (2, 0, 1))
    img2 = np.expand_dims(photo, axis=0)
    print(img2.shape)
    
    # img3: input data of the om model, captured by truncating the graph at the input layer during the atc conversion
    img3=np.load("data/data2/input.npy")
    print(img3.shape)


    # Compute
    session = onnxruntime.InferenceSession(onnx_file_name)
    input_name = session.get_inputs()[0].name
    outputs = session.run(None, {input_name: img3})
    # print(len(outputs))
    conv_sbbox=outputs[0]
    conv_mbbox=outputs[1]
    conv_lbbox=outputs[2]

    input_size=(416, 416)
    class_names = post_process.get_class(classes_path)
    decode_sbbox=post_process.DecodeBox2(post_process.get_anchors(anchors_path)[0], len(class_names),  input_size, conv_sbbox)
    decode_mbbox=post_process.DecodeBox2(post_process.get_anchors(anchors_path)[1], len(class_names),  input_size, conv_mbbox)
    decode_lbbox=post_process.DecodeBox2(post_process.get_anchors(anchors_path)[2], len(class_names),  input_size, conv_lbbox)
    output = np.concatenate([decode_sbbox, decode_mbbox, decode_lbbox], 1)
    print(decode_sbbox.shape, decode_mbbox.shape, decode_lbbox.shape, output.shape)

    batch_detections = post_process.non_max_suppression2(output, len(class_names), conf_thres=conf_thres, nms_thres=nms_thres)
    
    # non_max_suppression2 leaves None for an image without detections
    if batch_detections[0] is None:
        print("No detections!")
        exit()
    batch_detections = np.array(batch_detections[0])
        
    image = Image.open(image_path)
    boxes, top_conf, top_label=post_process.Regression(batch_detections, conf_thres, image, letterbox)
    post_process.draw_box(boxes, top_conf, top_label, class_names, image)
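
When `letterbox=True`, boxes predicted on the padded 416x416 input have to be mapped back to the original image, which is what `yolo_correct_boxes` does in normalized coordinates. The sketch below shows the same idea directly in pixels on one hand-computed case; `unletterbox_box` is a hypothetical helper, and the image size and box are made-up values.

```python
import numpy as np

def unletterbox_box(box_xyxy, input_size, image_size):
    """Map (x1, y1, x2, y2) from the padded network input back to the original image.

    input_size and image_size are (width, height) tuples.
    """
    in_w, in_h = input_size
    img_w, img_h = image_size
    scale = min(in_w / img_w, in_h / img_h)       # same scale the letterboxing used
    pad_x = (in_w - img_w * scale) / 2.0           # grey border on the left/right
    pad_y = (in_h - img_h * scale) / 2.0           # grey border on the top/bottom
    x1, y1, x2, y2 = box_xyxy
    return np.array([(x1 - pad_x) / scale, (y1 - pad_y) / scale,
                     (x2 - pad_x) / scale, (y2 - pad_y) / scale])

# An 800x400 image letterboxed into 416x416: scale 0.52, 104 px of padding top and bottom
restored = unletterbox_box((104, 156, 208, 260), (416, 416), (800, 400))
print(restored)
```

For this input the result is approximately (200, 100, 400, 300) in original-image pixels: the padding is subtracted first, then the scale is undone.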

3.test_om2.py

import cv2
import numpy as np
import os
import colorsys
from PIL import Image, ImageDraw, ImageFont
import post_process2 as post_process


conf_thres=0.5
nms_thres=0.3
letterbox=False
anchors_path='data/dataset2/coco_anchors.names'
classes_path='data/dataset2/coins.names'


if __name__ == '__main__':
    img_path="data/img/test1.jpg"
    image = Image.open(img_path)
    img=post_process.get_imgges(image, letterbox)

    # model_path="data/model4/test.pth"
    # outputs=prediect(img)
    # conv_sbbox=outputs[0].detach().numpy()
    # conv_mbbox=outputs[1].detach().numpy()
    # conv_lbbox=outputs[2].detach().numpy()
    # np.save("data/test/conv_sbbox.npy", conv_sbbox)
    # np.save("data/test/conv_mbbox.npy", conv_mbbox)
    # np.save("data/test/conv_lbbox.npy", conv_lbbox)

    conv_sbbox=np.load("data/data2/conv_sbbox.npy")
    conv_mbbox=np.load("data/data2/conv_mbbox.npy")
    conv_lbbox=np.load("data/data2/conv_lbbox.npy")
    print(conv_sbbox.shape, conv_mbbox.shape, conv_lbbox.shape)

    
    input_size=(416, 416)
    class_names = post_process.get_class(classes_path)
    decode_sbbox=post_process.DecodeBox2(post_process.get_anchors(anchors_path)[0], len(class_names),  input_size, conv_sbbox)
    decode_mbbox=post_process.DecodeBox2(post_process.get_anchors(anchors_path)[1], len(class_names),  input_size, conv_mbbox)
    decode_lbbox=post_process.DecodeBox2(post_process.get_anchors(anchors_path)[2], len(class_names),  input_size, conv_lbbox)
    output = np.concatenate([decode_sbbox, decode_mbbox, decode_lbbox], 1)
    print(decode_sbbox.shape, decode_mbbox.shape, decode_lbbox.shape, output.shape)

    batch_detections = post_process.non_max_suppression2(output, len(class_names), conf_thres=conf_thres, nms_thres=nms_thres)
    print(batch_detections)
    # non_max_suppression2 leaves None for an image without detections
    if batch_detections[0] is None:
        print("No detections!")
        exit()
    batch_detections = np.array(batch_detections[0])

    boxes, top_conf, top_label=post_process.Regression(batch_detections, conf_thres, image, letterbox)
    post_process.draw_box(boxes, top_conf, top_label, class_names, image)
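
The greedy suppression loop in `non_max_suppression2` can be seen in isolation on three hand-made boxes. This is a simplified single-class sketch, not the project code: `iou_xyxy` and `greedy_nms` are hypothetical helpers, and the IoU here omits the `+ 1` pixel convention used by `bbox_iou2`.

```python
import numpy as np

def iou_xyxy(box, boxes):
    """IoU between one box and an array of boxes, all as (x1, y1, x2, y2)."""
    ix1 = np.maximum(box[0], boxes[:, 0])
    iy1 = np.maximum(box[1], boxes[:, 1])
    ix2 = np.minimum(box[2], boxes[:, 2])
    iy2 = np.minimum(box[3], boxes[:, 3])
    inter = np.maximum(ix2 - ix1, 0) * np.maximum(iy2 - iy1, 0)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter + 1e-16)

def greedy_nms(boxes, scores, nms_thres=0.3):
    """Return the indices of the boxes kept by greedy NMS, highest score first."""
    order = np.argsort(-scores)
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        ious = iou_xyxy(boxes[best], boxes[rest])
        order = rest[ious < nms_thres]    # drop boxes that overlap the kept one
    return keep

boxes = np.array([[10, 10, 110, 110],     # A
                  [20, 20, 120, 120],     # B, overlaps A heavily
                  [200, 200, 300, 300]],  # C, far away from both
                 dtype=np.float64)
scores = np.array([0.9, 0.8, 0.7])
kept = greedy_nms(boxes, scores)
print(kept)
```

Box B overlaps A with an IoU of about 0.68, above the 0.3 threshold, so only A and C survive.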

ES-TEST gengzg test
package com.MarkNum; import java.io.IOException; import java.util.Date; import java.util.HashMap; import java.util.Map; import javax.servlet.ServletException; import javax.servlet.annotation
为何外键不再推荐使用 hugh.wang mysql DB
表的关联，是一种逻辑关系，并不需要进行物理上的“硬关联”，而且你所期望的关联，其实只是其数据上存在一定的联系而已，而这种联系实际上是在设计之初就定义好的固有逻辑。在业务代码中实现的时候，只要按照设计之初的这种固有关联逻辑来处理数据即可，并不需要在数据库层面进行“硬关联”，因为在数据库层面通过使用外键的方式进行“硬关联”，会带来很多额外的资源消耗来进行一致性和完整性校验，即使很多时候我们并不
领域驱动设计 julyflame VO DAO 设计模式 DTO po
概念： VO（View Object）：视图对象，用于展示层，它的作用是把某个指定页面（或组件）的所有数据封装起来。 DTO（Data Transfer Object）：数据传输对象，这个概念来源于J2EE的设计模式，原来的目的是为了EJB的分布式应用提供粗粒度的数据实体，以减少分布式调用的次数，从而提高分布式调用的性能和降低网络负载，但在这里，我泛指用于展示层与服务层之间的数据传输对
单例设计模式 hm4123660 java Singleton 单例设计模式懒汉式饿汉式
单例模式是一种常用的软件设计模式。在它的核心结构中只包含一个被称为单例类的特殊类。通过单例模式可以保证系统中一个类只有一个实例而且该实例易于外界访问，从而方便对实例个数的控制并节约系统源。如果希望在系统中某个类的对象只能存在一个，单例模式是最好的解决方案。 &nb
logback zhb8015 log logback
一、logback的介绍 Logback是由log4j创始人设计的又一个开源日志组件。logback当前分成三个模块：logback-core,logback- classic和logback-access。logback-core是其它两个模块的基础模块。logback-classic是log4j的一个改良版本。此外logback-class
整合Kafka到Spark Streaming——代码示例和挑战 Stark_Summer spark storm zookeeper PARALLELISM processing
作者Michael G. Noll是瑞士的一位工程师和研究员，效力于Verisign，是Verisign实验室的大规模数据分析基础设施（基础Hadoop）的技术主管。本文，Michael详细的演示了如何将Kafka整合到Spark Streaming中。期间， Michael还提到了将Kafka整合到 Spark Streaming中的一些现状，非常值得阅读，虽然有一些信息在Spark 1.2版
spring-master-slave-commondao 王新春 DAO spring dataSource slave master
互联网的web项目，都有个特点：请求的并发量高，其中请求最耗时的db操作，又是系统优化的重中之重。为此，往往搭建 db的一主多从库的数据库架构。作为web的DAO层，要保证针对主库进行写操作，对多个从库进行读操作。当然在一些请求中，为了避免主从复制的延迟导致的数据不一致性，部分的读操作也要到主库上。（这种需求一般通过业务垂直分开，比如下单业务的代码所部署的机器，读去应该也要从主库读取数