Converting yolov3_tiny.onnx to TRT: accelerating model inference with TensorRT

Since the previous post already covered converting yolov3-tiny.weights to ONNX and testing the inference, let's go one step further and convert the model to a TRT engine. That keeps the series coherent and should make it more practically useful.

(If you haven't read the yolov3-tiny-to-ONNX post, click here and I'll fly you over.)

This post picks up right where the previous one left off, so let's get straight to the point: code!!

Contents of this post:

  1. Converting the ONNX model to a TRT file (yolov3-tiny.onnx);
  2. Running inference with TensorRT;
  3. Testing the inference results (on a video file);

First, you need TensorRT installed; plenty of other posts cover installation, so I won't. The TensorRT version used in this post:

Python 3.5.6 |Anaconda, Inc.| (default, Aug 26 2018, 21:41:56) 
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorrt as trt
>>> print(trt.__version__)
6.0.1.5

Test hardware:

RTX 2060 (laptop), 6 GB
i7-9750H
16 GB DDR4-2666

1. Converting the ONNX model to a TRT file

Straight to the code; just run it (the script converts both yolov3.onnx and yolov3-tiny.onnx to TRT files).
onnx_to_trt.py

# -*-coding: utf-8-*-
# author: HXY
# 2019-12-24
"""
tensorrt6.0
"""

import os
import tensorrt as trt

TRT_LOGGER = trt.Logger()


class GetTrt(object):
    def __init__(self, onnx_file_path, trt_save_path):
        self.onnx_file_path = onnx_file_path
        self.batch_size = 1
        self.fp16_on = True
        self.trt_save_path = trt_save_path

    def build_engine(self):
        with trt.Builder(TRT_LOGGER) as builder, \
                builder.create_network() as network, \
                trt.OnnxParser(network, TRT_LOGGER) as parser:
            builder.max_workspace_size = 1 << 30  # 30:1GB; 28:256MiB
            builder.max_batch_size = 1
            builder.fp16_mode = self.fp16_on
            if not os.path.exists(self.onnx_file_path):
                print("Onnx file not found")
                exit(1)
            print("loading onnx file from path {}".format(self.onnx_file_path))
            with open(self.onnx_file_path, 'rb') as model:
                # Surface parser errors here instead of failing later with a None engine:
                if not parser.parse(model.read()):
                    for i in range(parser.num_errors):
                        print(parser.get_error(i))
                    exit(1)
            print("Completed parsing of onnx file....")
            print("building an engine..this may take a while....")
            # network.get_input(0).shape = [1, 3, 416, 416]
            engine = builder.build_cuda_engine(network)
            if engine is None:
                print("engine build failed")
                exit(1)
            with open(self.trt_save_path, 'wb') as f:
                f.write(engine.serialize())
            print("create engine completed")


"""
test function
"""
if __name__ == '__main__':
    test = GetTrt('./yolov3-tiny.onnx',
                  './yolov3-tiny.trt')
    test.build_engine()
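As an aside, TensorRT also ships a trtexec command-line tool that can build an FP16 engine from the same ONNX file without any Python. Roughly like this (flag support varies a bit between TensorRT versions, so treat it as a sketch):

trtexec --onnx=yolov3-tiny.onnx --fp16 --saveEngine=yolov3-tiny.trt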

Once the script runs successfully you'll have a TRT file, and you can move on to inference testing.
Here's a screenshot of a successful run (uh, there's a warning in there, just ignore it, haha):
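If you want a quick sanity check before wiring up the full pipeline, the sketch below just deserializes the engine and prints its bindings. This is a little helper of my own, not part of the original tooling; it assumes yolov3-tiny.trt sits in the current directory.

# check_engine.py: minimal engine sanity check
import tensorrt as trt

TRT_LOGGER = trt.Logger()
with open('yolov3-tiny.trt', 'rb') as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
    for i in range(engine.num_bindings):
        print(i, engine.get_binding_name(i), engine.get_binding_shape(i))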

2. Running inference with TensorRT

Inference needs a bit of supporting code. I suggest keeping these files together in one folder so the inference scripts can import them later; one possible layout is sketched right after this paragraph.
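For reference, here is one folder layout that satisfies the import paths used below (lib.common, lib.data_processing, lib.trt_inference) and the ../config/lxz.txt label path hard-coded in data_processing.py; adjust it to your own setup:

project/
├── onnx_to_trt.py
├── trt_video_inference.py
├── names.txt
├── test.mp4
├── yolov3-tiny.trt
├── config/
│   └── lxz.txt
└── lib/
    ├── common.py
    ├── data_processing.py
    └── trt_inference.py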

First up is common.py. You can use it as-is; if you're curious, feel free to dig into it!

#
# Copyright 1993-2019 NVIDIA Corporation.  All rights reserved.
#
# NOTICE TO LICENSEE:
#
# This source code and/or documentation ("Licensed Deliverables") are
# subject to NVIDIA intellectual property rights under U.S. and
# international Copyright laws.
#
# These Licensed Deliverables contained herein is PROPRIETARY and
# CONFIDENTIAL to NVIDIA and is being provided under the terms and
# conditions of a form of NVIDIA software license agreement by and
# between NVIDIA and Licensee ("License Agreement") or electronically
# accepted by Licensee.  Notwithstanding any terms or conditions to
# the contrary in the License Agreement, reproduction or disclosure
# of the Licensed Deliverables to any third party without the express
# written consent of NVIDIA is prohibited.
#
# NOTWITHSTANDING ANY TERMS OR CONDITIONS TO THE CONTRARY IN THE
# LICENSE AGREEMENT, NVIDIA MAKES NO REPRESENTATION ABOUT THE
# SUITABILITY OF THESE LICENSED DELIVERABLES FOR ANY PURPOSE.  IT IS
# PROVIDED "AS IS" WITHOUT EXPRESS OR IMPLIED WARRANTY OF ANY KIND.
# NVIDIA DISCLAIMS ALL WARRANTIES WITH REGARD TO THESE LICENSED
# DELIVERABLES, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY,
# NONINFRINGEMENT, AND FITNESS FOR A PARTICULAR PURPOSE.
# NOTWITHSTANDING ANY TERMS OR CONDITIONS TO THE CONTRARY IN THE
# LICENSE AGREEMENT, IN NO EVENT SHALL NVIDIA BE LIABLE FOR ANY
# SPECIAL, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, OR ANY
# DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
# WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS
# ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE
# OF THESE LICENSED DELIVERABLES.
#
# U.S. Government End Users.  These Licensed Deliverables are a
# "commercial item" as that term is defined at 48 C.F.R. 2.101 (OCT
# 1995), consisting of "commercial computer software" and "commercial
# computer software documentation" as such terms are used in 48
# C.F.R. 12.212 (SEPT 1995) and is provided to the U.S. Government
# only as a commercial end item.  Consistent with 48 C.F.R.12.212 and
# 48 C.F.R. 227.7202-1 through 227.7202-4 (JUNE 1995), all
# U.S. Government End Users acquire the Licensed Deliverables with
# only those rights set forth herein.
#
# Any use of the Licensed Deliverables in individual and commercial
# software must include, in the user documentation and internal
# comments to the code, the above Disclaimer and U.S. Government End
# Users Notice.
#

from itertools import chain
import argparse
import os

import pycuda.driver as cuda
import pycuda.autoinit
import numpy as np

import tensorrt as trt

try:
    # Sometimes python2 does not understand FileNotFoundError
    FileNotFoundError
except NameError:
    FileNotFoundError = IOError


def GiB(val):
    return val * 1 << 30


def find_sample_data(description="Runs a TensorRT Python sample", subfolder="", find_files=[]):
    '''
    Parses sample arguments.

    Args:
        description (str): Description of the sample.
        subfolder (str): The subfolder containing data relevant to this sample
        find_files (str): A list of filenames to find. Each filename will be replaced with an absolute path.

    Returns:
        str: Path of data directory.
    '''

    # Standard command-line arguments for all samples.
    kDEFAULT_DATA_ROOT = os.path.join(os.sep, "usr", "src", "tensorrt", "data")
    parser = argparse.ArgumentParser(description=description, formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser.add_argument("-d", "--datadir",
                        help="Location of the TensorRT sample data directory, and any additional data directories.",
                        action="append", default=[kDEFAULT_DATA_ROOT])
    args, _ = parser.parse_known_args()

    def get_data_path(data_dir):
        # If the subfolder exists, append it to the path, otherwise use the provided path as-is.
        data_path = os.path.join(data_dir, subfolder)
        if not os.path.exists(data_path):
            print("WARNING: " + data_path + " does not exist. Trying " + data_dir + " instead.")
            data_path = data_dir
        # Make sure data directory exists.
        if not (os.path.exists(data_path)):
            print("WARNING: {:} does not exist. Please provide the correct data path with the -d option.".format(
                data_path))
        return data_path

    data_paths = [get_data_path(data_dir) for data_dir in args.datadir]
    return data_paths, locate_files(data_paths, find_files)


def locate_files(data_paths, filenames):
    """
    Locates the specified files in the specified data directories.
    If a file exists in multiple data directories, the first directory is used.

    Args:
        data_paths (List[str]): The data directories.
        filename (List[str]): The names of the files to find.

    Returns:
        List[str]: The absolute paths of the files.

    Raises:
        FileNotFoundError if a file could not be located.
    """
    found_files = [None] * len(filenames)
    for data_path in data_paths:
        # Find all requested files.
        for index, (found, filename) in enumerate(zip(found_files, filenames)):
            if not found:
                file_path = os.path.abspath(os.path.join(data_path, filename))
                if os.path.exists(file_path):
                    found_files[index] = file_path

    # Check that all files were found
    for f, filename in zip(found_files, filenames):
        if not f or not os.path.exists(f):
            raise FileNotFoundError("Could not find {:}. Searched in data paths: {:}".format(filename, data_paths))
    return found_files


# Simple helper data class that's a little nicer to use than a 2-tuple.
class HostDeviceMem(object):
    def __init__(self, host_mem, device_mem):
        self.host = host_mem
        self.device = device_mem

    def __str__(self):
        return "Host:\n" + str(self.host) + "\nDevice:\n" + str(self.device)

    def __repr__(self):
        return self.__str__()


# Allocates all buffers required for an engine, i.e. host/device inputs/outputs.
def allocate_buffers(engine):
    inputs = []
    outputs = []
    bindings = []
    stream = cuda.Stream()
    for binding in engine:
        size = trt.volume(engine.get_binding_shape(binding)) * engine.max_batch_size
        dtype = trt.nptype(engine.get_binding_dtype(binding))
        # Allocate host and device buffers
        host_mem = cuda.pagelocked_empty(size, dtype)
        device_mem = cuda.mem_alloc(host_mem.nbytes)
        # Append the device buffer to device bindings.
        bindings.append(int(device_mem))
        # Append to the appropriate list.
        if engine.binding_is_input(binding):
            inputs.append(HostDeviceMem(host_mem, device_mem))
        else:
            outputs.append(HostDeviceMem(host_mem, device_mem))
    return inputs, outputs, bindings, stream


# This function is generalized for multiple inputs/outputs.
# inputs and outputs are expected to be lists of HostDeviceMem objects.
def do_inference(context, bindings, inputs, outputs, stream, batch_size=1):
    # Transfer input data to the GPU.
    [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]
    # Run inference.
    context.execute_async(batch_size=batch_size, bindings=bindings, stream_handle=stream.handle)
    # Transfer predictions back from the GPU.
    [cuda.memcpy_dtoh_async(out.host, out.device, stream) for out in outputs]
    # Synchronize the stream
    stream.synchronize()
    # Return only the host outputs.
    return [out.host for out in outputs]
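For orientation, here is a minimal sketch of how these helpers chain together; it assumes engine is an already-deserialized ICudaEngine and image is a C-contiguous float32 NCHW array matching the input binding (the real wiring appears in trt_inference.py below):

with engine.create_execution_context() as context:
    inputs, outputs, bindings, stream = allocate_buffers(engine)
    inputs[0].host = image  # copy the preprocessed frame into the input host buffer
    trt_outputs = do_inference(context, bindings=bindings, inputs=inputs,
                               outputs=outputs, stream=stream)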

Data pre-processing code:

data_processing.py

#
# Copyright 1993-2019 NVIDIA Corporation.  All rights reserved.
#
# NOTICE TO LICENSEE:
#
# This source code and/or documentation ("Licensed Deliverables") are
# subject to NVIDIA intellectual property rights under U.S. and
# international Copyright laws.
#
# These Licensed Deliverables contained herein is PROPRIETARY and
# CONFIDENTIAL to NVIDIA and is being provided under the terms and
# conditions of a form of NVIDIA software license agreement by and
# between NVIDIA and Licensee ("License Agreement") or electronically
# accepted by Licensee.  Notwithstanding any terms or conditions to
# the contrary in the License Agreement, reproduction or disclosure
# of the Licensed Deliverables to any third party without the express
# written consent of NVIDIA is prohibited.
#
# NOTWITHSTANDING ANY TERMS OR CONDITIONS TO THE CONTRARY IN THE
# LICENSE AGREEMENT, NVIDIA MAKES NO REPRESENTATION ABOUT THE
# SUITABILITY OF THESE LICENSED DELIVERABLES FOR ANY PURPOSE.  IT IS
# PROVIDED "AS IS" WITHOUT EXPRESS OR IMPLIED WARRANTY OF ANY KIND.
# NVIDIA DISCLAIMS ALL WARRANTIES WITH REGARD TO THESE LICENSED
# DELIVERABLES, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY,
# NONINFRINGEMENT, AND FITNESS FOR A PARTICULAR PURPOSE.
# NOTWITHSTANDING ANY TERMS OR CONDITIONS TO THE CONTRARY IN THE
# LICENSE AGREEMENT, IN NO EVENT SHALL NVIDIA BE LIABLE FOR ANY
# SPECIAL, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, OR ANY
# DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
# WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS
# ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE
# OF THESE LICENSED DELIVERABLES.
#
# U.S. Government End Users.  These Licensed Deliverables are a
# "commercial item" as that term is defined at 48 C.F.R. 2.101 (OCT
# 1995), consisting of "commercial computer software" and "commercial
# computer software documentation" as such terms are used in 48
# C.F.R. 12.212 (SEPT 1995) and is provided to the U.S. Government
# only as a commercial end item.  Consistent with 48 C.F.R.12.212 and
# 48 C.F.R. 227.7202-1 through 227.7202-4 (JUNE 1995), all
# U.S. Government End Users acquire the Licensed Deliverables with
# only those rights set forth herein.
#
# Any use of the Licensed Deliverables in individual and commercial
# software must include, in the user documentation and internal
# comments to the code, the above Disclaimer and U.S. Government End
# Users Notice.
#
# -*-coding: utf-8-*-
import math
from PIL import Image
import numpy as np
import os


# YOLOv3-608 has been trained with these 80 categories from COCO:
# Lin, Tsung-Yi, et al. "Microsoft COCO: Common Objects in Context."
# European Conference on Computer Vision. Springer, Cham, 2014.

def load_label_categories(label_file_path):
    with open(label_file_path) as f:
        return [line.rstrip('\n') for line in f]


LABEL_FILE_PATH = os.path.join(os.path.dirname(os.path.realpath(__file__)), '../config/lxz.txt')
ALL_CATEGORIES = load_label_categories(LABEL_FILE_PATH)

# The original sample expects COCO's 80 classes; this model was trained on 3 custom classes:
CATEGORY_NUM = len(ALL_CATEGORIES)
assert CATEGORY_NUM == 3


# Added: preprocessing for video frames (PIL images already in memory)
class VideoPreprocessYOLO(object):
    """A simple class for reshaping already-decoded video frames (PIL images) to the
    specified input resolution for YOLOv3.
    """

    def __init__(self, yolo_input_resolution):
        """Initialize with the input resolution for YOLOv3, which will stay fixed in this sample.

        Keyword arguments:
        yolo_input_resolution -- two-dimensional tuple with the target network's (spatial)
        input resolution in HW order
        """
        self.yolo_input_resolution = yolo_input_resolution

    def process(self, input_image_raw):
        """Take an already-loaded PIL image and return it together with a pre-processed
        version required for feeding it into a YOLOv3 network.

        Keyword arguments:
        input_image_raw -- PIL Image of the frame to be processed
        """
        image_raw, image_resized = self._load_and_resize(input_image_raw)
        image_preprocessed = self._shuffle_and_normalize(image_resized)
        return image_raw, image_preprocessed

    def _load_and_resize(self, input_image_raw):
        """Resize the given PIL image to the input resolution.
        Return the input image before resizing as a PIL Image (required for visualization),
        and the resized image as a NumPy float array.

        Keyword arguments:
        input_image_raw -- PIL Image of the frame to be processed
        """

        image_raw = input_image_raw
        # Expecting yolo_input_resolution in (height, width) format, adjusting to PIL
        # convention (width, height) in PIL:
        new_resolution = (
            self.yolo_input_resolution[1],
            self.yolo_input_resolution[0])
        image_resized = image_raw.resize(
            new_resolution, resample=Image.BICUBIC)
        image_resized = np.array(image_resized, dtype=np.float32, order='C')
        return image_raw, image_resized

    def _shuffle_and_normalize(self, image):
        """Normalize a NumPy array representing an image to the range [0, 1], and
        convert it from HWC format ("channels last") to NCHW format ("channels first"
        with leading batch dimension).

        Keyword arguments:
        image -- image as three-dimensional NumPy float array, in HWC format
        """
        image /= 255.0
        # HWC to CHW format:
        image = np.transpose(image, [2, 0, 1])
        # CHW to NCHW format
        image = np.expand_dims(image, axis=0)
        # Convert the image to row-major order, also known as "C order":
        image = np.array(image, dtype=np.float32, order='C')
        return image


class PreprocessYOLO(object):
    """A simple class for loading images with PIL and reshaping them to the specified
    input resolution for YOLOv3-608.
    """

    def __init__(self, yolo_input_resolution):
        """Initialize with the input resolution for YOLOv3, which will stay fixed in this sample.

        Keyword arguments:
        yolo_input_resolution -- two-dimensional tuple with the target network's (spatial)
        input resolution in HW order
        """
        self.yolo_input_resolution = yolo_input_resolution

    def process(self, input_image_path):
        """Load an image from the specified input path,
        and return it together with a pre-processed version required for feeding it into a
        YOLOv3 network.

        Keyword arguments:
        input_image_path -- string path of the image to be loaded
        """
        image_raw, image_resized = self._load_and_resize(input_image_path)
        image_preprocessed = self._shuffle_and_normalize(image_resized)
        return image_raw, image_preprocessed

    def _load_and_resize(self, input_image_path):
        """Load an image from the specified path and resize it to the input resolution.
        Return the input image before resizing as a PIL Image (required for visualization),
        and the resized image as a NumPy float array.

        Keyword arguments:
        input_image_path -- string path of the image to be loaded
        """

        image_raw = Image.open(input_image_path)
        # Expecting yolo_input_resolution in (height, width) format, adjusting to PIL
        # convention (width, height) in PIL:
        new_resolution = (
            self.yolo_input_resolution[1],
            self.yolo_input_resolution[0])
        image_resized = image_raw.resize(
            new_resolution, resample=Image.BICUBIC)
        image_resized = np.array(image_resized, dtype=np.float32, order='C')
        return image_raw, image_resized

    def _shuffle_and_normalize(self, image):
        """Normalize a NumPy array representing an image to the range [0, 1], and
        convert it from HWC format ("channels last") to NCHW format ("channels first"
        with leading batch dimension).

        Keyword arguments:
        image -- image as three-dimensional NumPy float array, in HWC format
        """
        image /= 255.0
        # HWC to CHW format:
        image = np.transpose(image, [2, 0, 1])
        # CHW to NCHW format
        image = np.expand_dims(image, axis=0)
        # Convert the image to row-major order, also known as "C order":
        image = np.array(image, dtype=np.float32, order='C')
        return image
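# Usage sketch for PreprocessYOLO (dog.jpg is a hypothetical test image):
#   preprocessor = PreprocessYOLO((416, 416))
#   image_raw, image = preprocessor.process('dog.jpg')  # image: (1, 3, 416, 416) float32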


class PostprocessYOLO(object):
    """Class for post-processing the output tensors from YOLOv3 (two for tiny, three for the full network)."""

    def __init__(self,
                 yolo_masks,
                 yolo_anchors,
                 obj_threshold,
                 nms_threshold,
                 yolo_input_resolution):
        """Initialize with all values that will be kept when processing several frames.
        Assuming 3 outputs of the network in the case of (large) YOLOv3.

        Keyword arguments:
        yolo_masks -- a list of 3 three-dimensional tuples for the YOLO masks
        yolo_anchors -- a list of 9 two-dimensional tuples for the YOLO anchors
        object_threshold -- threshold for object coverage, float value between 0 and 1
        nms_threshold -- threshold for non-max suppression algorithm,
        float value between 0 and 1
        input_resolution_yolo -- two-dimensional tuple with the target network's (spatial)
        input resolution in HW order
        """
        self.masks = yolo_masks
        self.anchors = yolo_anchors
        self.object_threshold = obj_threshold
        self.nms_threshold = nms_threshold
        self.input_resolution_yolo = yolo_input_resolution

    def process(self, outputs, resolution_raw):
        """Take the YOLOv3 outputs generated from a TensorRT forward pass, post-process them
        and return a list of bounding boxes for detected object together with their category
        and their confidences in separate lists.

        Keyword arguments:
        outputs -- outputs from a TensorRT engine in NCHW format
        resolution_raw -- the original spatial resolution from the input PIL image in WH order
        """
        outputs_reshaped = list()
        for output in outputs:
            outputs_reshaped.append(self._reshape_output(output))

        boxes, categories, confidences = self._process_yolo_output(
            outputs_reshaped, resolution_raw)

        return boxes, categories, confidences

    def _reshape_output(self, output):
        """Reshape a TensorRT output from NCHW to NHWC format,
        and then return it in (height, width, 3, 4 + 1 + CATEGORY_NUM) dimensionality
        after further reshaping.

        Keyword argument:
        output -- an output from a TensorRT engine after inference
        """
        output = np.transpose(output, [0, 2, 3, 1])
        _, height, width, _ = output.shape
        dim1, dim2 = height, width
        dim3 = 3
        # Each anchor predicts 4 box coords + 1 objectness score + CATEGORY_NUM class scores:
        dim4 = (4 + 1 + CATEGORY_NUM)
        return np.reshape(output, (dim1, dim2, dim3, dim4))

    def _process_yolo_output(self, outputs_reshaped, resolution_raw):
        """Take in a list of three reshaped YOLO outputs in (height,width,3,85) shape and return
        return a list of bounding boxes for detected object together with their category and their
        confidences in separate lists.

        Keyword arguments:
        outputs_reshaped -- list of three reshaped YOLO outputs as NumPy arrays
        with shape (height,width,3,85)
        resolution_raw -- the original spatial resolution from the input PIL image in WH order
        """

        # E.g. in YOLOv3-608, there are three output tensors, which we associate with their
        # respective masks. Then we iterate through all output-mask pairs and generate candidates
        # for bounding boxes, their corresponding category predictions and their confidences:
        boxes, categories, confidences = list(), list(), list()
        for output, mask in zip(outputs_reshaped, self.masks):
            box, category, confidence = self._process_feats(output, mask)
            box, category, confidence = self._filter_boxes(box, category, confidence)
            boxes.append(box)
            categories.append(category)
            confidences.append(confidence)

        boxes = np.concatenate(boxes)
        categories = np.concatenate(categories)
        confidences = np.concatenate(confidences)

        # Scale boxes back to original image shape:
        width, height = resolution_raw
        image_dims = [width, height, width, height]
        boxes = boxes * image_dims

        # Using the candidates from the previous (loop) step, we apply the non-max suppression
        # algorithm that clusters adjacent bounding boxes to a single bounding box:
        nms_boxes, nms_categories, nscores = list(), list(), list()
        for category in set(categories):
            idxs = np.where(categories == category)
            box = boxes[idxs]
            category = categories[idxs]
            confidence = confidences[idxs]

            keep = self._nms_boxes(box, confidence)

            nms_boxes.append(box[keep])
            nms_categories.append(category[keep])
            nscores.append(confidence[keep])

        if not nms_categories and not nscores:
            return None, None, None

        boxes = np.concatenate(nms_boxes)
        categories = np.concatenate(nms_categories)
        confidences = np.concatenate(nscores)

        return boxes, categories, confidences

    def _process_feats(self, output_reshaped, mask):
        """Take in a reshaped YOLO output in height,width,3,85 format together with its
        corresponding YOLO mask and return the detected bounding boxes, the confidence,
        and the class probability in each cell/pixel.

        Keyword arguments:
        output_reshaped -- reshaped YOLO output as NumPy arrays with shape (height,width,3,85)
        mask -- 2-dimensional tuple with mask specification for this output
        """

        # Two in-line functions required for calculating the bounding box
        # descriptors:
        def sigmoid(value):
            """Return the sigmoid of the input."""
            return 1.0 / (1.0 + math.exp(-value))

        def exponential(value):
            """Return the exponential of the input."""
            return math.exp(value)

        # Vectorized calculation of above two functions:
        sigmoid_v = np.vectorize(sigmoid)
        exponential_v = np.vectorize(exponential)

        grid_h, grid_w, _, _ = output_reshaped.shape

        anchors = [self.anchors[i] for i in mask]

        # Reshape to N, height, width, num_anchors, box_params:
        anchors_tensor = np.reshape(anchors, [1, 1, len(anchors), 2])
        box_xy = sigmoid_v(output_reshaped[..., :2])
        box_wh = exponential_v(output_reshaped[..., 2:4]) * anchors_tensor
        box_confidence = sigmoid_v(output_reshaped[..., 4])

        box_confidence = np.expand_dims(box_confidence, axis=-1)
        box_class_probs = sigmoid_v(output_reshaped[..., 5:])

        col = np.tile(np.arange(0, grid_w), grid_w).reshape(-1, grid_w)
        row = np.tile(np.arange(0, grid_h).reshape(-1, 1), grid_h)

        col = col.reshape(grid_h, grid_w, 1, 1).repeat(3, axis=-2)
        row = row.reshape(grid_h, grid_w, 1, 1).repeat(3, axis=-2)
        grid = np.concatenate((col, row), axis=-1)

        box_xy += grid
        box_xy /= (grid_w, grid_h)
        box_wh /= self.input_resolution_yolo
        box_xy -= (box_wh / 2.)
        boxes = np.concatenate((box_xy, box_wh), axis=-1)

        # boxes: centroids, box_confidence: confidence level, box_class_probs:
        # class confidence
        return boxes, box_confidence, box_class_probs

    def _filter_boxes(self, boxes, box_confidences, box_class_probs):
        """Take in the unfiltered bounding box descriptors and discard each cell
        whose score is lower than the object threshold set during class initialization.

        Keyword arguments:
        boxes -- bounding box coordinates with shape (height,width,3,4); 4 for
        x,y,height,width coordinates of the boxes
        box_confidences -- bounding box confidences with shape (height,width,3,1); 1 for as
        confidence scalar per element
        box_class_probs -- class probabilities with shape (height,width,3,CATEGORY_NUM)

        """
        box_scores = box_confidences * box_class_probs
        box_classes = np.argmax(box_scores, axis=-1)
        box_class_scores = np.max(box_scores, axis=-1)
        pos = np.where(box_class_scores >= self.object_threshold)

        boxes = boxes[pos]
        classes = box_classes[pos]
        scores = box_class_scores[pos]

        return boxes, classes, scores

    def _nms_boxes(self, boxes, box_confidences):
        """Apply the Non-Maximum Suppression (NMS) algorithm on the bounding boxes with their
        confidence scores and return an array with the indexes of the bounding boxes we want to
        keep (and display later).

        Keyword arguments:
        boxes -- a NumPy array containing N bounding-box coordinates that survived filtering,
        with shape (N,4); 4 for x,y,height,width coordinates of the boxes
        box_confidences -- a Numpy array containing the corresponding confidences with shape N
        """
        x_coord = boxes[:, 0]
        y_coord = boxes[:, 1]
        width = boxes[:, 2]
        height = boxes[:, 3]

        areas = width * height
        ordered = box_confidences.argsort()[::-1]

        keep = list()
        while ordered.size > 0:
            # Index of the current element:
            i = ordered[0]
            keep.append(i)
            xx1 = np.maximum(x_coord[i], x_coord[ordered[1:]])
            yy1 = np.maximum(y_coord[i], y_coord[ordered[1:]])
            xx2 = np.minimum(x_coord[i] + width[i], x_coord[ordered[1:]] + width[ordered[1:]])
            yy2 = np.minimum(y_coord[i] + height[i], y_coord[ordered[1:]] + height[ordered[1:]])

            width1 = np.maximum(0.0, xx2 - xx1 + 1)
            height1 = np.maximum(0.0, yy2 - yy1 + 1)
            intersection = width1 * height1
            union = (areas[i] + areas[ordered[1:]] - intersection)

            # Compute the Intersection over Union (IoU) score:
            iou = intersection / union

            # The goal of the NMS algorithm is to reduce the number of adjacent bounding-box
            # candidates to a minimum. In this step, we keep only those elements whose overlap
            # with the current bounding box is lower than the threshold:
            indexes = np.where(iou <= self.nms_threshold)[0]
            ordered = ordered[indexes + 1]

        keep = np.array(keep)
        return keep

Inference code:

trt_inference.py

# -*-coding: utf-8-*-
# author: hxy
"""
Inference:
yolov3-tiny.trt
"""

import time
import cv2
from lib import common
import tensorrt as trt
from lib.data_processing import PreprocessYOLO, PostprocessYOLO, VideoPreprocessYOLO

TRT_LOGGER = trt.Logger()


# def load_model(trt_file_path):
#     with open(trt_file_path, 'rb') as f, trt.Runtime(TRT_LOGGER) as runtime:
#         print("Loading engine from: {}".format(trt_file_path.split('/')[-1]))
#         return runtime.deserialize_cuda_engine(f.read())

# Inference on Yolov3-tiny
class InferenceYolov3tiny(object):
    def __init__(self, engine):
        self.engine = engine
        # self.img_path = test_img_path
        self.input_size = (416, 416)
        self.postprocess_args = {"yolo_masks": [(3, 4, 5), (0, 1, 2)],
                                 "yolo_anchors": [(10, 14), (23, 27), (37, 58), (81, 82), (135, 169), (344, 319)],
                                 "obj_threshold": 0.5,
                                 "nms_threshold": 0.35,
                                 "yolo_input_resolution": (416, 416)}
        self.output_shape_416 = [(1, 24, 13, 13), (1, 24, 26, 26)]
        self.output_shape_480 = [(1, 24, 15, 15), (1, 24, 30, 30)]
        self.output_shape_544 = [(1, 24, 17, 17), (1, 24, 34, 34)]
        self.output_shape_608 = [(1, 24, 19, 19), (1, 24, 38, 38)]
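        # The 24 channels per scale = 3 anchors * (4 box coords + 1 objectness + 3 classes)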
        print("Image input size:{}".format(self.input_size))

    def preprocess_img(self, img, context):
        # preprocessor = PreprocessYOLO(self.input_size)  # still-image preprocessing
        preprocessor = VideoPreprocessYOLO(self.input_size)  # video-frame preprocessing
        image_raw, image = preprocessor.process(img)
        ori_img_wh = image_raw.size  # PIL .size is (width, height), the WH order PostprocessYOLO expects
        # Note: buffers are re-allocated on every call; hoisting this out of the
        # per-frame path would shave off a little more latency.
        inputs, outputs, bindings, stream = common.allocate_buffers(self.engine)
        inputs[0].host = image
        s = time.time()
        trt_outputs = common.do_inference(context=context,
                                          bindings=bindings,
                                          inputs=inputs,
                                          outputs=outputs,
                                          stream=stream)
        ts = time.time() - s
        trt_outputs = [output.reshape(shape) for output, shape in zip(trt_outputs, self.output_shape_416)]
        postprocessor = PostprocessYOLO(**self.postprocess_args)
        boxes, classes, scores = postprocessor.process(trt_outputs, ori_img_wh)
        print("Inference + post-processing cost: %.3f ms" % ((time.time() - s) * 1000))
        if boxes is None:
            return image_raw, [], [], [], ts
        else:
            return image_raw, boxes, classes, scores, ts

I tweaked this part myself. The ts value measures only the model's forward pass, which makes it easy to compute the FPS later.
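To make the arithmetic concrete: a forward pass of ts = 0.0025 s corresponds to 1 / 0.0025 = 400 FPS, which is the ballpark you will see in the result below.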

3. Accelerating model inference with TensorRT (video stream)

Straight to the code: trt_video_inference.py

# -*-coding: utf-8-*-
# author: hxy
"""
Video-stream inference with TensorRT
"""

import cv2
import time
import logging
from PIL import Image
import numpy as np
import tensorrt as trt
from lib.trt_inference import InferenceYolov3tiny

TRT_LOGGER = trt.Logger()


# logging setup
def log_set():
    logging.basicConfig(level=logging.INFO,
                        format='%(asctime)s - %(levelname)s - %(message)s')


# Look up the class name for a predicted class id
# (note: re-reads the file on every call; fine for a demo)
def get_names(names_file, classes_id):
    with open(names_file, 'r') as f:
        return f.read().splitlines()[classes_id]


# Load the serialized engine
def load_model(trt_file_path):
    with open(trt_file_path, 'rb') as f, trt.Runtime(TRT_LOGGER) as runtime:
        print("Loading engine: {}!".format(trt_file_path.split('/')[-1]))
        return runtime.deserialize_cuda_engine(f.read())


# Grab the video stream and run inference on it
def video_inference(rtsp, engine, context):
    logging.info("rtsp address of the video stream: {}".format(rtsp))
    # NOTE: hard-coded to a local test file; pass `rtsp` to VideoCapture for a live stream.
    cap = cv2.VideoCapture('test.mp4')
    fourcc = cv2.VideoWriter_fourcc(*'XVID')
    size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    out = cv2.VideoWriter('result_trt_inference.avi', fourcc, 10.0, size)
    while True:
        ret, frame = cap.read()
        if not ret:  # end of file or dropped stream
            break
        # OpenCV decodes frames as BGR; convert before wrapping as an RGB PIL image:
        frame = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        image_raw, boxes, classes_id, scores, ts = engine.preprocess_img(img=frame, context=context)
        fps = 1 / ts
        img = cv2.cvtColor(np.asarray(image_raw), cv2.COLOR_RGB2BGR)
        for cls_id, box, score in zip(classes_id, boxes, scores):
            name = get_names(names_file='names.txt',
                             classes_id=cls_id)
            logging.info("{}:{}%".format(name, int(score * 100)))
            x_coord, y_coord, width, height = box
            left = max(0, np.floor(x_coord + 0.5).astype(int))
            top = max(0, np.floor(y_coord + 0.5).astype(int))
            right = min(image_raw.width, np.floor(x_coord + width + 0.5).astype(int))
            bottom = min(image_raw.height, np.floor(y_coord + height + 0.5).astype(int))
            cv2.rectangle(img, (left - 4, top - 4), (right + 4, bottom + 4), (255, 0, 255), 1)
            cv2.putText(img, name, (left - 5, top - 8), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)
        # Overlay the FPS and write each frame exactly once, outside the per-box loop:
        cv2.putText(img, 'FPS: %.1f' % fps, (50, 100), cv2.FONT_HERSHEY_SIMPLEX, 2, (255, 255, 0), 2)
        out.write(img)
        img = cv2.resize(img, None, fx=.5, fy=.6)
        cv2.imshow("Inference-Results", img)

        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    cap.release()
    out.release()
    cv2.destroyAllWindows()


if __name__ == '__main__':
    log_set()
    with load_model('yolov3-tiny.trt') as engine, engine.create_execution_context() as context:
        inference_engine = InferenceYolov3tiny(engine=engine)
        # inference_engine = InferenceYolov3(engine=engine)
        video_inference(rtsp=" ",
                        engine=inference_engine,
                        context=context)

That's all of the inference code. One thing to pay attention to is the names.txt file: the model I tested with is a simple detector trained on only three classes, so everything here assumes those three (data_processing.py separately loads config/lxz.txt and asserts three classes, so keep the two label files consistent). Here are the contents of my names.txt:

car
pedestrian
face

Alright, with the code in place we can start running inference. After a good bit of pestering, my friend's selfie video gets pressed into service once again.

python trt_video_inference.py

When it finishes, the detection results on the test video are saved out to a file for later review. As for the results, here is another screenshot for you (featuring my friend's face once more):

When the FPS reading first came out above 400 I could hardly believe it, but the code is my own, so believe it I must. With no tricks involved, the raw forward pass on a single image really does take only about 2-3 ms, several times faster than the original Darknet or the ONNX runtime. It looks like TensorRT's acceleration really does deliver.
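If you want a steadier number than any single frame's reading, it is worth averaging the forward time over a couple hundred frames and discarding a short warm-up. A sketch, assuming inference_engine and context are set up exactly as in trt_video_inference.py above:

import cv2
from PIL import Image

cap = cv2.VideoCapture('test.mp4')
total, frames, warmup = 0.0, 0, 10
while frames < 200:
    ret, frame = cap.read()
    if not ret:
        break
    frame = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    _, _, _, _, ts = inference_engine.preprocess_img(img=frame, context=context)
    if frames >= warmup:  # skip the first frames while the GPU warms up
        total += ts
    frames += 1
cap.release()

measured = frames - warmup
if measured > 0:
    print("mean forward time: %.3f ms, FPS: %.1f" % (total / measured * 1000, measured / total))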

Summary

  • This post used TensorRT to accelerate inference for the yolov3-tiny network; judging by the FPS on the test mp4 file and the single-image forward time, it is fast, very fast!
  • Of course, not all of the code here is my own; take it as reference material and learn from it!!
  • Please bear with any shortcomings in this post, and do point out anything I got wrong so we can all learn and improve together!!
  • I hope this post is useful to you!!
