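ONNX Runtime is a cross-platform, high-performance engine for deploying ONNX models to production. It applies a large number of graph optimizations to the model graph, then partitions the graph into subgraphs according to the hardware-specific accelerators that are available (the subgraphs can be processed in parallel).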
ONNX official website: https://onnx.ai/
ONNX GitHub repository: https://github.com/onnx/onnx
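The partitioning step described above can be pictured with a toy example. This is only a minimal sketch, not ONNX Runtime's actual algorithm or API: the `Node` and `Subgraph` types, the `partition` function, and the "accelerator" / "cpu" labels are all hypothetical names introduced for illustration. It assumes the nodes are already in execution order and simply starts a new subgraph whenever the backend able to run the current operator changes.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a graph node: only the operator type matters here.
struct Node {
    std::string opType;
};

// A contiguous run of nodes assigned to one backend.
struct Subgraph {
    std::string provider;          // "accelerator" or "cpu" (illustrative labels)
    std::vector<std::string> ops;  // operator types in this subgraph
};

// Walk the ordered node list and open a new subgraph whenever the backend
// that supports the current operator differs from the previous one.
std::vector<Subgraph> partition(const std::vector<Node>& nodes,
                                const std::set<std::string>& acceleratorOps) {
    std::vector<Subgraph> parts;
    for (const Node& n : nodes) {
        const std::string provider =
            acceleratorOps.count(n.opType) ? "accelerator" : "cpu";
        if (parts.empty() || parts.back().provider != provider) {
            parts.push_back(Subgraph{provider, {}});
        }
        parts.back().ops.push_back(n.opType);
    }
    return parts;
}
```

For the node sequence Conv, Relu, NonMaxSuppression, Conv with an accelerator that supports only Conv and Relu, this sketch yields three subgraphs: accelerator (Conv, Relu), cpu (NonMaxSuppression), accelerator (Conv).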
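TensorRT is a C++ library for accelerating inference on NVIDIA GPUs. It provides low-latency, high-throughput inference deployment for deep learning applications, supports neural networks trained with frameworks such as TensorFlow, PyTorch, Caffe2, and Paddle, and optimizes the network computation.
TensorRT official download page: https://developer.nvidia.com/zh-cn/tensorrt
Developer guide: https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html
GitHub repository: https://github.com/NVIDIA/TensorRT
TensorRT advantages: the fastest inference speed on GPUs. Disadvantages: it may not work with some combinations of graphics card and CUDA version.
ONNX Runtime advantages: highly general, reasonably fast, and easy to replicate across platforms.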
Source code: https://github.com/itsnine/yolov5-onnxruntime
C++ YOLO v5 ONNX Runtime inference code for object detection.
Include directories: D:\onnxruntime-win-x64-gpu-1.9.0\include
Library directories: D:\onnxruntime-win-x64-gpu-1.9.0\lib
Linker input:
onnxruntime.lib
onnxruntime_providers_cuda.lib
onnxruntime_providers_shared.lib
In the ultralytics/yolov5 repository (YOLOv5 in PyTorch > ONNX > CoreML > TFLite), export the model to ONNX:
python export.py --weights weights/yolov5s.pt --include onnx --device 0
#include <iostream>
#include <opencv2/opencv.hpp>
#include "cmdline.h"
#include "utils.h"
#include "detector.h"
int main(int argc, char* argv[])
{
    // Detection thresholds: minimum confidence and IoU limit for non-maximum suppression.
    const float confThreshold = 0.3f;
    const float iouThreshold = 0.4f;
    bool isGPU = true;

    const std::string classNamesPath = "coco.names";
    const std::vector<std::string> classNames = utils::loadNames(classNamesPath);
    const std::string imagePath = "bus.jpg";
    const std::string modelPath = "yolov5s.onnx";

    if (classNames.empty())
    {
        std::cerr << "Error: Empty class names file." << std::endl;
        return -1;
    }

    YOLODetector detector {nullptr};
    cv::Mat image;
    std::vector<Detection> result;

    try
    {
        // Load the ONNX model with a 640x640 input size.
        detector = YOLODetector(modelPath, isGPU, cv::Size(640, 640));
        std::cout << "Model was initialized." << std::endl;

        image = cv::imread(imagePath);
        if (image.empty())
        {
            // cv::imread returns an empty Mat when the file cannot be read.
            std::cerr << "Error: Could not read image " << imagePath << std::endl;
            return -1;
        }
        result = detector.detect(image, confThreshold, iouThreshold);
    }
    catch(const std::exception& e)
    {
        std::cerr << e.what() << std::endl;
        return -1;
    }

    // Draw the detections on the image and show it until a key is pressed.
    utils::visualizeDetection(image, result, classNames);
    cv::imshow("result", image);
    // cv::imwrite("result.jpg", image);
    cv::waitKey(0);
    return 0;
}
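To make the roles of confThreshold and iouThreshold in the code above more concrete, here is a minimal sketch of confidence filtering followed by greedy non-maximum suppression. It is an illustration under stated assumptions, not the detector's own implementation, which may differ: the `Box` struct and the `iou` and `filterAndSuppress` functions are hypothetical names, and boxes are assumed to be axis-aligned with a top-left corner plus width and height.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical box type for illustration: top-left corner, size, and confidence score.
struct Box {
    float x, y, w, h;
    float conf;
};

// Intersection over union of two axis-aligned boxes.
float iou(const Box& a, const Box& b) {
    const float x1 = std::max(a.x, b.x);
    const float y1 = std::max(a.y, b.y);
    const float x2 = std::min(a.x + a.w, b.x + b.w);
    const float y2 = std::min(a.y + a.h, b.y + b.h);
    const float inter = std::max(0.0f, x2 - x1) * std::max(0.0f, y2 - y1);
    const float uni = a.w * a.h + b.w * b.h - inter;
    return uni > 0.0f ? inter / uni : 0.0f;
}

// Drop boxes below confThreshold, then visit the rest from highest to lowest
// score, keeping a box only if it does not overlap an already kept box by
// more than iouThreshold.
std::vector<Box> filterAndSuppress(std::vector<Box> boxes,
                                   float confThreshold, float iouThreshold) {
    boxes.erase(std::remove_if(boxes.begin(), boxes.end(),
                               [&](const Box& b) { return b.conf < confThreshold; }),
                boxes.end());
    std::sort(boxes.begin(), boxes.end(),
              [](const Box& a, const Box& b) { return a.conf > b.conf; });

    std::vector<Box> kept;
    for (const Box& candidate : boxes) {
        bool overlaps = false;
        for (const Box& k : kept) {
            if (iou(candidate, k) > iouThreshold) { overlaps = true; break; }
        }
        if (!overlaps) kept.push_back(candidate);
    }
    return kept;
}
```

With the thresholds from main (0.3 and 0.4), a 100x100 box at (0, 0) and another at (10, 10) have an IoU of about 0.68, so only the higher-scoring one survives, while any box with a confidence below 0.3 is discarded before suppression starts.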