Deploying an Image Classification Model with the PyTorch C++ API

   When deploying a PyTorch model, the C++ API (LibTorch) offers better runtime efficiency than Python. This post records the process of deploying an image classification model with the C++ API.

1. Model Conversion

   First, the PyTorch model needs to be converted to Torch Script. Torch Script is a representation of a PyTorch model that the Torch Script compiler can understand, compile, and serialize. There are two ways to turn a PyTorch model into one readable through the C++ interface: Tracing and Annotation. Tracing is simpler than Annotation, but it is only suitable for networks with a fixed structure, i.e. no control flow in forward, because Tracing only records the path that is actually executed at run time. If the forward function contains control flow, the Annotation approach is required.
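
   For reference, the Annotation approach uses torch.jit.script to compile the control flow itself instead of recording a single execution path. The sketch below is illustrative only and not from the original post; GatedModel is a hypothetical module with a data-dependent branch.

import torch

class GatedModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(8, 2)

    def forward(self, x):
        # data-dependent branch: tracing would record only one of the two paths
        if x.sum() > 0:
            x = torch.relu(x)
        return self.fc(x)

scripted_module = torch.jit.script(GatedModel())  # the if-statement is preserved in the compiled graph
scripted_module.save("gated_model.pt")
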
This post uses Tracing for the conversion; as the name suggests, tracing runs the model once and records the path the data takes through the computation.

import torch
model = torch.load('./weights/best_resnet.pkl', map_location="cuda:0")
model.cuda()
model.eval()  # inference mode so dropout/batchnorm behave deterministically during tracing
# Use torch.jit.trace with an example input to generate a torch.jit.ScriptModule
x = torch.rand(1, 3, 224, 224)
x = x.cuda()  # very important: the example input must be on the same device as the model
traced_script_module = torch.jit.trace(model, x)
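
   A quick sanity check, not in the original post, is to compare the traced module's output against the original model on the same example input before exporting; the tolerance below is an arbitrary choice.

with torch.no_grad():
    ref = model(x)
    out = traced_script_module(x)
print(torch.allclose(ref, out, atol=1e-5))  # should print True if tracing captured the model faithfully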

2. Serializing the Script Module

   The ScriptModule has to be serialized before the model can be read in C++, and this process does not require any Python dependency.

traced_script_module.save("resnet.pt")

   The resulting .pt file is the converted model; it can be used directly in a C++ environment without depending on any Python environment.
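
   Before switching to C++, the exported file can also be loaded back in Python with torch.jit.load to confirm it is self-contained; this round-trip check is an addition, not part of the original steps.

loaded_module = torch.jit.load("resnet.pt")  # restores the ScriptModule without the original model class
loaded_module = loaded_module.cuda().eval()
with torch.no_grad():
    print(loaded_module(x).shape)  # same output shape as the original model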

3. Loading the Script Module in C++

   Use torch::jit::load() to load the model.

#include <torch/script.h> // One-stop header.
#include <iostream>
#include <memory>

int main(int argc, const char* argv[]) {
    if (argc != 2) {
        std::cerr << "usage: example-app <path-to-exported-script-module>\n";
        return -1;
    }
    // Deserialize the ScriptModule from a file using torch::jit::load().
    torch::jit::script::Module module;
    try {
        module = torch::jit::load(argv[1]);
    }
    catch (const c10::Error& e) {
        std::cerr << "error loading the model\n";
        return -1;
    }
    std::cout << "ok\n";
}

4. Complete Inference Example

#include <iostream>
#include "torch/script.h"
#include "torch/torch.h"
#include "opencv2/core.hpp"
#include "opencv2/imgproc.hpp"
#include "opencv2/highgui.hpp"
#include <memory>
#include <string>
#include <vector>
using namespace std;
using namespace cv;


int main()
{
   // load model
    // select GPU when available, otherwise fall back to CPU
    torch::DeviceType device_type = torch::kCPU;
    if (torch::cuda::is_available())
    {
        device_type = torch::kCUDA;
    }
    torch::Device device(device_type);
    torch::jit::script::Module module = torch::jit::load("./resnet.pt");
    module.to(device);
    std::cout<<"load model success"<<std::endl;
    double time0=static_cast<double>(getTickCount());
    for (int k=0; k<1000; k++){
        Mat img = imread("xxxxxx/video_down/wangzhe/240.jpg");
        int img_size = 224;
        Mat img_resized;
        resize(img, img_resized, Size(img_size, img_size));
        cvtColor(img_resized, img_resized, COLOR_BGR2RGB);  // imread returns BGR; convert to RGB so the ImageNet mean/std below apply to the right channels (assumes the model was trained on RGB input)
        Mat img_float;
        img_resized.convertTo(img_float, CV_32F, 1.0f / 255.0f);  // scale pixel values to [0, 1]
        auto tensor_image = torch::from_blob(img_float.data, {1, img_size, img_size, 3}, torch::kFloat32);  // wrap the HWC image buffer as a 1xHxWx3 tensor
        tensor_image = tensor_image.permute({0, 3, 1, 2});  // reorder NHWC to NCHW
        // per-channel normalization with the same mean/std as at training time (ImageNet statistics)
        tensor_image[0][0] = tensor_image[0][0].sub_(0.485).div_(0.229);
        tensor_image[0][1] = tensor_image[0][1].sub_(0.456).div_(0.224);
        tensor_image[0][2] = tensor_image[0][2].sub_(0.406).div_(0.225);
        tensor_image = tensor_image.to(device);  // move the tensor to the selected device (GPU when available)
        torch::Tensor out_tensor = module.forward({tensor_image}).toTensor();  // forward pass
        auto results = out_tensor.sort(-1, true);
        auto softmaxs = std::get<0>(results)[0].softmax(0);
        auto indexs = std::get<1>(results)[0];
        auto idx = indexs[0].item<int>();
        string labels[2] = {"normal", "pk"};    
        string label = labels[idx];
        float confidence = softmaxs[0].item<float>() * 100.0f;
        cout<<"label:"<<label<<"   confidence:"<<confidence<<endl;
    }
    time0=((double)getTickCount()-time0)/getTickFrequency();
    cout << "time consume: " << time0 << endl;
    return 0;
}

Compared with running inference in Python, the slightly tedious part is that the data preprocessing has to be written by hand, and it must stay consistent with the preprocessing used at training time; in Python, calling the transform pipeline handles all of this in one step.
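
   For comparison, the hand-written preprocessing above corresponds to a torchvision pipeline along the following lines; the exact transforms used during training are not given in this post, so this is an assumed sketch built around the standard ImageNet statistics that appear in the C++ code.

from torchvision import transforms

# assumed training-time preprocessing mirrored by the C++ code above
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),                    # same target size as the C++ resize
    transforms.ToTensor(),                            # HWC uint8 [0, 255] -> CHW float [0, 1]
    transforms.Normalize(mean=[0.485, 0.456, 0.406],  # same per-channel mean/std as the C++ code
                         std=[0.229, 0.224, 0.225]),
])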

5. Writing the CMakeLists.txt

cmake_minimum_required(VERSION 3.2 FATAL_ERROR)
project(Classify_cpp)

# Set the OpenCV CMake path
set(OpenCV_DIR /usr/local/share/OpenCV)
find_package (OpenCV REQUIRED NO_CMAKE_FIND_ROOT_PATH)
if(OpenCV_FOUND)
    INCLUDE_DIRECTORIES(${OpenCV_INCLUDE_DIRS})
    message(STATUS "OpenCV library status:")
    message(STATUS "    version: ${OpenCV_VERSION}")
    message(STATUS "    libraries: ${OpenCV_LIBS}")
    message(STATUS "    include path: ${OpenCV_INCLUDE_DIRS}")
endif()
set(CMAKE_PREFIX_PATH xxx/anaconda3/envs/pytorch2/lib/python3.6/site-packages/torch)
find_package(Torch REQUIRED)

# Set the compiler
SET(CMAKE_CXX_COMPILER g++)
if(CMAKE_COMPILER_IS_GNUCXX)
    add_compile_options(-std=c++11 -fno-stack-protector) # important on the TK1; without it the program aborts with "stack smashing detected"
    message(STATUS "optional:-std=c++11")
endif(CMAKE_COMPILER_IS_GNUCXX)

add_executable(${PROJECT_NAME} classify.cpp)
target_link_libraries(${PROJECT_NAME} ${TORCH_LIBRARIES} ${OpenCV_LIBS})
SET(CMAKE_BUILD_TYPE DEBUG)

6. Build & Run

mkdir build
cd build
cmake ..
make

Alternatively, the CMakeLists.txt file can be opened in Qt Creator, which makes it easier to debug and modify the code. Note in particular that the line "SET(CMAKE_BUILD_TYPE DEBUG)" added at the end of CMakeLists.txt is required for debugging the code in Qt Creator.

7. Results

   As a test, a ResNet50 binary classifier was run on 1000 images: the Python API took 18 seconds and the C++ API took 15 seconds, a saving of about 3 seconds (roughly 17%), which makes the C++ API that much more attractive for deployment.

github: Image_Classify_cpp
