Framework | TensorRT Inference Server

Catalogue

  • Download
    • 1.1 download the trtis docker image from a domestic mirror
    • 1.2 download the trtis docker image from nvidia
  • Usage
    • 2.1 quick look
    • 2.2 deploy
    • 2.3 inspect
  • Example
    • 3.1 build a yolov3 model
      • 3.1.1 build in a docker image
      • 3.1.2 build in TensorRT 5.1.5.0 (python3)
      • 3.1.3 write a configuration file
    • 3.2 build a python client
  • Reference

Download

1.1 download the trtis docker image from a domestic mirror

  • For more details, see this blog post.1
docker pull registry.cn-beijing.aliyuncs.com/cloudhjc/tensorrtserver:server19.08
# or
docker pull registry.cn-hangzhou.aliyuncs.com/bostenai/tensorrtserver:19.04-py3

1.2 download the trtis docker image from nvidia

  • For more details, see this blog post.2
docker pull nvcr.io/nvidia/tensorrtserver:18.09-py3

Usage

2.1 quick look

nvidia-docker run -it --rm nvcr.io/nvidia/tensorrtserver:x.x-py3

2.2 deploy

nvidia-docker run --rm -p8000:8000 -p8001:8001 -v/path/to/examples/models:/models nvcr.io/nvidia/tensorrtserver:x.x-py3 trtserver --model-store=/models
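
The --model-store directory must follow the TRTIS model repository layout: one directory per model, containing a config.pbtxt and one numbered subdirectory per version. A minimal sketch of the layout for the yolov3 example built below (paths are illustrative):

/path/to/examples/models
└── yolov3_608_trt
    ├── config.pbtxt        # written in section 3.1.3
    └── 1                   # version 1 of the model
        └── model.plan      # the serialized TensorRT engine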

2.3 inspect

  • Visit host ip:port/api/status to inspect the state of each model, such as ready_state. Warning: if ready_state == MODEL_UNAVAILABLE, TRTIS will not serve that model.
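
A minimal sketch of checking the status endpoint from Python (assumptions: the server from the deploy step listens on localhost:8000, and the requests package is installed):

# check_status.py -- poll the TRTIS status API
import requests

# server-wide status; append a model name (e.g. /api/status/yolov3_608_trt)
# to query a single model
resp = requests.get("http://localhost:8000/api/status")
resp.raise_for_status()

# the body is a text-format protobuf; a model is usable only when its
# ready_state is MODEL_READY
print(resp.text)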

Example

3.1 build a yolov3 model

  • Warning
    • if you get an error when running yolov3_to_onnx.py, try downgrading onnx to 1.4.1:
    pip uninstall onnx
    pip install onnx==1.4.1
    
    • there are two ways to build the yolov3 TensorRT engine: inside a docker image, or locally under /usr/local/TensorRT-5.1.5.0/samples/python/yolov3_onnx; the latter is easier.

3.1.1 build in a docker image

  • follow the steps below to build a yolov3 TensorRT engine
  • start a tensorrt container
docker run \
       -v $PWD/trt:/workspace/trt \
       --name trt \
       -ti nvcr.io/nvidia/tensorrt:19.10-py2 /bin/bash
  • build yolov3 model
# inside container trt
export TRT_PATH=/usr/src/tensorrt
cd $TRT_PATH/samples/python/yolov3_onnx/;

pip install wget
pip install onnx==1.5.0

# automatically downloads the yolov3 weights and converts them to ONNX
python yolov3_to_onnx.py;

# build the engine with trtexec
cd $TRT_PATH/samples/trtexec; 
make; cd ../../; 
./bin/trtexec --onnx=$TRT_PATH/samples/python/yolov3_onnx/yolov3.onnx --saveEngine=$TRT_PATH/model.plan 
# Average over 10 runs is 30.8623 ms (host walltime is 31.4395 ms, 99% percentile time is 31.9949)

  • copy model
# at your host
mkdir -p $model_path/yolov3_608_trt/1
docker cp trt:/usr/src/tensorrt/model.plan $model_path/yolov3_608_trt/1

3.1.2 build in TensorRT 5.1.5.0 (python3)

  • move to TensorRT-path/samples/python/yolov3_onnx/
  • pip install wget
  • pip install onnx==1.4.1
  • python yolov3_to_onnx.py
# build the engine with trtexec
export TRT_PATH=/usr/local/TensorRT-5.1.5.0
cd $TRT_PATH/samples/trtexec; 
make; cd ../../; 
./bin/trtexec --onnx=$TRT_PATH/samples/python/yolov3_onnx/yolov3.onnx --saveEngine=$TRT_PATH/model.plan 
# Average over 10 runs is 30.8623 ms (host walltime is 31.4395 ms, 99% percentile time is 31.9949)
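
As an alternative to trtexec, the engine can also be built with the TensorRT python API. A minimal sketch (assumptions: the tensorrt python package from TensorRT 5.1.5.0 is importable, and yolov3.onnx was produced by the step above):

# build_engine.py -- build a serialized engine from yolov3.onnx
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(onnx_path, plan_path):
    with trt.Builder(TRT_LOGGER) as builder, \
         builder.create_network() as network, \
         trt.OnnxParser(network, TRT_LOGGER) as parser:
        builder.max_batch_size = 1            # matches max_batch_size in config.pbtxt
        builder.max_workspace_size = 1 << 30  # 1 GiB of build scratch space
        with open(onnx_path, "rb") as f:
            if not parser.parse(f.read()):
                raise RuntimeError(str(parser.get_error(0)))
        engine = builder.build_cuda_engine(network)
        with open(plan_path, "wb") as f:
            f.write(engine.serialize())

build_engine("yolov3.onnx", "model.plan")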

3.1.3 write a configuration file

# $model_path/yolov3_608_trt/config.pbtxt
name: "yolov3_608_trt"
platform: "tensorrt_plan"
max_batch_size: 1
dynamic_batching {
  preferred_batch_size: [1]
  max_queue_delay_microseconds: 100
}
input [
  {
    name: "000_net"
    data_type: TYPE_FP32
    format: FORMAT_NCHW
    dims: [ 3, 608, 608 ]
  }
]
output [
  {
    name: "082_convolutional"
    data_type: TYPE_FP32
    dims: [ 255, 19, 19 ]
  },
  {
    name: "094_convolutional"
    data_type: TYPE_FP32
    dims: [ 255, 38, 38 ]
  },
  {
    name: "106_convolutional"
    data_type: TYPE_FP32
    dims: [ 255, 76, 76 ]
  }
]
instance_group [
  {
    count: 2
    kind: KIND_GPU
  }
]
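
The output dims above follow from YOLOv3's architecture: each detection head predicts 3 anchors × (4 box coordinates + 1 objectness score + 80 COCO classes) = 255 channels on a grid of input_size / stride. A quick sanity check in python:

# sanity-check the output dims in config.pbtxt
anchors, box, obj, classes = 3, 4, 1, 80
channels = anchors * (box + obj + classes)  # 3 * 85 = 255

for stride in (32, 16, 8):                  # strides of the three yolov3 heads
    grid = 608 // stride                    # 19, 38, 76
    print("dims: [ {}, {}, {} ]".format(channels, grid, grid))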

3.2 build a python client

  • reference
# download client python library
# https://github.com/NVIDIA/tensorrt-inference-server/releases
# for example:
# wget https://github.com/NVIDIA/tensorrt-inference-server/releases/download/v1.7.0/v1.7.0_ubuntu1604.clients.tar.gz;
# tar xvzf v1.7.0_ubuntu1604.clients.tar.gz;

apt-get install curl libcurl4-openssl-dev
apt-get install python python-pip
pip install --user --upgrade python/tensorrtserver*.whl numpy pillow
python image_client.py -m yolov3_608_trt ~/mayday.jpg
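
image_client.py ships with the client examples and targets classification models; for the yolov3 plan above, a raw request looks roughly like the following sketch (assumptions: the v1 InferContext API from the tensorrtserver wheel installed above; image preprocessing and box decoding are omitted):

# yolov3_client.py -- minimal raw inference request against yolov3_608_trt
import numpy as np
from tensorrtserver.api import InferContext, ProtocolType

# connect to the server started in the deploy step (HTTP on port 8000)
ctx = InferContext("localhost:8000", ProtocolType.HTTP, "yolov3_608_trt")

# dummy NCHW input matching config.pbtxt; replace with a real image
# resized to 608x608 and normalized as yolov3 expects
img = np.zeros((3, 608, 608), dtype=np.float32)

result = ctx.run(
    {"000_net": (img,)},
    {"082_convolutional": InferContext.ResultFormat.RAW,
     "094_convolutional": InferContext.ResultFormat.RAW,
     "106_convolutional": InferContext.ResultFormat.RAW},
    batch_size=1)

for name, batches in result.items():
    print("{}: {}".format(name, batches[0].shape))  # e.g. (255, 19, 19)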

Reference


  1. CSDN blog

  2. NVIDIA TRTIS blog
