NVIDIA Triton Server quick-start notes

0. Environment

1) Ubuntu 20.04
2) Docker
3) CUDA 11.5
4) JetPack 4.6.1 (Jetson)
5) T4 GPU and driver

1. Quickstart:

1)NVIDIA Container Toolkit

curl https://get.docker.com | sh \
  && sudo systemctl --now enable docker

distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
      && curl -s -L https://nvidia.github.io/libnvidia-container/gpgkey | sudo apt-key add - \
      && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# optional, for experimental releases:
curl -s -L https://nvidia.github.io/libnvidia-container/experimental/$distribution/libnvidia-container.list | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

If "Unsupported distribution!" appears, set distribution=ubuntu18.04; the repository did not yet have an entry for 20.04.
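A sketch of that fallback, assuming the ubuntu18.04 list also works on 20.04:

distribution=ubuntu18.04
curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list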
      
sudo apt-get update
sudo apt-get install -y nvidia-docker2
sudo systemctl restart docker
sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi

2)server code

git clone https://github.com/triton-inference-server/server.git
cd server/docs/examples
./fetch_models.sh
model_repository=$(pwd)/model_repository

3)server docker 

docker pull nvcr.io/nvidia/tritonserver:22.03-py3
docker run --gpus=1 --rm -p8000:8000 -p8001:8001 -p8002:8002 -v${model_repository}:/models nvcr.io/nvidia/tritonserver:22.03-py3 tritonserver --model-repository=/models
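Ports 8000/8001/8002 are Triton's HTTP/REST, gRPC, and Prometheus-metrics endpoints respectively. A quick check against the metrics port (assuming default ports):

curl -s localhost:8002/metrics | head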

4)test health

curl -v localhost:8000/v2/health/ready
Expected output: HTTP/1.1 200 OK
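Besides the readiness probe, the v2 API serves server metadata (name, version, extensions), useful as a further sanity check:

curl -s localhost:8000/v2 | python3 -m json.tool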

5)client examples

docker pull nvcr.io/nvidia/tritonserver:22.03-py3-sdk
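The example clients live inside this SDK image; a sketch of starting it with host networking so it can reach the server launched above (as the quickstart does):

docker run -it --rm --net=host nvcr.io/nvidia/tritonserver:22.03-py3-sdk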
/workspace/install/bin/image_client -m densenet_onnx -c 3 -s INCEPTION /workspace/images/mug.jpg
Prints the classification results (top 3 classes for mug.jpg).

2. Model repository

1)model management
  Model control modes:
    NONE      (default)
    POLL      --model-control-mode=poll --repository-poll-secs=100
    EXPLICIT  supports the model control protocol over HTTP/REST and gRPC (examples below)

  tritonserver --model-repository=<model-repository-path> --model-control-mode=none
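A sketch of the poll and explicit modes in practice; the load/unload endpoints are part of Triton's model repository extension, and densenet_onnx is just the example model from above:

# poll: Triton rescans the repository every 100 s
tritonserver --model-repository=/models --model-control-mode=poll --repository-poll-secs=100

# explicit: nothing is loaded at startup unless requested; load/unload over HTTP
tritonserver --model-repository=/models --model-control-mode=explicit
curl -X POST localhost:8000/v2/repository/models/densenet_onnx/load
curl -X POST localhost:8000/v2/repository/models/densenet_onnx/unload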
  
2)repository layout:
  <model-repository-path>/
    <model-name>/
      [config.pbtxt]
      [<output-labels-file> ...]
      <version>/
        <model-definition-file>
      <version>/
        <model-definition-file>
      ...
    <model-name>/
      [config.pbtxt]
      [<output-labels-file> ...]
      <version>/
        <model-definition-file>
      <version>/
        <model-definition-file>
      ...
    ...

    Version subdirectories are numerically named; only names greater than 0 count as valid versions.

    eg: TensorRT model
      <model-repository-path>/
        <model-name>/
          config.pbtxt
          1/
            model.plan

    eg: ONNX model
      <model-repository-path>/
        <model-name>/
          config.pbtxt
          1/
            model.onnx

    eg: Python model
      <model-repository-path>/
        <model-name>/
          config.pbtxt
          1/
            model.py
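A minimal sketch of laying out a repository for one ONNX model (my_model and the source file path are placeholders):

mkdir -p model_repository/my_model/1
cp /path/to/my_model.onnx model_repository/my_model/1/model.onnx
# config.pbtxt may be omitted for ONNX if the server runs with --strict-model-config=false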

3)Model Configuration
    config.pbtxt
    curl localhost:8000/v2/models/<model-name>/config

    max_batch_size > 0: the full shape is formed as [ -1 ] + dims
    max_batch_size == 0: the full shape is formed as dims

    Auto-Generated Model Configuration:
    --strict-model-config=false
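A minimal hand-written config.pbtxt, sketched for a hypothetical ONNX model; the tensor names and shapes here are illustrative assumptions, so compare against what curl .../config reports for your model:

cat > model_repository/my_model/config.pbtxt <<'EOF'
name: "my_model"
platform: "onnxruntime_onnx"
max_batch_size: 8
input [
  {
    name: "input__0"          # hypothetical input tensor name
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]     # per-request dims; batch dim implied since max_batch_size > 0
  }
]
output [
  {
    name: "output__0"         # hypothetical output tensor name
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
EOF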
    
    
