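OpenVINO User Guide

This document is based on openvino_2022.2.

I. Introduction

The OpenVINO™ toolkit is a comprehensive toolkit for quickly developing applications and solutions for a wide range of tasks, including emulation of human vision, automatic speech recognition, natural language processing, and recommendation systems.

It was first released in 2018, is open source, and is free for commercial use.

OpenVINO™ 2022.2 features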

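1. OpenVINO™ toolkit

  • Enables CNN-based deep learning inference at the edge
  • Supports heterogeneous execution across Intel® CPUs, Intel® integrated graphics, Intel® Neural Compute Stick 2, and Intel® Vision Accelerator Design with Intel® Movidius™ VPUs
  • Speeds time to market with an easy-to-use library of computer vision functions and pre-optimized kernels
  • Includes optimized calls for computer vision standards, including OpenCV* and OpenCL™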

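2. OpenVINO™ toolkit components

  • Deep Learning Model Optimizer

    • A cross-platform command-line tool that imports models produced by mainstream deep learning frameworks and tools such as TensorFlow, PyTorch, Caffe, MXNet, and ONNX. The Model Optimizer converts and optimizes the imported model and exports it as intermediate-format (IR) files.
  • Deep Learning Inference Engine

    • A unified set of C++/Python APIs for high-performance inference on many hardware types, including Intel® CPUs, Intel® integrated graphics, Intel® Neural Compute Stick 2, and Intel® Vision Accelerator Design with Intel® Movidius™ Vision Processing Units (VPUs).
  • Inference Engine samples: a set of simple console applications that demonstrate how to use the Inference Engine in your applications.

  • Deep Learning Workbench (DL Workbench)

    • A web-based graphical environment and the official OpenVINO™ GUI, designed to make it easier to produce pretrained deep learning models for computer vision and natural language processing.
  • Post-training Optimization Tool: calibrates a model and then executes it at INT8 precision.

  • Additional tools: a set of tools for working with models, including Benchmark App, Cross Check Tool, and Compile Tool.

  • Open Model Zoo

    • Deep learning solutions for a variety of vision problems, including object recognition, face recognition, pose estimation, text detection, and action recognition
    • Additional tools: a set of tools for working with models, including Accuracy Checker Utility and Model Downloader.
    • Pretrained model documentation: documentation for the pretrained models available in the Open Model Zoo GitHub repository.
      • Sample input videos for models
    • TensorFlow pretrained model library
  • Deep Learning Streamer (DL Streamer): a streaming analytics framework based on GStreamer for constructing graphs of media analytics components. DL Streamer can be installed with the Intel® Distribution of OpenVINO™ toolkit installer.

  • OpenCV: the OpenCV community version compiled for Intel® hardware

  • Intel® Media SDK (only in the Intel® Distribution of OpenVINO™ toolkit for Linux)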

3. OpenVINO™ toolkit workflow

  • Supported deployment devices
    • Intel® CPU (e.g. Intel® Core™ i7-1165G7)
    • dGPU, discrete graphics (e.g. Intel® Iris® Xe MAX)
    • iGPU, integrated graphics (e.g. Intel® UHD Graphics 620)
    • Intel® Movidius™ Myriad™ X VPU (e.g. Intel® Neural Compute Stick 2 (Intel® NCS2))
    • GNA (Gaussian & Neural Accelerator integrated into the processor): designed for AI speech and audio applications such as neural noise cancellation.

II. Installing OpenVINO Components

1. Environment requirements

  • Operating systems
    • Ubuntu 18.04 long-term support (LTS), 64-bit
    • Ubuntu 20.04 long-term support (LTS), 64-bit
  • Hardware
    • 6th to 12th generation Intel® Core™ processors and Intel® Xeon® processors
    • 3rd generation Intel® Xeon® Scalable processor (formerly code named Cooper Lake)
    • Intel® Xeon® Scalable processor (formerly Skylake and Cascade Lake)
    • Intel Atom® processor with support for Intel® Streaming SIMD Extensions 4.1 (Intel® SSE4.1)
    • Intel Pentium® processor N4200/5, N3350/5, or N3450/5 with Intel® HD Graphics
    • Intel® Iris® Xe MAX Graphics
    • Intel® Neural Compute Stick 2
    • Intel® Vision Accelerator Design with Intel® Movidius™ VPUs

2. Download and installation

On the Intel® Distribution of OpenVINO™ Toolkit download page, choose either OpenVINO Development Tools or OpenVINO Runtime.

1) Installing OpenVINO Development Tools with pip

Installing OpenVINO Development Tools also installs OpenVINO Runtime.

# Step 1: Create and activate virtual environment
python3 -m venv openvino_env
source openvino_env/bin/activate
# Step 2: Upgrade pip to latest version
python -m pip install --upgrade pip
# Step 3: Download and install the package
pip install openvino-dev[ONNX,tensorflow2,mxnet,kaldi,caffe,pytorch]==2022.2.0

# An openvino_env folder now exists in the current directory
$ tree openvino_env/ -L 2
openvino_env/
├── bin
│   ├── accuracy_check
│   ├── activate
│   ├── activate.csh
│   ├── activate.fish
│   ├── Activate.ps1
│   ├── backend-test-tools
│   ├── benchmark_app       # Model benchmarking
│   ├── check-model
│   ├── check-node
│   ├── convert_annotation
│   ├── convert-caffe2-to-onnx
│   ├── convert-onnx-to-caffe2
│   ├── cpuinfo
│   ├── easy_install
│   ├── easy_install-3.8
│   ├── estimator_ckpt_converter
│   ├── f2py
│   ├── f2py3
│   ├── f2py3.8
│   ├── google-oauthlib-tool
│   ├── huggingface-cli
│   ├── imagecodecs
│   ├── imageio_download_bin
│   ├── imageio_remove_bin
│   ├── import_pb_to_tensorboard
│   ├── lsm2bin
│   ├── markdown_py
│   ├── mo                              # Model Optimizer
│   ├── nib-conform
│   ├── nib-convert
│   ├── nib-dicomfs
│   ├── nib-diff
│   ├── nib-ls
│   ├── nib-nifti-dx
│   ├── nib-roi
│   ├── nib-stats
│   ├── nib-tck2trk
│   ├── nib-trk2tck
│   ├── nltk
│   ├── normalizer
│   ├── omz_converter                   # Open Model Zoo tool: converts pretrained models to IR files
│   ├── omz_data_downloader     # Open Model Zoo tool: downloads data
│   ├── omz_downloader              # Open Model Zoo tool: downloads pretrained models
│   ├── omz_info_dumper
│   ├── omz_quantizer
│   ├── opt_in_out
│   ├── parrec2nii
│   ├── pip
│   ├── pip3
│   ├── pip3.10
│   ├── pip3.8
│   ├── pot                                 # Post-training Optimization Tool 
│   ├── pydicom
│   ├── pyrsa-decrypt
│   ├── pyrsa-encrypt
│   ├── pyrsa-keygen
│   ├── pyrsa-priv2pub
│   ├── pyrsa-sign
│   ├── pyrsa-verify
│   ├── python -> python3
│   ├── python3 -> /usr/bin/python3
│   ├── saved_model_cli
│   ├── skivi
│   ├── tensorboard
│   ├── tflite_convert
│   ├── tf_upgrade_v2
│   ├── tiff2fsspec
│   ├── tiffcomment
│   ├── tifffile
│   ├── toco
│   ├── toco_from_protos
│   ├── tqdm
│   ├── transformers-cli
│   └── wheel
├── include
├── lib
│   └── python3.8
├── lib64 -> lib
├── pyvenv.cfg
└── share
    ├── doc
    └── python-wheels
        ├── appdirs-1.4.3-py2.py3-none-any.whl
        ├── CacheControl-0.12.6-py2.py3-none-any.whl
        ├── certifi-2019.11.28-py2.py3-none-any.whl
        ├── chardet-3.0.4-py2.py3-none-any.whl
        ├── colorama-0.4.3-py2.py3-none-any.whl
        ├── contextlib2-0.6.0-py2.py3-none-any.whl
        ├── distlib-0.3.0-py2.py3-none-any.whl
        ├── distro-1.4.0-py2.py3-none-any.whl
        ├── html5lib-1.0.1-py2.py3-none-any.whl
        ├── idna-2.8-py2.py3-none-any.whl
        ├── ipaddr-2.2.0-py2.py3-none-any.whl
        ├── lockfile-0.12.2-py2.py3-none-any.whl
        ├── msgpack-0.6.2-py2.py3-none-any.whl
        ├── packaging-20.3-py2.py3-none-any.whl
        ├── pep517-0.8.2-py2.py3-none-any.whl
        ├── pip-20.0.2-py2.py3-none-any.whl
        ├── pkg_resources-0.0.0-py2.py3-none-any.whl
        ├── progress-1.5-py2.py3-none-any.whl
        ├── pyparsing-2.4.6-py2.py3-none-any.whl
        ├── requests-2.22.0-py2.py3-none-any.whl
        ├── retrying-1.3.3-py2.py3-none-any.whl
        ├── setuptools-44.0.0-py2.py3-none-any.whl
        ├── six-1.14.0-py2.py3-none-any.whl
        ├── toml-0.10.0-py2.py3-none-any.whl
        ├── urllib3-1.25.8-py2.py3-none-any.whl
        ├── webencodings-0.5.1-py2.py3-none-any.whl
        └── wheel-0.34.2-py2.py3-none-any.whl

$ pip list
Package                      Version
---------------------------- -----------
...
google-auth                  2.13.0
google-auth-oauthlib         0.4.6
google-pasta                 0.2.0
...
keras                        2.9.0
...
mxnet                        1.7.0.post2
...
numpy                        1.23.1
onnx                         1.11.0
opencv-python                4.6.0.66
openvino                     2022.2.0
openvino-dev                 2022.2.0
openvino-telemetry           2022.1.1
...
scikit-image                 0.19.3
scikit-learn                 0.24.2
scipy                        1.5.4
...
tensorboard                  2.9.1
tensorboard-data-server      0.6.1
tensorboard-plugin-wit       1.8.1
tensorflow                   2.9.1
tensorflow-estimator         2.9.0
tensorflow-io-gcs-filesystem 0.27.0
...
torch                        1.8.1
torchvision                  0.9.1
tqdm                         4.64.1
transformers                 4.23.1
...
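
As a quick check that the pip installation above succeeded, the installed package versions can also be read from Python. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical, and the package names are the ones shown in the `pip list` output above.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Package names taken from the pip list output above.
    for name in ("openvino", "openvino-dev", "openvino-telemetry"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```

Run it inside the activated `openvino_env` environment; outside it, the packages will most likely be reported as not installed.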

2) Installing OpenVINO Development Tools with Docker

Pull the image from Docker Hub.

# Intel CPU
docker run -it --rm openvino/ubuntu18_dev
# Intel GPU
docker run -it --rm --device /dev/dri openvino/ubuntu18_dev
# NCS2 (single VPU)
docker run -it --rm --device-cgroup-rule='c 189:* rmw' -v /dev/bus/usb:/dev/bus/usb openvino/ubuntu18_dev
# HDDL (multiple VPUs)
docker run -it --rm --device=/dev/ion:/dev/ion -v /var/tmp:/var/tmp openvino/ubuntu18_dev
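
The four commands above differ only in their device-specific flags. The sketch below collects those flags in one table so a launcher script can pick them by device name. The helper `docker_run_args` and the dictionary layout are illustrative assumptions; the flags themselves are copied from the commands above.

```python
import shlex

# Device-specific docker flags, copied from the commands above.
DEVICE_FLAGS = {
    "CPU": [],
    "GPU": ["--device", "/dev/dri"],
    "MYRIAD": ["--device-cgroup-rule=c 189:* rmw", "-v", "/dev/bus/usb:/dev/bus/usb"],
    "HDDL": ["--device=/dev/ion:/dev/ion", "-v", "/var/tmp:/var/tmp"],
}


def docker_run_args(device, image="openvino/ubuntu18_dev"):
    """Build the argument list for `docker run` on the given device (hypothetical helper)."""
    if device not in DEVICE_FLAGS:
        raise ValueError(f"unsupported device: {device}")
    return ["docker", "run", "-it", "--rm", *DEVICE_FLAGS[device], image]


if __name__ == "__main__":
    for dev in DEVICE_FLAGS:
        # shlex.join quotes arguments that contain spaces or shell metacharacters.
        print(f"# {dev}")
        print(shlex.join(docker_run_args(dev)))
```

The list form can be passed directly to `subprocess.run`, which avoids shell-quoting problems with the `c 189:* rmw` rule.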

Container layout

# The default working directory is, for example, /opt/intel/openvino_2022.2.0.7713
$ tree -L 2
.
|-- docs
|   |-- OpenVINO-GetStarted-online.html
|   |-- OpenVINO-Install-Linux-online.html
|   |-- OpenVINO-OpenVX-documentation.html
|   |-- OpenVINO-documentation-online.html
|   |-- licensing
|-- extras
|   |-- opencv
|-- install_dependencies
|   |-- 97-myriad-usbboot.rules
|   |-- install_NCS_udev_rules.sh
|   |-- install_NEO_OCL_driver.sh
|   |-- install_openvino_dependencies.sh
|-- licensing
|   |-- DockerImage_readme.txt
|   |-- third-party-programs-docker-dev.txt
|   |-- third-party-programs-docker-runtime.txt
|-- python
|   |-- python3.6
|   |-- python3.7
|   |-- python3.8
|   |-- python3.9
|-- runtime
|   |-- 3rdparty
|   |-- cmake
|   |-- include
|   |-- lib
|   |-- version.txt
|-- samples
|   |-- c
|   |   |-- CMakeLists.txt
|   |   |-- build_samples.sh
|   |   |-- common
|   |   |-- hello_classification
|   |   |-- hello_nv12_input_classification
|   |-- cpp
|   |   |-- CMakeLists.txt
|   |   |-- benchmark_app
|   |   |-- build
|   |   |-- build_samples.sh
|   |   |-- classification_sample_async
|   |   |-- common
|   |   |-- hello_classification
|   |   |-- hello_nv12_input_classification
|   |   |-- hello_query_device
|   |   |-- hello_reshape_ssd
|   |   |-- model_creation_sample
|   |   |-- samples_bin
|   |   |-- speech_sample
|   |   |-- thirdparty
|   |-- python
|       |-- classification_sample_async
|       |-- hello_classification
|       |-- hello_query_device
|       |-- hello_reshape_ssd
|       |-- model_creation_sample
|       |-- requirements.txt
|       |-- setup.cfg
|       |-- speech_sample
|-- setupvars.sh
|-- tools
    |-- cl_compiler
    |-- compile_tool
    |-- deployment_manager
    |-- requirements.txt
    |-- requirements_caffe.txt
    |-- requirements_kaldi.txt
    |-- requirements_mxnet.txt
    |-- requirements_onnx.txt
    |-- requirements_pytorch.txt
    |-- requirements_tensorflow.txt
    |-- requirements_tensorflow2.txt
    
$ ls /usr/local/bin/omz*
/usr/local/bin/omz_converter        /usr/local/bin/omz_downloader   /usr/local/bin/omz_quantizer
/usr/local/bin/omz_data_downloader  /usr/local/bin/omz_info_dumper

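Comparison of OpenVINO™ toolkit components (2021 vs. 2022)

2021 | 2022
Inference Engine Runtime | Evolved into OpenVINO™ Runtime
Samples | Kept but streamlined: samples duplicated in the OMZ demos were removed, leaving only those that illustrate API usage
Dev tools, including MO, POT, DLWB, and the OMZ download and conversion tools [Note 2] | No longer included by default; install separately with pip
Non-dev tools, including deployment manager and compile_tool | Kept
OpenCV | No longer included by default; download and install with the separately provided script
DL Workbench download and installation script | Removed from the installation package; install separately with pip
DL Streamer | Removed from the installation package; install separately with APT
Media SDK | Evolved into oneVPL [Note 3]; removed from the installation package
Demo applications (from OMZ) | Removed from the installation package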

3) Installing DL Workbench with Docker

https://docs.openvino.ai/latest/workbench_docs_Workbench_DG_Run_Locally.html#windows

# Manage Docker as a non-root user
$ sudo groupadd docker
$ sudo usermod -aG docker $USER
$ newgrp docker # activate the changes to groups
$ docker ps

$ docker pull openvino/workbench:latest 
$ docker run -p 0.0.0.0:5665:5665 --name workbench -it --rm openvino/workbench:latest
waiting for server to start..... done
server started
waiting for server to shut down..... done
server stopped
[Workbench] PostgreSQL init process complete.
[Workbench] PostgreSQL applying migrations...
waiting for server to start..... done
server started

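Open a browser and go to http://127.0.0.1:5665.

III. Using OpenVINO Components

1. Using the openvino-dev container

Based on OpenVINO Development Tools.

1) Basic container usage

# Initialize the OpenVINO environment variables
$ source /opt/intel/openvino/setupvars.sh
# Initialize the OpenVINO OpenCV environment variables; without them, video streams cannot be pulled
$ source /opt/intel/openvino/extras/opencv/setupvars.sh 

# Query device information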
$ cd /opt/intel/openvino_2022.2.0.7713/samples/python/hello_query_device
$ python3 hello_query_device.py
[ INFO ] Available devices:
[ INFO ] CPU :
[ INFO ]    SUPPORTED_PROPERTIES:
[ INFO ]        AVAILABLE_DEVICES: 
[ INFO ]        RANGE_FOR_ASYNC_INFER_REQUESTS: 1, 1, 1
[ INFO ]        RANGE_FOR_STREAMS: 1, 8
[ INFO ]        FULL_DEVICE_NAME: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
[ INFO ]        OPTIMIZATION_CAPABILITIES: WINOGRAD, FP32, FP16, INT8, BIN, EXPORT_IMPORT
[ INFO ]        CACHE_DIR: 
[ INFO ]        NUM_STREAMS: 1
[ INFO ]        AFFINITY: Affinity.CORE
[ INFO ]        INFERENCE_NUM_THREADS: 0
[ INFO ]        PERF_COUNT: False
[ INFO ]        INFERENCE_PRECISION_HINT: 
[ INFO ]        PERFORMANCE_HINT: PerformanceMode.UNDEFINED
[ INFO ]        PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ] GPU :
[ INFO ]    SUPPORTED_PROPERTIES:
[ INFO ]        AVAILABLE_DEVICES: 0
[ INFO ]        RANGE_FOR_ASYNC_INFER_REQUESTS: 1, 2, 1
[ INFO ]        RANGE_FOR_STREAMS: 1, 2
[ INFO ]        OPTIMAL_BATCH_SIZE: 1
[ INFO ]        MAX_BATCH_SIZE: 1
[ INFO ]        FULL_DEVICE_NAME: Intel(R) Iris(R) Xe Graphics [0x9a49] (iGPU)
[ INFO ]        DEVICE_UUID: UNSUPPORTED TYPE
[ INFO ]        DEVICE_TYPE: Type.INTEGRATED
[ INFO ]        DEVICE_GOPS: UNSUPPORTED TYPE
[ INFO ]        OPTIMIZATION_CAPABILITIES: FP32, BIN, FP16, INT8
[ INFO ]        GPU_DEVICE_TOTAL_MEM_SIZE: UNSUPPORTED TYPE
[ INFO ]        GPU_UARCH_VERSION: 12.0.0
[ INFO ]        GPU_EXECUTION_UNITS_COUNT: 96
[ INFO ]        GPU_MEMORY_STATISTICS: UNSUPPORTED TYPE
[ INFO ]        PERF_COUNT: False
[ INFO ]        MODEL_PRIORITY: Priority.MEDIUM
[ INFO ]        GPU_HOST_TASK_PRIORITY: Priority.MEDIUM
[ INFO ]        GPU_QUEUE_PRIORITY: Priority.MEDIUM
[ INFO ]        GPU_QUEUE_THROTTLE: Priority.MEDIUM
[ INFO ]        GPU_ENABLE_LOOP_UNROLLING: True
[ INFO ]        CACHE_DIR: 
[ INFO ]        PERFORMANCE_HINT: PerformanceMode.UNDEFINED
[ INFO ]        COMPILATION_NUM_THREADS: 8
[ INFO ]        NUM_STREAMS: 1
[ INFO ]        PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ]        DEVICE_ID: 0

2) Running the OpenVINO samples

OpenVINO Samples

Run the Python samples directly (<image_name> is a placeholder for the image to run, such as the openvino/ubuntu18_dev image pulled earlier):

# CPU
docker run -it --rm <image_name> \
/bin/bash -c "cd ~ && omz_downloader --name googlenet-v1 --precisions FP16 && omz_converter --name googlenet-v1 --precision FP16 && curl -O https://storage.openvinotoolkit.org/data/test_data/images/car_1.bmp && python3 /opt/intel/openvino/samples/python/hello_classification/hello_classification.py public/googlenet-v1/FP16/googlenet-v1.xml car_1.bmp CPU"

# GPU
docker run -itu root:root --rm --device /dev/dri:/dev/dri <image_name> \
/bin/bash -c "omz_downloader --name googlenet-v1 --precisions FP16 && omz_converter --name googlenet-v1 --precision FP16 && curl -O https://storage.openvinotoolkit.org/data/test_data/images/car_1.bmp && python3 samples/python/hello_classification/hello_classification.py public/googlenet-v1/FP16/googlenet-v1.xml car_1.bmp GPU"

# MYRIAD
docker run -itu root:root --rm --device-cgroup-rule='c 189:* rmw' -v /dev/bus/usb:/dev/bus/usb <image_name> \
/bin/bash -c "omz_downloader --name googlenet-v1 --precisions FP16 && omz_converter --name googlenet-v1 --precision FP16 && curl -O https://storage.openvinotoolkit.org/data/test_data/images/car_1.bmp && python3 samples/python/hello_classification/hello_classification.py public/googlenet-v1/FP16/googlenet-v1.xml car_1.bmp MYRIAD"

# HDDL
docker run -itu root:root --rm --device=/dev/ion:/dev/ion -v /var/tmp:/var/tmp -v /dev/shm:/dev/shm <image_name> \
/bin/bash -c "omz_downloader --name googlenet-v1 --precisions FP16 && omz_converter --name googlenet-v1 --precision FP16 && curl -O https://storage.openvinotoolkit.org/data/test_data/images/car_1.bmp && umask 000 && python3 samples/python/hello_classification/hello_classification.py public/googlenet-v1/FP16/googlenet-v1.xml car_1.bmp HDDL"

Build and run the C++ samples:

# Inside the container
$ cd /opt/intel/openvino_2022.2.0.7713/samples/cpp
# Build the samples
$ ./build_samples.sh
$ tree samples_bin/
samples_bin/
|-- benchmark_app
|-- classification_sample_async
|-- hello_classification
|-- hello_nv12_input_classification
|-- hello_query_device
|-- hello_reshape_ssd
|-- model_creation_sample
|-- speech_sample

3) Command-line usage

# List the available pretrained models
$ omz_downloader --print_all
Sphereface
aclnet
aclnet-int8
action-recognition-0001
age-gender-recognition-retail-0013
alexnet
......
yolo-v3-onnx
yolo-v3-tf
yolo-v3-tiny-onnx
yolo-v3-tiny-tf
yolo-v4-tf
yolo-v4-tiny-tf
yolof
yolox-tiny

# Test running a model with OpenVINO
$ cd /opt/intel/openvino_2022.2.0.7713/samples/python/hello_classification/
# 1. Download a pretrained model
$ omz_downloader --name alexnet
$ tree public/alexnet/
|-- alexnet.caffemodel
|-- alexnet.prototxt
|-- alexnet.prototxt.orig

# 2. Convert the model
$ omz_converter  --name alexnet
$ tree public/alexnet/
|-- FP16
|   |-- alexnet.bin
|   |-- alexnet.mapping
|   |-- alexnet.xml
|-- FP32
|   |-- alexnet.bin
|   |-- alexnet.mapping
|   |-- alexnet.xml
|-- alexnet.caffemodel
|-- alexnet.prototxt
|-- alexnet.prototxt.orig

# 3. Run the model (the last argument is the device: CPU, GPU, or AUTO)
$ curl -O https://storage.openvinotoolkit.org/data/test_data/images/banana.jpg
$ python3 hello_classification.py public/alexnet/FP16/alexnet.xml banana.jpg CPU

# 4. Benchmark the model (-d takes CPU or GPU; -api takes sync or async)
$ benchmark_app -m public/alexnet/FP16/alexnet.xml -i banana.jpg -d CPU -niter 128 -api sync
Latency:
    Median:     34.38 ms
    AVG:        34.60 ms
    MIN:        19.57 ms
    MAX:        69.10 ms
Throughput: 115.30 FPS

# Container resource usage
CONTAINER ID   NAME                    CPU %     MEM USAGE / LIMIT     MEM %     NET I/O   BLOCK I/O        PIDS
f8e127db77d8   openvino-ubuntu18_dev   786.64%   1.324GiB / 7.383GiB   17.93%    0B / 0B   483MB / 74.4MB   19

# Check Intel GPU utilization
sudo apt-get install -y intel-gpu-tools
sudo intel_gpu_top
intel-gpu-top -    0/   0 MHz;  100% RC6; ----- (null);        0 irqs/s

      IMC reads:   ------ (null)/s
     IMC writes:   ------ (null)/s

          ENGINE      BUSY                                                                          MI_SEMA MI_WAIT
     Render/3D/0    99.65% |                                                                       |      0%      0%
       Blitter/0    0.00% |                                                                       |      0%      0%
         Video/0    0.00% |                                                                       |      0%      0%
         Video/1    0.00% |                                                                       |      0%      0%
  VideoEnhance/0    0.00% |                                                                       |      0%      0%
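
When comparing devices or precisions, it helps to collect the latency and throughput figures from the benchmark_app report above programmatically. The following is a minimal sketch that assumes the report keeps the `Median/AVG/MIN/MAX ... ms` and `Throughput: ... FPS` layout shown above. The function name `parse_benchmark_report` is hypothetical, and other benchmark_app versions may format the report differently.

```python
import re


def parse_benchmark_report(text):
    """Extract latency (ms) and throughput (FPS) figures from benchmark_app output."""
    metrics = {}
    # Latency lines look like "    Median:     34.38 ms".
    for label, value in re.findall(r"\b(Median|AVG|MIN|MAX):\s*([\d.]+)\s*ms", text):
        metrics[f"latency_{label.lower()}_ms"] = float(value)
    # Throughput line looks like "Throughput: 115.30 FPS".
    match = re.search(r"Throughput:\s*([\d.]+)\s*FPS", text)
    if match:
        metrics["throughput_fps"] = float(match.group(1))
    return metrics


if __name__ == "__main__":
    # Report text copied from the benchmark run above.
    report = """Latency:
    Median:     34.38 ms
    AVG:        34.60 ms
    MIN:        19.57 ms
    MAX:        69.10 ms
Throughput: 115.30 FPS
"""
    for key, value in parse_benchmark_report(report).items():
        print(f"{key}: {value}")
```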

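2. Using the openvino_notebooks samples

https://github.com/openvinotoolkit/openvino_notebooks/blob/main/README_cn.md

1) Installing Jupyter in the container

Reference: using Jupyter remotely inside a Docker container on a remote server (Ubuntu 20.04)

Install Jupyter in the openvino-dev container environment described above and start jupyter-notebook.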

apt-get update 
apt-get install vim

pip install jupyter
# Generate the Jupyter Notebook configuration file
jupyter-notebook --generate-config

# Edit the configuration file
vim ~/.jupyter/jupyter_notebook_config.py
    # Allow access through any IP address bound to the server
    c.NotebookApp.ip = '*'
    # Port used for access
    c.NotebookApp.port = 8888  # must match the container port published earlier
    # Do not open a browser automatically
    c.NotebookApp.open_browser = False
    # Allow remote access
    c.NotebookApp.allow_remote_access = True
    
# Start Jupyter
$ jupyter notebook --ip 0.0.0.0 --allow-root --port 8888 --no-browser
        ......
        http://127.0.0.1:8888/?token=xxx

Open the notebook in a browser and log in with the token, for example at http://192.168.1.10:8888/

2) Downloading and using the openvino_notebooks samples

apt-get install git
cd ~; git clone https://github.com/openvinotoolkit/openvino_notebooks

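Open a sample .ipynb file in the browser-based notebook, for example openvino_notebooks/notebooks/001-hello-world/001-hello-world.ipynb.

IV. Model Processing

OpenVINO™ supports multiple model formats and can convert them to its own format, OpenVINO IR.

1. OpenVINO model processing tools

https://docs.openvino.ai/latest/omz_tools_downloader.html

  • mo: the Model Optimizer converts pretrained deep learning models from TensorFlow, PyTorch, PaddlePaddle, MXNet, Caffe, Kaldi, or ONNX into the OpenVINO Intermediate Representation (IR).
    • .xml: describes the whole model topology, including each layer, its connections, and parameter values.
    • .bin: contains the trained weights and biases of each layer.
    • Functions:
      • Convert
      • Optimize
      • Convert weights and biases
  • pot: the Post-training Optimization Tool quantizes weights and activations from floating-point precision to integer precision (for example, 8-bit) so that inference runs at the lower precision.
    • Different hardware platforms support different integer precisions and quantization parameters; POT abstracts this complexity through the concept of a "target device".
    • Quantization requires an unannotated dataset.
  • Open Model Zoo tools: one-step processing of Open Model Zoo models.
    • omz_downloader: downloads model files from online sources.
    • omz_converter: converts models in other formats to the IR format.
    • omz_quantizer: quantizes full-precision IR models into low-precision versions.
    • omz_info_dumper: prints information about models in a stable, machine-readable format.
    • omz_data_downloader: copies dataset data from the installation location.
  • benchmark_app: the Benchmark C++ Tool estimates deep learning inference performance on supported devices.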
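
To make the role of the IR .xml file more concrete, the sketch below parses a tiny hand-written topology with the standard library and lists its layers and connections. The embedded XML is a simplified illustration and not the output of `mo`: real IR files carry more detail (ports, shapes, per-layer data), and the element and attribute names used here (`net`, `layer`, `edge`, `from-layer`, `to-layer`) are an assumption about the IR schema.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written stand-in for an IR .xml file (illustrative only).
SAMPLE_IR = """<net name="tiny" version="11">
  <layers>
    <layer id="0" name="input" type="Parameter" version="opset1"/>
    <layer id="1" name="relu" type="Relu" version="opset1"/>
    <layer id="2" name="output" type="Result" version="opset1"/>
  </layers>
  <edges>
    <edge from-layer="0" from-port="0" to-layer="1" to-port="0"/>
    <edge from-layer="1" from-port="1" to-layer="2" to-port="0"/>
  </edges>
</net>
"""


def summarize_ir(xml_text):
    """Return (model name, {layer id: (name, type)}, [(source name, target name)])."""
    root = ET.fromstring(xml_text)
    layers = {
        layer.get("id"): (layer.get("name"), layer.get("type"))
        for layer in root.iter("layer")
    }
    edges = [
        (layers[edge.get("from-layer")][0], layers[edge.get("to-layer")][0])
        for edge in root.iter("edge")
    ]
    return root.get("name"), layers, edges


if __name__ == "__main__":
    name, layers, edges = summarize_ir(SAMPLE_IR)
    print(f"model: {name}")
    for layer_id, (layer_name, layer_type) in layers.items():
        print(f"  layer {layer_id}: {layer_name} ({layer_type})")
    for source, target in edges:
        print(f"  edge: {source} -> {target}")
```

The weights themselves are not in this file; they live in the accompanying .bin file.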

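2. Converting model formats

Supported Model Formats

  • OpenVINO IR (Intermediate Representation): the proprietary format of OpenVINO™.

  • ONNX, PaddlePaddle: directly supported formats. OpenVINO provides C++ and Python APIs to import them straight into OpenVINO Runtime without prior conversion.

  • TensorFlow, PyTorch, MXNet, Caffe, Kaldi: indirectly supported formats that must first be converted to one of the formats above. Use the Model Optimizer to convert them to OpenVINO IR. In some cases other converters are needed as intermediaries.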
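
The three categories above can be expressed as a small lookup that tells a script what to do with a given model file. This is a sketch under stated assumptions: the helper `conversion_plan` is hypothetical, and the mapping from file extensions to frameworks follows common conventions (`.pt`/`.pth` for PyTorch and `.params` for MXNet in particular are assumptions) rather than anything OpenVINO enforces.

```python
from pathlib import Path

# Formats OpenVINO Runtime can read without prior conversion.
DIRECT_FORMATS = {".xml": "OpenVINO IR", ".onnx": "ONNX", ".pdmodel": "PaddlePaddle"}
# Formats that first go through the Model Optimizer (extension mapping is an assumption).
MO_FORMATS = {".pb": "TensorFlow", ".meta": "TensorFlow", ".caffemodel": "Caffe", ".params": "MXNet"}
# PyTorch checkpoints need an ONNX export before anything else.
PYTORCH_SUFFIXES = {".pt", ".pth"}


def conversion_plan(model_path):
    """Return (framework, next step) for a model file, based on its extension."""
    suffix = Path(model_path).suffix.lower()
    if suffix in DIRECT_FORMATS:
        return DIRECT_FORMATS[suffix], "read directly with OpenVINO Runtime"
    if suffix in MO_FORMATS:
        return MO_FORMATS[suffix], "convert to OpenVINO IR with mo"
    if suffix in PYTORCH_SUFFIXES:
        return "PyTorch", "export to ONNX, then read directly or convert with mo"
    return "unknown", "check the Supported Model Formats documentation"


if __name__ == "__main__":
    for path in ("alexnet.xml", "model.onnx", "alexnet.caffemodel", "net.pth"):
        framework, step = conversion_plan(path)
        print(f"{path}: {framework} -> {step}")
```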

1) mo parameters

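# Optional arguments:
 --framework {onnx,mxnet,tf,kaldi,caffe,paddle}
# Framework-agnostic parameters:
  --input_model INPUT_MODEL, -w INPUT_MODEL, -m INPUT_MODEL
  --model_name MODEL_NAME, -n MODEL_NAME
                        Name of the output IR files
  --output_dir OUTPUT_DIR, -o OUTPUT_DIR
  --input_shape INPUT_SHAPE
                        Shape of the model input node(s); the shape can also be set through the --input parameter
  --scale SCALE, -s SCALE
                        All inputs of the original network are divided by this value.
  --reverse_input_channels
                        Switch the input channel order from RGB to BGR (or vice versa)
  --log_level {CRITICAL,ERROR,WARN,WARNING,INFO,DEBUG,NOTSET}
                        Logger level
  --input INPUT        
                        Quoted, comma-separated list of input nodes, optionally with names, shapes, data types, and so on.
  --output OUTPUT       
                        Output node(s) of the model
  --mean_values MEAN_VALUES, -ms MEAN_VALUES
                        Mean value for each channel of the input image
  --scale_values SCALE_VALUES
                        Scale value for each channel of the input image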
  --source_layout SOURCE_LAYOUT
                        Layout of the input or output of the model in the framework. Layout can be specified in
                        the short form, e.g. nhwc, or in complex form, e.g. "[n,h,w,c]". Example for many names:
                        "in_name1([n,h,w,c]),in_name2(nc),out_name1(n),out_name2(nc)". Layout can be partially
                        defined, "?" can be used to specify undefined layout for one dimension, "..." can be used
                        to specify undefined layout for multiple dimensions, for example "?c??", "nc...", "n...c",
                        etc.
  --target_layout TARGET_LAYOUT
                        Same as --source_layout, but specifies target layout that will be in the model after
                        processing by ModelOptimizer.
  --layout LAYOUT       Combination of --source_layout and --target_layout. Can't be used with either of them. If
                        model has one input it is sufficient to specify layout of this input, for example --layout
                        nhwc. To specify layouts of many tensors, names must be provided, for example: --layout
                        "name1(nchw),name2(nc)". It is possible to instruct ModelOptimizer to change layout, for
                        example: --layout "name1(nhwc->nchw),name2(cn->nc)". Also "*" in long layout form can be
                        used to fuse dimensions, for example "[n,c,...]->[n*c,...]".
  --data_type {FP16,FP32,half,float}
                        Data type of the IR; this parameter determines the precision of the converted model.
  --transform TRANSFORM
                        Apply additional transformations. Usage: "--transform
                        transformation_name1[args],transformation_name2..." where [args] is key=value pairs
                        separated by semicolon. Examples: "--transform LowLatency2" or "--transform
                        LowLatency2[use_const_initializer=False]" or "--transform "MakeStateful[param_res_names={'
                        input_name_1':'output_name_1','input_name_2':'output_name_2'}]"" Available
                        transformations: "LowLatency2", "MakeStateful"
  --disable_fusing      [DEPRECATED] Turn off fusing of linear operations to Convolution.
  --disable_resnet_optimization
                        [DEPRECATED] Turn off ResNet optimization.
  --finegrain_fusing FINEGRAIN_FUSING
                        [DEPRECATED] Regex for layers/operations that won't be fused. Example: --finegrain_fusing
                        Convolution1,.*Scale.*
  --enable_concat_optimization
                        [DEPRECATED] Turn on Concat optimization.
  --extensions EXTENSIONS
                        Paths or a comma-separated list of paths to libraries (.so or .dll) with extensions. For
                        the legacy MO path (if `--use_legacy_frontend` is used), a directory or a comma-separated
                        list of directories with extensions are supported. To disable all extensions including
                        those that are placed at the default location, pass an empty string.
  --batch BATCH, -b BATCH
                        Input batch size
  --version             Version of Model Optimizer
  --silent              Prevent any output messages except those that correspond to log level equals ERROR, that
                        can be set with the following option: --log_level. By default, log level is already ERROR.
  --freeze_placeholder_with_value FREEZE_PLACEHOLDER_WITH_VALUE
                        Replaces input layer with constant node with provided value, for example:
                        "node_name->True". It will be DEPRECATED in future releases. Use --input option to specify
                        a value for freezing.
  --static_shape        Enables IR generation for fixed input shape (folding `ShapeOf` operations and shape-
                        calculating sub-graphs to `Constant`). Changing model input shape using the OpenVINO
                        Runtime API in runtime may fail for such an IR.
  --disable_weights_compression
                        [DEPRECATED] Disable compression and store weights with original precision.
  --progress            Enable model conversion progress display.
  --stream_output       Switch model conversion progress display to a multiline mode.
  --transformations_config TRANSFORMATIONS_CONFIG
                        Use the configuration file with transformations description. File can be specified as
                        relative path from the current directory, as absolute path or as a relative path from the
                        mo root directory
  --use_new_frontend    Force the usage of new Frontend of Model Optimizer for model conversion into IR. The new
                        Frontend is C++ based and is available for ONNX* and PaddlePaddle* models. Model optimizer
                        uses new Frontend for ONNX* and PaddlePaddle* by default that means `--use_new_frontend`
                        and `--use_legacy_frontend` options are not specified.
  --use_legacy_frontend
                        Force the usage of legacy Frontend of Model Optimizer for model conversion into IR. The
                        legacy Frontend is Python based and is available for TensorFlow*, ONNX*, MXNet*, Caffe*,
                        and Kaldi* models.
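
The effect of `--mean_values` and `--scale_values` is easiest to see on a single pixel. The sketch below assumes the usual Model Optimizer convention that the per-channel mean is subtracted first and the result is then divided by the per-channel scale. It only illustrates the arithmetic that ends up embedded in the IR, not Model Optimizer's implementation, and the function name `normalize_pixel` is hypothetical.

```python
def normalize_pixel(pixel, mean_values, scale_values):
    """Apply (value - mean) / scale to each channel of one pixel."""
    if not (len(pixel) == len(mean_values) == len(scale_values)):
        raise ValueError("pixel, mean_values and scale_values need one entry per channel")
    return [(v - m) / s for v, m, s in zip(pixel, mean_values, scale_values)]


if __name__ == "__main__":
    # Roughly what "--mean_values [127.5,127.5,127.5] --scale_values [127.5,127.5,127.5]"
    # would do to a pixel with channel values 255, 128 and 0: map them into about [-1, 1].
    pixel = [255.0, 128.0, 0.0]
    print(normalize_pixel(pixel, [127.5] * 3, [127.5] * 3))
```

Embedding this normalization in the IR means the application can feed raw 0 to 255 image data at inference time.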

2) Converting ONNX models

mo --input_model <INPUT_MODEL>.onnx

3) Converting PaddlePaddle models

mo --input_model <INPUT_MODEL>.pdmodel
# Example
mo --input_model=yolov3.pdmodel --input=image,im_shape,scale_factor --input_shape=[1,3,608,608],[1,2],[1,2] --reverse_input_channels --output=save_infer_model/scale_0.tmp_1,save_infer_model/scale_1.tmp_1

4) Converting PyTorch models

Export the PyTorch model to ONNX first, then convert it to OpenVINO IR.

import torch

# Instantiate your model. This is just a regular PyTorch model that will be exported in the following steps.
# SomeModel is a placeholder for your own torch.nn.Module subclass.
model = SomeModel()
# Evaluate the model to switch some operations from training mode to inference.
model.eval()
# Create dummy input for the model. It will be used to run the model inside export function.
dummy_input = torch.randn(1, 3, 224, 224)
# Call the export function
torch.onnx.export(model, (dummy_input, ), 'model.onnx')
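  • As of PyTorch 1.8.1, not all PyTorch operations can be exported to ONNX opset 9, which is used by default. If exporting to the default opset 9 fails, export the model to opset 11 or higher.

5) Converting Caffe models

mo --input_model <INPUT_MODEL>.caffemodel

# Caffe-specific parameters:
    --input_proto INPUT_PROTO, -d INPUT_PROTO
            Deploy-ready prototxt file that contains the topology
            structure and layer attributes
    --caffe_parser_path CAFFE_PARSER_PATH
            Path to the Python Caffe parser generated from caffe.proto
    -k K    Path to the custom layers mapping file CustomLayersMapping.xml
    --disable_omitting_optional
            Disable omitting optional attributes (for custom layers). Use this option to transfer all attributes of a custom layer to the IR. By default, attributes with default values and attributes defined by the user are transferred to the IR.
    --enable_flattening_nested_params
            Enable flattening of optional parameters (for custom layers). Use this option to transfer the attributes of a custom layer to the IR with flattened nested parameters. By default, attributes are transferred without flattening nested parameters.

# Examples
mo --input_model bvlc_alexnet.caffemodel --input_proto bvlc_alexnet.prototxt
    # If the .caffemodel and .prototxt files are in the same directory, specifying --input_model alone is enough.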
    
mo --input_model bvlc_alexnet.caffemodel -k CustomLayersMapping.xml --disable_omitting_optional --enable_flattening_nested_params

6) Converting TensorFlow models

  • TensorFlow*-specific parameters

      --input_model_is_text
                            TensorFlow*: treat the input model file as a text protobuf format. If not specified, the
                            Model Optimizer treats it as a binary file by default.
      --input_checkpoint INPUT_CHECKPOINT
                            TensorFlow*: variables file to load.
      --input_meta_graph INPUT_META_GRAPH
                            Tensorflow*: a file with a meta-graph of the model before freezing
      --saved_model_dir SAVED_MODEL_DIR
                            TensorFlow*: directory with a model in SavedModel format of TensorFlow 1.x or 2.x version.
      --saved_model_tags SAVED_MODEL_TAGS
                            Group of tag(s) of the MetaGraphDef to load, in string format, separated by ','. For tag-
                            set contains multiple tags, all tags must be passed in.
      --tensorflow_custom_operations_config_update TENSORFLOW_CUSTOM_OPERATIONS_CONFIG_UPDATE
                            TensorFlow*: update the configuration file with node name patterns with input/output nodes
                            information.
      --tensorflow_use_custom_operations_config TENSORFLOW_USE_CUSTOM_OPERATIONS_CONFIG
                            Use the configuration file with custom operation description.
      --tensorflow_object_detection_api_pipeline_config TENSORFLOW_OBJECT_DETECTION_API_PIPELINE_CONFIG
                            TensorFlow*: path to the pipeline configuration file used to generate model created with
                            help of Object Detection API.
      --tensorboard_logdir TENSORBOARD_LOGDIR
                            TensorFlow*: dump the input graph to a given directory that should be used with
                            TensorBoard.
      --tensorflow_custom_layer_libraries TENSORFLOW_CUSTOM_LAYER_LIBRARIES
                            TensorFlow*: comma separated list of shared libraries with TensorFlow* custom operations
                            implementation.
      --disable_nhwc_to_nchw
                            [DEPRECATED] Disables the default translation from NHWC to NCHW. Since 2022.1 this option
                            is deprecated and used only to maintain backward compatibility with previous releases.
    
  • TensorFlow 1 models

    # Converting Frozen Model Format
    mo --input_model <INPUT_MODEL>.pb
    
    # Converting Non-Frozen Model Formats
    # 1. Checkpoint format: consists of inference_graph.pb and checkpoint_file.ckpt
    mo --input_model <INFERENCE_GRAPH>.pb --input_checkpoint <INPUT_CHECKPOINT>
    # 2. MetaGraph format: consists of model_name.meta, model_name.index, model_name.data-00000-of-00001, and optionally checkpoint_file.ckpt
    mo --input_meta_graph <INPUT_META_GRAPH>.meta
    # 3. SavedModel format: a directory containing a .pb file plus the variables, assets, and assets.extra subdirectories
    mo --saved_model_dir <SAVED_MODEL_DIRECTORY>
    
    • Exporting to the frozen model format

      import tensorflow as tf
      from tensorflow.python.framework import graph_io
      # sess is an existing tf.compat.v1.Session that holds the trained graph
      frozen = tf.compat.v1.graph_util.convert_variables_to_constants(sess, sess.graph_def, ["name_of_the_output_node"])
      graph_io.write_graph(frozen, './', 'inference_graph.pb', as_text=False)
      
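  • TensorFlow 2 models

    • SavedModel format: a directory containing a .pb file plus the variables and assets subdirectories

      mo --saved_model_dir <SAVED_MODEL_DIRECTORY>
      
    • Keras H5 format: serialize the model to the SavedModel format first.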

      import tensorflow as tf
      model = tf.keras.models.load_model('model.h5')
      tf.saved_model.save(model,'model')
      

7) Converting MXNet models

  • MXNet-specific parameters

    Mxnet-specific parameters:
      --input_symbol INPUT_SYMBOL
                            Symbol file (for example, model-symbol.json) that contains a topology structure and layer
                            attributes
      --nd_prefix_name ND_PREFIX_NAME
                            Prefix name for args.nd and argx.nd files.
      --pretrained_model_name PRETRAINED_MODEL_NAME
                            Name of a pretrained MXNet model without extension and epoch number. This model will be
                            merged with args.nd and argx.nd files
      --save_params_from_nd
                            Enable saving built parameters file from .nd files
      --legacy_mxnet_model  Enable MXNet loader to make a model compatible with the latest MXNet version. Use only if
                            your model was trained with MXNet version lower than 1.0.0
      --enable_ssd_gluoncv  Enable pattern matchers replacers for converting gluoncv ssd topologies.
    

3. Post-training optimization

https://docs.openvino.ai/latest/pot_compression_cli_README.html

# Basic usage for DefaultQuantization
pot -q default -m  -w  --ac-config 

# Basic usage for AccuracyAwareQuantization
pot -q accuracy_aware -m  -w  --ac-config  --max-drop 0.01
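
The `--max-drop 0.01` option bounds how much accuracy the quantized model may lose compared with the original. The check below is only a sketch of that criterion. It assumes the drop is an absolute difference on a metric where higher is better, and `within_max_drop` is an illustrative name, not a POT function.

```python
def within_max_drop(baseline: float, quantized: float, max_drop: float = 0.01) -> bool:
    """Return True if the accuracy lost by quantization stays within max_drop."""
    drop = baseline - quantized  # positive when the quantized model is worse
    return drop <= max_drop
```

Under this assumption, a model that goes from 0.80 to 0.795 passes, while one that falls to 0.78 does not.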

V. OpenVINO inference

Integrate OpenVINO™ with Your Application

1. Developing with the openvino.runtime API

1) Synchronous inference workflow

  1. Create a Core object.

    from openvino.runtime import Core, Type, Layout
    core = Core()
    
    # List the available devices (optional)
    devices = core.available_devices
    for device in devices:
        device_name = core.get_property(device, "FULL_DEVICE_NAME")
        print(f"{device}: {device_name}")
    

    CPU: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
    GNA: GNA_SW
    GPU: Intel(R) Iris(R) Xe Graphics [0x9a49] (iGPU)

  2. Load and compile the model.

    # Read the model; model_path points to an .xml (IR) file or an .onnx file
    model = core.read_model(model_path)
    
    # Inspect the model's inputs and outputs (optional)
    input_layer = model.input(0)
    output_layer = model.output(0)
    print(f"input precision: {input_layer.element_type}")
    print(f"input shape: {input_layer.shape}")
    print(f"output precision: {output_layer.element_type}")
    print(f"output shape: {output_layer.shape}")
    
    # Integrate preprocessing steps into the model (optional)
    # See: Developing with the openvino.preprocess API
        
    # Compile the model for the target device: device_name is e.g. 'CPU' or 'GPU'; config is optional
    compiled_model = core.compile_model(model, device_name, config)
    

    input precision:
    input shape: {1, 3, 224, 224}

    output precision:
    output shape: {1, 1001}

  3. Run synchronous inference and fetch the result.

    # Method 1: preprocess in application code; input_data is the model input
    import cv2
    import numpy as np
    image = cv2.imread(image_path)
    N, C, H, W = input_layer.shape
    resized_image = cv2.resize(src=image, dsize=(W, H))
    # HWC -> CHW, add the N dimension, convert to float32
    input_data = np.expand_dims(np.transpose(resized_image, (2, 0, 1)), 0).astype(np.float32)
    result = compiled_model([input_data])[output_layer]  # blocking inference
    
    
    # Method 2: preprocessing integrated into the model
    image = cv2.imread(image_path)
    # Add N dimension
    input_tensor = np.expand_dims(image, 0)
    results = compiled_model.infer_new_request({0: input_tensor})    # blocking inference
    

2) Asynchronous inference workflow

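  1. Load the model exactly as in the synchronous workflow.

  2. Run asynchronous inference and fetch the results.

    import cv2
    import numpy as np
    from openvino.runtime import AsyncInferQueue, Core, InferRequest, Layout, Type
    
    def completion_callback(infer_request: InferRequest, image_path: str) -> None:
        # Called when a request completes; image_path is the userdata given to start_async()
        predictions = next(iter(infer_request.results.values()))
        print(f"{image_path}: output shape {predictions.shape}")
    
    # Read input images; args.input is the list of image paths from the command line
    images = [cv2.imread(image_path) for image_path in args.input]
    # Add N dimension
    input_tensors = [np.expand_dims(image, 0) for image in images]
    
    # create async queue with optimal number of infer requests
    infer_queue = AsyncInferQueue(compiled_model)
    infer_queue.set_callback(completion_callback)
    
    for i, input_tensor in enumerate(input_tensors):
        # Start asynchronous inference
        infer_queue.start_async({0: input_tensor}, args.input[i])   # non-blocking
    
    # Wait for all inference requests to finish
    infer_queue.wait_all()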
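
The queue pattern used above (non-blocking submission, a completion callback that receives user data, and a final wait) can be illustrated without OpenVINO. The following sketch uses a standard-library thread pool as a stand-in for the inference queue. `run_queue` and its `infer` argument are hypothetical names, and the "model" is any plain Python callable.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_queue(inputs: Dict[str, List[float]],
              infer: Callable[[List[float]], float],
              jobs: int = 2) -> Dict[str, float]:
    """Run `infer` on every input concurrently and collect results via a callback."""
    results: Dict[str, float] = {}
    lock = threading.Lock()

    def completion_callback(output: float, userdata: str) -> None:
        # Runs on a worker thread, so guard the shared dict with a lock
        with lock:
            results[userdata] = output

    def job(tensor: List[float], userdata: str) -> None:
        completion_callback(infer(tensor), userdata)

    with ThreadPoolExecutor(max_workers=jobs) as pool:
        for name, tensor in inputs.items():
            pool.submit(job, tensor, name)  # returns immediately (non-blocking)
    # Leaving the `with` block waits for every job, similar in spirit to wait_all()
    return results
```

For example, `run_queue({"a.jpg": [1.0, 2.0], "b.jpg": [3.0]}, sum)` collects one result per image name regardless of completion order.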
    

    ...
    # `net` is the compiled model; Tensor is imported from openvino.runtime
    # Create one infer request for the current frame
    infer_request_curr = net.create_infer_request()
    # Create a second infer request for the next frame
    infer_request_next = net.create_infer_request()
    
    # Get the current frame
    frame_curr = cv2.imread("./data/images/bus.jpg")
    # Preprocess the current frame
    letterbox_img_curr, _, _ = letterbox(frame_curr, auto=False)
    # Normalization + Swap RB + Layout from HWC to NCHW
    blob = Tensor(cv2.dnn.blobFromImage(letterbox_img_curr, 1/255.0, swapRB=True))  
    
    # Bind the data to the model's input node
    infer_request_curr.set_tensor(input_node, blob)
    # Call start_async() to launch inference on the current frame without blocking
    infer_request_curr.start_async()
    while True:    
        # Prepare the blob for the next frame here (capture + preprocess)
        # Bind the data to the next-frame infer request
        infer_request_next.set_tensor(input_node, blob)
        # Call start_async() to launch inference on the next frame without blocking
        infer_request_next.start_async()
        
        # Wait for the current-frame request to finish
        infer_request_curr.wait()
        # Read the current frame's result from output_node
        infer_result = infer_request_curr.get_tensor(output_node)
        # Postprocess the inference result
        data = torch.tensor(infer_result.data)
        
        # Swap the current-frame and next-frame requests
        infer_request_curr, infer_request_next = infer_request_next, infer_request_curr
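
The two-request loop above overlaps inference on frame k+1 with result handling for frame k. The same control flow, stripped of OpenVINO, is sketched below with standard-library futures. `pipelined_inference` is a hypothetical name, and `infer` stands in for the model.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def pipelined_inference(frames: Iterable[int], infer: Callable[[int], int]) -> List[int]:
    """Keep two requests in flight: start the next frame before waiting on the current one."""
    results: List[int] = []
    frame_iter = iter(frames)
    first = next(frame_iter, None)
    if first is None:
        return results
    with ThreadPoolExecutor(max_workers=2) as pool:
        current = pool.submit(infer, first)        # start the current frame
        for frame in frame_iter:
            upcoming = pool.submit(infer, frame)   # start the next frame first ...
            results.append(current.result())       # ... then wait for the current one
            current = upcoming                     # swap roles
        results.append(current.result())           # drain the last in-flight frame
    return results
```

Results come back in frame order because each iteration waits on the older request before swapping.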
    

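2. Developing with the openvino.preprocess API

Using the OpenVINO preprocessing API

Starting with OpenVINO™ 2022.1, the preprocessing API can integrate all preprocessing steps into the execution graph, so that a dGPU, VPU, or iGPU can perform the data preprocessing itself without relying on the CPU.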

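1) Typical data preprocessing operations

  1. Change the shape of the input data: [720, 1280, 3] → [1, 3, 640, 640]
  2. Change the precision of the input data: u8 → f32
  3. Change the color channel order of the input data: BGR → RGB
  4. Change the layout of the input data: HWC → NCHW
  5. Normalize the data: subtract the mean, then divide by the standard deviation (std)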
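
To make these operations concrete, here is a dependency-free sketch that applies four of them (precision, channel order, layout, normalization) to a tiny image stored as nested lists. Resizing is left out for brevity, `preprocess` is an illustrative name, and real code would normally use NumPy or OpenCV arrays instead of lists.

```python
from typing import List, Sequence

Image = List[List[List[int]]]  # H x W x C, u8 values, channels in BGR order


def preprocess(image: Image, mean: Sequence[float], std: Sequence[float]) -> List[List[List[List[float]]]]:
    """u8 HWC BGR image -> f32 NCHW RGB image, normalized per channel."""
    height, width = len(image), len(image[0])
    planes = []
    for c in range(3):        # c indexes the RGB output channels
        src = 2 - c           # BGR -> RGB: read the source channels in reverse
        planes.append([
            [(float(image[y][x][src]) - mean[c]) / std[c] for x in range(width)]
            for y in range(height)
        ])
    return [planes]           # add the N dimension: result is indexed [N][C][H][W]
```

With `mean=[0, 0, 0]` and `std=[255, 255, 255]`, a pure-red BGR pixel `[0, 0, 255]` becomes 1.0 in the first (R) plane and 0.0 in the other two.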

2) Main workflow of the OpenVINO preprocessing API

  1. Instantiate a PrePostProcessor object

    from openvino.runtime import Core, Type, Layout
    from openvino.preprocess import PrePostProcessor, ColorFormat, ResizeAlgorithm
    
    core = Core()
    model = core.read_model(model_path)
    ppp = PrePostProcessor(model)
    
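  2. Declare the input tensor information

    import cv2
    import numpy as np

    image = cv2.imread(image_path)
    # Add N dimension
    input_tensor = np.expand_dims(image, 0)  # e.g. input_tensor.shape == (1, 640, 640, 3)
    
    # Parentheses (instead of backslashes) allow a comment on each chained call
    (ppp.input().tensor()
        .set_shape([1, 640, 640, 3])          # image dimensions, written in 'NHWC' order
        .set_color_format(ColorFormat.BGR)
        .set_element_type(Type.u8)
        .set_layout(Layout('NHWC')))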
    
  3. Specify the model's data layout

    # The model's input layout is NCHW
    ppp.input().model().set_layout(Layout('NCHW'))
    
    # The model's output layout is NHWC (optional)
    ppp.output().model().set_layout(Layout('NHWC'))
    
  4. Declare the output tensor information

    (ppp.output().tensor()
        .set_element_type(Type.f32)       # the output tensor's precision is f32
        .set_layout(Layout('NHWC')))      # optional
    
  5. Define the preprocessing steps

    # Alternatively, define custom preprocessing steps
    (ppp.input().preprocess()
        .convert_element_type(Type.f32)
        .convert_color(ColorFormat.RGB)                     # convert the input image from BGR to RGB
        .resize(ResizeAlgorithm.RESIZE_LINEAR, 224, 224)    # e.g. for a model whose input shape is [1,3,224,224]
        .mean([0.0, 0.0, 0.0])
        .scale([255.0, 255.0, 255.0])
        .convert_layout([0, 3, 1, 2]))                      # convert 'NHWC' to 'NCHW'
    
    • Preprocessing steps supported by OpenVINO
      • convert_color, convert_element_type, convert_layout, crop, mean, resize, reverse_channels, scale, custom
  6. Define the postprocessing steps (optional)

    (ppp.output().postprocess()
        .convert_element_type(Type.f32)
        .convert_layout([0, 3, 1, 2]))       # convert 'NHWC' to 'NCHW'
    
    • Postprocessing steps supported by OpenVINO
      • convert_element_type, convert_layout, custom
  7. Integrate the preprocessing steps into the model

    model = ppp.build()
    

  8. Export the model with the integrated preprocessing steps (optional)

    from openvino.offline_transformations import serialize
    serialize(model, 'xxx.xml', 'xxx.bin')
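
The index list `[0, 3, 1, 2]` passed to convert_layout in the steps above is a transpose order: entry i names the source dimension that ends up at position i. The sketch below derives it from the two layout strings and applies it to a shape. Both helper names are illustrative and not part of OpenVINO.

```python
from typing import List


def layout_permutation(src: str, dst: str) -> List[int]:
    """For each dimension of dst, return its index in src (a transpose order)."""
    return [src.index(dim) for dim in dst]


def permute_shape(shape: List[int], order: List[int]) -> List[int]:
    """Reorder a shape according to a transpose order."""
    return [shape[i] for i in order]


order = layout_permutation("NHWC", "NCHW")
print(order)                                    # [0, 3, 1, 2]
print(permute_shape([1, 640, 640, 3], order))   # [1, 3, 640, 640]
```

The reverse conversion, 'NCHW' to 'NHWC', yields `[0, 2, 3, 1]` by the same rule.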
    

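3. Auto-Device and Automatic Batching plugins

Best practices for the AUTO plugin and automatic batching in OpenVINO™ 2022.1

1) Auto-Device

The AUTO device (short for automatic device selection) is a virtual plugin built on top of the CPU/GPU plugins. It is not bound to a specific device type: it can resolve to any supported CPU, GPU, VPU (vision processing unit), or GNA (Gaussian & Neural Accelerator coprocessor), or to a combination of these devices.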


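Advantages:

  • It uses the selected devices in their best configuration, based on the characteristics of the deep learning model and of the devices.
  • It lowers first-inference latency on GPU: the GPU plugin has to compile the model online at runtime before inference can start. When a discrete or integrated GPU is selected, the AUTO plugin first runs inference on the CPU to hide the GPU model compilation time.
  • It is easy to use: developers only need to set the device_name argument of compile_model() to "AUTO".

Device switching logic:

  • The AUTO plugin picks the best compute device according to the device priority dGPU > iGPU > VPU > CPU. When it picks a GPU as the best device, inference starts on the CPU and switches to the GPU once the model has been compiled, which hides the first-inference latency.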

Precisions supported by each device

Supported device                                                                    Supported model precision
dGPU (e.g. Intel® Iris® Xe MAX)                                                     FP32, FP16, INT8, BIN
iGPU (e.g. Intel® UHD Graphics 620 (iGPU))                                          FP32, FP16, BIN
Intel® Movidius™ Myriad™ X VPU (e.g. Intel® Neural Compute Stick 2 (Intel® NCS2))   FP16
Intel® CPU (e.g. Intel® Core™ i7-1165G7)                                            FP32, FP16, INT8, BIN
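
The priority order and the precision table can be combined into a small selection routine. This is a simplified model of the logic described in this section, not the AUTO plugin's actual implementation. `select_device` and the short device labels are hypothetical.

```python
from typing import Dict, List, Set

# Priority order and supported precisions as given in this section
PRIORITY: List[str] = ["dGPU", "iGPU", "VPU", "CPU"]
SUPPORTED: Dict[str, Set[str]] = {
    "dGPU": {"FP32", "FP16", "INT8", "BIN"},
    "iGPU": {"FP32", "FP16", "BIN"},
    "VPU": {"FP16"},
    "CPU": {"FP32", "FP16", "INT8", "BIN"},
}


def select_device(available: Set[str], precision: str) -> str:
    """Return the highest-priority available device that supports the model precision."""
    for device in PRIORITY:
        if device in available and precision in SUPPORTED[device]:
            return device
    raise RuntimeError(f"no available device supports {precision}")
```

For instance, with an iGPU and a CPU present, an FP16 model goes to the iGPU, while an INT8 model falls through to the CPU because the table lists no INT8 support for the iGPU.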

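2) Automatic Batching

Automatic Batching combines multiple asynchronous inference requests issued by the application into a single batched inference request, then splits the batched result and returns each part to the request it belongs to.

When the config argument of compile_model() is set to {"PERFORMANCE_HINT": "THROUGHPUT"}, OpenVINO™ Runtime enables automatic batching on its own.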

    PERFORMANCE_HINT   Use case                                         Auto Batching enabled?
    THROUGHPUT         Non-real-time, high-volume inference workloads   Yes
    LATENCY            Real-time or near-real-time applications         No

    compiled_model = core.compile_model(model="xxx.onnx", device_name="AUTO", \
                                       config={"PERFORMANCE_HINT": "THROUGHPUT", 'ALLOW_AUTO_BATCHING': 'YES'})
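
The combine-then-split behaviour of automatic batching can be sketched in a few lines of plain Python. `run_batched` and `batched_infer` are hypothetical names. The real plugin works on asynchronous requests and also deals with partially filled batches, which this sketch ignores.

```python
from typing import Callable, List


def run_batched(requests: List[int],
                batch_size: int,
                batched_infer: Callable[[List[int]], List[int]]) -> List[int]:
    """Group individual requests into batches, run each batch once, and split the outputs."""
    results: List[int] = []
    for start in range(0, len(requests), batch_size):
        batch = requests[start:start + batch_size]   # combine several requests
        outputs = batched_infer(batch)               # one device call per batch
        results.extend(outputs)                      # hand each output back, in request order
    return results
```

Five requests with a batch size of 2 therefore cost three device calls (2 + 2 + 1) instead of five.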

4. C++ inference example

#include <cstdlib>
#include <iostream>
#include <memory>
#include <sstream>
#include <stdexcept>
#include <string>

// clang-format off
#include "openvino/openvino.hpp"

#include "samples/args_helper.hpp"
#include "samples/common.hpp"
#include "samples/classification_results.h"
#include "samples/slog.hpp"
#include "format_reader_ptr.h"
// clang-format on

/**
 * @brief Main with support Unicode paths, wide strings
 */
int tmain(int argc, tchar* argv[]) {
    try {
        // -------- Step 1. Initialize OpenVINO Runtime Core --------
        ov::Core core;

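        // -------- Step 2. Read a model --------
        // model_path, image_path, and device_name are std::string values, assumed to be parsed from argv beforehand
        std::shared_ptr<ov::Model> model = core.read_model(model_path);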
        printInputAndOutputsInfo(*model);

        // -------- Step 3. Set up input

        // Read input image to a tensor and set it to an infer request
        // without resize and layout conversions
        FormatReader::ReaderPtr reader(image_path.c_str());
        if (reader.get() == nullptr) {
            std::stringstream ss;
            ss << "Image " + image_path + " cannot be read!";
            throw std::logic_error(ss.str());
        }

        ov::element::Type input_type = ov::element::u8;
        ov::Shape input_shape = {1, reader->height(), reader->width(), 3};
        std::shared_ptr<unsigned char> input_data = reader->getData();

        // just wrap image data by ov::Tensor without allocating of new memory
        ov::Tensor input_tensor = ov::Tensor(input_type, input_shape, input_data.get());

        const ov::Layout tensor_layout{"NHWC"};

        // -------- Step 4. Configure preprocessing --------

        ov::preprocess::PrePostProcessor ppp(model);

        // 1) Set input tensor information:
        // - input() provides information about a single model input
        // - reuse precision and shape from already available `input_tensor`
        // - layout of data is 'NHWC'
        ppp.input().tensor().set_shape(input_shape).set_element_type(input_type).set_layout(tensor_layout);
        // 2) Adding explicit preprocessing steps:
        // - convert layout to 'NCHW' (from 'NHWC' specified above at tensor layout)
        // - apply linear resize from tensor spatial dims to model spatial dims
        ppp.input().preprocess().resize(ov::preprocess::ResizeAlgorithm::RESIZE_LINEAR);
        // 4) Here we suppose model has 'NCHW' layout for input
        ppp.input().model().set_layout("NCHW");
        // 5) Set output tensor information:
        // - precision of tensor is supposed to be 'f32'
        ppp.output().tensor().set_element_type(ov::element::f32);

        // 6) Apply preprocessing modifying the original 'model'
        model = ppp.build();

        // -------- Step 5. Loading a model to the device --------
        ov::CompiledModel compiled_model = core.compile_model(model, device_name);

        // -------- Step 6. Create an infer request --------
        ov::InferRequest infer_request = compiled_model.create_infer_request();
        // -----------------------------------------------------------------------------------------------------

        // -------- Step 7. Prepare input --------
        infer_request.set_input_tensor(input_tensor);

        // -------- Step 8. Do inference synchronously --------
        infer_request.infer();

        // -------- Step 9. Process output
        const ov::Tensor& output_tensor = infer_request.get_output_tensor();

        // Print classification results
        ClassificationResult classification_result(output_tensor, {image_path});
        classification_result.show();
        // -----------------------------------------------------------------------------------------------------
    } catch (const std::exception& ex) {
        std::cerr << ex.what() << std::endl;
        return EXIT_FAILURE;
    }

    return EXIT_SUCCESS;
}

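Inference modes

  • Automatic device selection (AUTO): detects the available devices, selects the one best suited to the task, and configures its optimization settings. This lets you write an application once and deploy it anywhere.

    • It starts inference on the CPU, meanwhile loads the model onto the device best suited to the task, and hands the work over to that device once it is ready.

      • Starting on the CPU reduces first-inference latency.
  • Multi-device execution (MULTI)

  • Heterogeneous execution (HETERO): runs inference of a single model across multiple devices.

  • Automatic batching (Auto-batching): improves device utilization by grouping inference requests together, with no extra programming effort from the user.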
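
As an illustration of heterogeneous execution, the sketch below places each operation on the first device in a priority list that supports it, so unsupported operations fall back to a later device. This is a simplified model: `assign_ops` and the operation sets are made up for the example, and the real plugin queries each device for its supported operations and splits the model into subgraphs.

```python
from typing import Dict, List, Set


def assign_ops(ops: List[str],
               priority: List[str],
               supported: Dict[str, Set[str]]) -> Dict[str, str]:
    """Place each op on the first device (in priority order) that supports it."""
    placement: Dict[str, str] = {}
    for op in ops:
        for device in priority:
            if op in supported[device]:
                placement[op] = device
                break
        else:
            # No device in the priority list can run this op
            raise RuntimeError(f"no device supports {op}")
    return placement
```

With a priority of `["GPU", "CPU"]`, operations the GPU can run stay on the GPU and the rest are placed on the CPU.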

