Source: IMX_LINUX_USERS_GUIDE
The NXP® eIQ™ for i.MX toolkit provides a set of libraries and development tools for machine learning applications
targeting NXP microcontrollers and application processors. The toolkit is contained in the meta-imx/meta-ml layer.
For details about Machine Learning Security, see Security for Machine Learning Package (AN12867).
Image classification demo
This demo performs image classification using a pretrained SqueezeNet network. Demo dependencies are taken from:
../opencv_extra-4.2.0/testdata/dnn
YOLO object detection example
The YOLO object detection demo performs object detection using the You Only Look Once (YOLO) detector. It detects objects
on camera, video, or image input. Find more information about this demo on the OpenCV YOLO DNNs page. Demo dependencies
are taken from:
../opencv_extra-4.2.0/testdata/dnn
Image segmentation demo
Image segmentation means dividing an image into groups of pixels based on some criterion, such as color, texture, or
another property. Demo dependencies are taken from:
../opencv_extra-4.2.0/testdata/dnn
Image colorization demo
This sample demonstrates recoloring grayscale images with a DNN. The demo supports input images only, not live camera
input. Demo dependencies are taken from:
../opencv_extra-4.2.0/testdata/dnn
Human pose detection demo
This application demonstrates human or hand pose detection with a pretrained OpenPose DNN. The demo supports input
images only, not live camera input. Demo dependencies are taken from:
../opencv_extra-4.2.0/testdata/dnn
Object Detection Example
This demo performs object detection using a pretrained SqueezeDet network. The demo supports input images only, not
live camera input. Demo dependencies are the following:
• SqueezeDet.caffemodel model weight file
• SqueezeDet_deploy.prototxt model definition file
• Input image aeroplane.jpg
Running the C++ example with image input from the default location:
./example_dnn_objdetect_obj_detect SqueezeDet_deploy.prototxt SqueezeDet.caffemodel
aeroplane.jpg
CNN image classification example
This demo performs image classification using a pretrained SqueezeNet network. The demo supports input images only, not
live camera input. Demo dependencies are the following (a minimal Python sketch of the classification flow is shown after the list):
• SqueezeNet.caffemodel model weight file
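A minimal Python sketch of the classification flow using the OpenCV DNN module is shown below. The deploy prototxt name, input image, input size, and mean values are assumptions for illustration only; check the model's actual deploy file before use.

import cv2
import numpy as np

# Load the pretrained SqueezeNet (the .prototxt file name is an assumption)
net = cv2.dnn.readNetFromCaffe("SqueezeNet_deploy.prototxt", "SqueezeNet.caffemodel")

# Prepare the input blob; size and mean are placeholders and must match the deploy file
img = cv2.imread("input.jpg")
blob = cv2.dnn.blobFromImage(img, 1.0, (227, 227), (104, 117, 123))

net.setInput(blob)
scores = net.forward()              # 1 x num_classes score vector
class_id = int(np.argmax(scores))
print("predicted class id:", class_id)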
Text detection
This demo performs text detection in an image using the EAST algorithm. Demo dependencies are taken from
../opencv_extra-4.2.0/testdata/dnn:
• frozen_east_text_detection.pb
Other demo dependencies are imageTextN.png from
/usr/share/OpenCV/samples/data
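A minimal Python sketch of the detection flow, assuming the EAST model file above and the sample image path; only the raw score map is thresholded here, while full rotated-box decoding follows OpenCV's text_detection sample.

import cv2
import numpy as np

net = cv2.dnn.readNet("frozen_east_text_detection.pb")
img = cv2.imread("/usr/share/OpenCV/samples/data/imageTextN.png")

# EAST expects an input whose width and height are multiples of 32
blob = cv2.dnn.blobFromImage(img, 1.0, (320, 320), (123.68, 116.78, 103.94), swapRB=True, crop=False)
net.setInput(blob)
scores, geometry = net.forward(["feature_fusion/Conv_7/Sigmoid", "feature_fusion/concat_3"])

# Count score-map cells that likely contain text (full box decoding omitted)
conf_map = scores[0, 0]
print("candidate text cells:", int((conf_map > 0.5).sum()))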
SVM Introduction
This example demonstrates how to create and train an SVM model using training data. Once the model is trained, labels for
test data are predicted. The full description of the example can be found in the OpenCV tutorial (tutorial_introduction_to_svm).
For displaying the result, a Linux image built with Qt5 enabled is required.
After running the demo, the graphics result is shown on the screen:
./example_tutorial_introduction_to_svm
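A minimal Python sketch of the same idea using the OpenCV ml module; the training points and labels mirror the ones used in the tutorial_introduction_to_svm example.

import cv2
import numpy as np

# Four 2-D training points with their class labels (as in the OpenCV SVM tutorial)
train = np.array([[501, 10], [255, 10], [501, 255], [10, 501]], dtype=np.float32)
labels = np.array([1, -1, -1, -1], dtype=np.int32)

svm = cv2.ml.SVM_create()
svm.setType(cv2.ml.SVM_C_SVC)
svm.setKernel(cv2.ml.SVM_LINEAR)
svm.setTermCriteria((cv2.TERM_CRITERIA_MAX_ITER, 100, 1e-6))
svm.train(train, cv2.ml.ROW_SAMPLE, labels)

# Predict the label of a new point
_, prediction = svm.predict(np.array([[300, 200]], dtype=np.float32))
print("predicted label:", int(prediction[0, 0]))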
Principal Component Analysis (PCA) introduction
Principal Component Analysis (PCA) is a statistical method that extracts the most important features of a dataset. In this
tutorial you will learn how to use PCA to calculate the orientation of an object. For more details, check the OpenCV tutorial
Introduction_to_PCA.
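A minimal Python sketch of the orientation calculation, assuming the object is available as a set of contour points; the OpenCV tutorial itself uses cv::PCA on the same kind of data, while plain NumPy is used here for brevity and the points below are placeholders.

import numpy as np

# pts: N x 2 array of (x, y) contour points of the object (placeholder data)
pts = np.array([[10, 12], [40, 30], [70, 52], [100, 71], [130, 90]], dtype=np.float64)

mean = pts.mean(axis=0)
cov = np.cov((pts - mean).T)
eigvals, eigvecs = np.linalg.eigh(cov)

# The eigenvector with the largest eigenvalue gives the object's principal axis
major_axis = eigvecs[:, np.argmax(eigvals)]
angle = np.degrees(np.arctan2(major_axis[1], major_axis[0]))
print("orientation: %.1f degrees" % angle)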
Logistic regression
In this sample, logistic regression is used to predict one of two characters (0 or 1) from an image. First, every image matrix is
reshaped from its original size of 28x28 to 1x784. A logistic regression model is created and trained on 20 images. After
training, the model can predict labels of test images. The source code is available at the logistic_regression link and can be
run by typing the following command.
Demo dependencies (preparing the train data files):
wget https://raw.githubusercontent.com/opencv/opencv/4.2.0/samples/data/data01.xml
After running the demo, the graphics result is shown on the screen (it requires Qt5 support):
./example_cpp_logistic_regression
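A minimal Python sketch of the same flow using the OpenCV ml module, assuming data01.xml stores the samples and labels under the node names used by the C++ sample ("datamat"/"labelsmat"); verify the node names against the file before use.

import cv2
import numpy as np

# Load the 1x784 training vectors and labels (node names assumed from the C++ sample)
fs = cv2.FileStorage("data01.xml", cv2.FILE_STORAGE_READ)
data = fs.getNode("datamat").mat().astype(np.float32)
labels = fs.getNode("labelsmat").mat().astype(np.float32)
fs.release()

lr = cv2.ml.LogisticRegression_create()
lr.setLearningRate(0.001)
lr.setIterations(10)
lr.setRegularization(cv2.ml.LogisticRegression_REG_L2)
lr.setTrainMethod(cv2.ml.LogisticRegression_BATCH)
lr.train(data, cv2.ml.ROW_SAMPLE, labels)

_, predicted = lr.predict(data)
print("training accuracy: %.1f%%" % (100.0 * np.mean(predicted.ravel() == labels.ravel())))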
The Arm Compute Library is a collection of low-level functions optimized for Arm CPU and GPU architectures, targeted at image
processing, computer vision, and machine learning.
TensorFlow Lite is a lightweight version of, and the next step from, TensorFlow. TensorFlow Lite is an open-source software
library focused on running machine learning models on mobile and embedded devices (available at www.tensorflow.org/lite).
It enables on-device machine learning inference with low latency and small binary size. TensorFlow Lite also supports
hardware acceleration using the Android Neural Networks API. A minimal Python inference sketch is shown after the feature list below.
Features:
• Multithreaded computation with acceleration using Arm Neon SIMD instructions on Cortex-A cores
• Parallel computation using GPU/ML hardware acceleration (on shader or convolution units)
• C++ and Python APIs (supported Python version is 3)
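A minimal Python inference sketch, assuming the mobilenet_v1_1.0_224_quant.tflite model used by the label_image example later in this document; on a given image the interpreter may come from tflite_runtime or from the full TensorFlow package.

import numpy as np
from tflite_runtime.interpreter import Interpreter  # alternatively tf.lite.Interpreter

interpreter = Interpreter(model_path="mobilenet_v1_1.0_224_quant.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# The quantized model expects a uint8 tensor of shape inp["shape"] (1 x 224 x 224 x 3);
# a zero image is used here as a placeholder for a real preprocessed picture
image = np.zeros(inp["shape"], dtype=np.uint8)
interpreter.set_tensor(inp["index"], image)
interpreter.invoke()
scores = interpreter.get_tensor(out["index"])[0]
print("top class:", int(np.argmax(scores)))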
Arm NN is an open-source inference engine framework developed by the Linaro Artificial Intelligence Initiative, of which NXP is
a part. It supports a wide range of neural-network model formats, such as Caffe, TensorFlow, TensorFlow Lite, and
ONNX. For i.MX8, Arm NN can run on the CPU, accelerated using Arm Neon (the SIMD architecture extension for Arm
Cortex-A/R processors), and on GPUs/NPUs, accelerated using the VSI NPU backend distributed exclusively as a component
of NXP® eIQ™. For more details about Arm NN, check the Arm NN SDK webpage.
Source code for developing a custom application or building Arm NN is available at https://source.codeaurora.org/external/imx/armnn-imx.
Arm NN SDK provides the following set of tests for Caffe models:
/usr/bin/CaffeAlexNet-Armnn
/usr/bin/CaffeCifar10AcrossChannels-Armnn
/usr/bin/CaffeInception_BN-Armnn
/usr/bin/CaffeMnist-Armnn
/usr/bin/CaffeResNet-Armnn
/usr/bin/CaffeVGG-Armnn
/usr/bin/CaffeYolo-Armnn
Two important limitations might require preprocessing of the Caffe model file prior to running an Arm NN Caffe test. First,
Arm NN tests require batch size to be set to 1. Second, Arm NN does not support all Caffe syntaxes, therefore some older
neural network model files will require updates to the latest Caffe syntax.
Details about how to perform these preprocessing steps are described on the Arm NN GitHub page. Install Caffe on the host,
and also check the Arm NN documentation for Caffe support.
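A minimal host-side Python sketch (assuming pycaffe's protobuf definitions are installed) that forces the batch dimension of a deploy prototxt to 1 before running an Arm NN Caffe test. It assumes the network declares its input via an input_shape or legacy input_dim block; models using an Input layer need the equivalent field changed, and Caffe syntax upgrades are still done with the Caffe upgrade tools as described on the Arm NN GitHub page.

from caffe.proto import caffe_pb2
from google.protobuf import text_format

net_param = caffe_pb2.NetParameter()
with open("deploy.prototxt") as f:
    text_format.Merge(f.read(), net_param)

if len(net_param.input_shape) > 0:
    net_param.input_shape[0].dim[0] = 1   # new-style input_shape { dim: ... }
elif len(net_param.input_dim) > 0:
    net_param.input_dim[0] = 1            # legacy input_dim syntax

with open("deploy_batch1.prototxt", "w") as f:
    f.write(text_format.MessageToString(net_param))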
ONNX Runtime version 1.1.2 with NXP improvements supports both the ArmNN and the default CPU execution providers with
optimization level 2. In addition, the ACL execution provider and optimization level 99 are provided as a preview for a subset of
models (mobilenet v2, resnet50 v1 and v2). A minimal Python usage sketch is shown after the note below.
NOTE
For the full list of the CPU supported operators, see the 'operator kernels' documentation
section: https://source.codeaurora.org/external/imx/onnxruntime-imx/tree/docs/OperatorKernels.md
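A minimal Python sketch of running a model through ONNX Runtime. The mapping of "optimization level 2" to ORT_ENABLE_EXTENDED and the model file name are assumptions, and the execution provider actually used depends on how this onnxruntime build was configured (default CPU, ArmNN, or ACL).

import numpy as np
import onnxruntime as ort

opts = ort.SessionOptions()
opts.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_EXTENDED  # assumed to match "level 2"

sess = ort.InferenceSession("mobilenetv2.onnx", sess_options=opts)  # model file name is a placeholder
inp = sess.get_inputs()[0]

x = np.random.rand(1, 3, 224, 224).astype(np.float32)  # dummy NCHW input
outputs = sess.run(None, {inp.name: x})
print("output shape:", outputs[0].shape)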
This section describes the steps to enable the profiler and capture logs.
1. Stop the EVK board in U-Boot by pressing Enter.
2. Update mmcargs by adding galcore.showArgs=1 and galcore.gpuProfiler=1.
u-boot=> editenv mmcargs
edit: setenv bootargs ${jh_clk} console=${console} root=${mmcroot} galcore.showArgs=1
galcore.gpuProfiler=1
u-boot=> boot
3. Boot the board and wait for the Linux OS prompt.
4. The following environment flags should be enabled before executing the application. The VIV_VX_DEBUG_LEVEL and
VIV_VX_PROFILE flags should always be 1 during profiling. The CNN_PERF flag enables the driver's ability to generate a
per-layer profile log. NN_EXT_SHOW_PERF shows the details of how the compiler estimates performance and determines
tiling based on it.
export CNN_PERF=1 NN_EXT_SHOW_PERF=1 VIV_VX_DEBUG_LEVEL=1 VIV_VX_PROFILE=1
5. Capture the profiler log. The sample ML examples that are part of the standard NXP Linux release are used to explain the
following sections.
• TensorFlow Lite profiling
Run the TFLite application with the NPU backend as follows:
cd /usr/bin/tensorflow-lite-2.1.0/examples
./label_image -m mobilenet_v1_1.0_224_quant.tflite -t 1 -i grace_hopper.bmp -l
labels.txt -a 1 -v 0 > viv_test_app_profile.log 2>&1
• Armnn profiling
Run the ArmNN application (here TfMobileNet-Armnn is taken as an example) with the NPU backend as follows:
/usr/bin/TfMobileNet-Armnn --data-dir=data --model-dir=models --compute=VsiNpu >
viv_test_app_profile.log 2>&1
The log captures detailed information about the execution clock cycles and DDR data transmission in each layer.
Key features
STM32Cube.AI is fully integrated into the STM32 software development ecosystem as an extension of the widely used STM32CubeMX tool.
It allows fast, automatic conversion of pre-trained ANNs into optimized code that can run on an MCU. The tool guides users through the selection of the right MCU and provides rapid feedback on the performance of the Neural Network on the chosen MCU, with validation running both on your PC and the target STM32 MCU. Check out the Getting Started video.
X-CUBE-AI is an STM32Cube Expansion Package, part of the STM32Cube.AI ecosystem, extending STM32CubeMX capabilities with automatic conversion of pre-trained Neural Networks and integration of the generated optimized library into the user's project. The easiest way to use it is to download it inside the STM32CubeMX tool (version 5.0.1 or newer) as described in the user manual Getting started with X-CUBE-AI Expansion Package for Artificial Intelligence (AI) (UM2526).
Linux: to be continued.
The first example is handwritten character recognition: the A7 controls the M4, which runs the Cube.AI-generated model (Keras model model-ABC123-112.h5).
/**
*************************************************************************************************
* @file readme.txt
* @author MCD Application Team
* @brief Description of the Artificial Intelligence Hand Writing Character Recognition example.
*************************************************************************************************
*
* Copyright (c) 2019 STMicroelectronics. All rights reserved.
*
* This software component is licensed by ST under BSD 3-Clause license,
* the "License"; You may not use this file except in compliance with the
* License. You may obtain a copy of the License at:
* opensource.org/licenses/BSD-3-Clause
*
*************************************************************************************************
*/
This project demonstrates a complex application that runs on both CPU1(CA7) and CPU2(CM4).
The application is a launcher that recognizes handwritten characters drawn on the touch screen in order
to execute specific actions.
CPU1 (CA7) controls the touch events and the Graphical User Interface.
CPU2 (CM4) is used to offload the processing of a Cube.AI pre-built Neural Network.
The communication between CPU1(CA7) and CPU2(CM4) is done through a Virtual UART to create an
Inter-Processor Communication channel seen as a TTY device in Linux.
The implementation is based on:
* RPMSG framework on CPU1(CA7) side
* and OpenAMP MW on the CPU2(CM4) side
OpenAMP MW uses the following HW resources
* IPCC peripheral for event signal (mailbox) between CPU1(CA7) and CPU2(CM4)
* MCUSRAM peripheral for buffer communications (virtio buffers) between CPU1(CA7) and CPU2(CM4)
Reserved shared memory region for this example: SHM_ADDR=0x10040000 and SHM_SIZE=128k.
It is defined in the platform_info.c file.
A communication protocol has been defined between CPU1(CA7) and CPU2(CM4).
The data frames exchanged have the following structure (a minimal Python packing sketch is shown after the message lists below):
----------------------------------------------------------------
| msg ID | data Length | data Byte 1 | ... | data Byte n | CRC |
----------------------------------------------------------------
- 3 types of messages can be received by CPU2(CM4):
* Set the Neural Network input type (0x20, 0x01, data, CRC)
* data = 0 => NN input is letter or digit
* data = 1 => NN input is letter only
* data = 2 => NN input is digit only
* Provide the touch screen coordinates (0x20, n, data_x1, data_y1, ... , data_xn, data_yn, CRC)
* n => the number of coordinate points
* data_xn => x coordinate of the point n
* data_yn => y coordinate of the point n
* Start AI NN processing (0x22, 0x00, CRC)
- 4 types of acknowledge can be received on the CPU1(CA7) side:
* Bad acknowledge (0xFF, 0x00, CRC)
* Good acknowledge (0xF0, 0x00, CRC)
* Touch screen acknowledge (0xF0, 0x01, n, CRC)
* n => number of screen coordinate points acknowledged
* AI processing result acknowledge (0xF0, 0x04, char, accuracy, time_1, time_2, CRC)
* char => this is the recognized letter (or digit)
* accuracy => this is the confidence expressed in percentage
* time_1 => upper Bytes of the time (word) expressed in ms
* time_2 => lower Bytes of the time (word) expressed in ms
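A minimal Python sketch of how a frame with this structure could be packed on the CPU1(CA7) side before writing it to the virtual TTY. The CRC algorithm is not specified in this readme, so a simple XOR checksum is used here purely as a placeholder.

def checksum(body):
    # Placeholder for the real CRC: XOR of all preceding bytes (assumption)
    c = 0
    for b in body:
        c ^= b
    return c

def build_frame(msg_id, data):
    # | msg ID | data Length | data bytes ... | CRC |
    body = bytes([msg_id, len(data)]) + bytes(data)
    return body + bytes([checksum(body)])

# Set the NN input type to "letter only" (0x20, 0x01, data=1, CRC)
set_type = build_frame(0x20, [1])
# Start the AI NN processing (0x22, 0x00, CRC)
start_nn = build_frame(0x22, [])
print(set_type.hex(), start_nn.hex())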
On CPU2(CM4) side:
- CPU2(CM4) initializes the OpenAMP MW, which initializes/configures the IPCC peripheral through the HAL
and sets up the OpenAMP/rpmsg framework infrastructure
- CPU2(CM4) creates 1 rpmsg channel for 1 virtual UART instance, UART0
- CPU2(CM4) initializes the Character Recognition Neural Network
- CPU2(CM4) waits for messages from CPU1(CA7) on this channel
- When CPU2(CM4) receives a message on the Virtual UART instance/rpmsg channel, it processes the message
to execute the associated action:
* set the NN input type to the desired value
* or register the touch event coordinates to generate the picture that will be processed by the NN
* or start the NN processing and wait for the results
- For every action above, CPU2(CM4) sends back to CPU1(CA7) an acknowledge as defined
above.
On CPU1(CA7) side:
- CPU1(CA7) opens the input event device to register the touch events generated by the user's finger drawing
- CPU1(CA7) configures the input type (Letter only) of the Neural Network running on CPU2(CM4) by
sending a message through the virtual TTY communication channel
- when the drawing is finished, CPU1(CA7) processes the touch event data and sends it to CPU2(CM4)
- CPU1(CA7) starts the Neural Network processing, waits for the result, and displays the recognized character on
the display
Some information about the Character Recognition Neural Network:
The Character Recognition Neural Network used is a Keras model processed by Cube.AI to generate the executable
that can be run on the CPU2(CM4).
The Keras model used is located in the root directory of this project:
model-ABC123-112.h5
This model has been used in Cube.AI to generate the Neural Network binary.
The model accepts as input a 28x28 picture encoded as float in black and white (black = 0.0, white = 1.0).
The output layer of the Neural Network contains 36 neurons (A -> Z and 0 -> 9).
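A minimal Python sketch of how the 36-neuron output could be decoded into the recognized character and a confidence percentage. The class ordering (A to Z, then 0 to 9) follows the description above, and applying a softmax here is an assumption about the output activation.

import numpy as np

CLASSES = [chr(c) for c in range(ord('A'), ord('Z') + 1)] + [str(d) for d in range(10)]

def decode(output):
    # output: length-36 vector of scores from the Neural Network
    probs = np.exp(output - np.max(output))
    probs /= probs.sum()
    idx = int(np.argmax(probs))
    return CLASSES[idx], 100.0 * float(probs[idx])  # recognized character, confidence in %

char, accuracy = decode(np.random.rand(36))  # dummy scores for illustration
print(char, "%.1f%%" % accuracy)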
Notes:
- A Linux console is required to run the application.
- CM4 logging is redirected to shared memory in MCUSRAM and can be displayed using the following command:
cat /sys/kernel/debug/remoteproc/remoteproc0/trace0
The following command should be run in the Linux console on the CA7 to run the example:
> /usr/local/demo/bin/ai_char_reco_launcher /usr/local/demo/bin/apps_launcher_example.sh
You are then ready to draw letters on the touch screen.
Hardware and Software environment:
- This example runs on STM32MP157CACx devices.
- This example has been tested with the STM32MP157C-DK2 and STM32MP157C-EVAL boards and can be
easily tailored to any other supported device and development board.
Where to find the M4 firmware source code:
The M4 firmware source code is delivered as a demonstration inside the STM32CubeMP1 package.
For the DK2 board:
/Firmware/Projects/STM32MP157C-DK2/Demonstrations/AI_Character_Recognition
For the EV1 board:
/Firmware/Projects/STM32MP157C-EV1/Demonstrations/AI_Character_Recognition
https://www.st.com/content/st_com/zh/products/embedded-software/mcu-mpu-embedded-software/stm32-embedded-software/stm32-mpu-openstlinux-expansion-packages/x-linux-ai.html
This version has been validated against the OpenSTLinux ecosystem release v2.0.0 and validated on STM32MP157x-DKx and STM32MP157x-EV1 boards.
https://wiki.dh-electronics.com/index.php/Avenger96#Downloads
Description: Expansion Package that targets artificial intelligence for STM32MP1 Series devices.
This page lists all the X-LINUX-AI application samples.
https://github.com/STMicroelectronics/meta-st-stm32mpu-ai
OpenEmbedded meta layer to install AI frameworks and tools for the STM32MP1. It also provides application samples.
This version has been validated against the OpenSTLinux ecosystem release v2.0.0 and validated on STM32MP157x-DKx and STM32MP157x-EV1 boards.
X-LINUX-AI v2.0.0 expansion package:
https://wiki.st.com/stm32mpu/wiki/X-LINUX-AI_OpenSTLinux_Expansion_Package
https://wiki.st.com/stm32mpu/wiki/X-LINUX-AI_application_samples_zoo
100ask:
http://wiki.100ask.org/index.php?title=STM32MP1_artificial_intelligence_expansio&variant=zh-mo
Pangu (盘古):
The artificial intelligence expansion package contains Linux AI frameworks and supports AI application samples that can run on STM32MP1 series devices.
It is delivered through an OpenEmbedded layer named meta-st-stm32mpu-ai, which brings a complete, consistent, and easy-to-build/install environment for exploiting AI on the STM32MP1 series.
It contains the frameworks, tools, and applications for running the AI samples. Different image routines are provided for different use cases, such as computer vision (CV).
The prepared SD card can be used to boot and run the Weston system; all data is stored on the SD card.
Insert the microSD card into a card reader and connect it to a USB port of a PC running Linux. Use the dd command to write the Weston system image to the microSD card; /dev/sdb is the corresponding microSD card device (device names can differ, so choose carefully).
After writing completes, insert the microSD card into the SD card slot (J7) on the board and set the boot switches to SD card boot. Connect the serial cable and power supply, and the Weston system boot messages will appear.
The AI demo launcher is a derived GTK launcher application.
It is written in Python 3 and uses GTK for the user interface. It makes it easy to launch the AI application samples.
A single click on the touch screen, or with a mouse connected to the board, is enough to launch an AI application.