Reproducing swin_transformer on Huawei Ascend 910

I. References

Reproducing the ICCV 2021 Best Paper Swin Transformer with MindSpore

II. Runtime environment

Hardware platform

[ma-user swin_transformer]$export -p
declare -x AICPU_FLAG="1"
declare -x ANACONDA_DIR="/home/ma-user/anaconda3"
declare -x ASCEND_AICPU_PATH="/usr/local/Ascend/nnae/latest/"
declare -x ASCEND_HOME="/usr/local/Ascend"
declare -x ASCEND_OPP_PATH="/usr/local/Ascend/nnae/latest/opp"
declare -x CONDA3_DIR="/opt/conda/bin"
declare -x CONDA_DIR="/opt/conda"
declare -x CPU_LIMIT="24.0"
declare -x CREDENTIAL_PROFILES_FILE="/etc/secret-volume/credentials"
declare -x FWK_PYTHON_PATH="/usr/local/Ascend/nnae/latest/fwkacllib/python/site-packages"
declare -x GATEWAY_IMAGE_ID=""
declare -x HOME="/home/ma-user"
declare -x JUPYTER_SERVER_ROOT="/home/ma-user/work"
declare -x LD_LIBRARY_PATH="/usr/lib/aarch64-linux-gnu/hdf5/serial:/usr/local/Ascend/nnae/latest/fwkacllib/lib64:/usr/local/Ascend/driver/lib64/common/:/usr/local/Ascend/driver/lib64/driver:/usr/lib64"
declare -x LESSOPEN="||/usr/bin/lesspipe.sh %s"
declare -x LINES="40"
declare -x LOGNAME="ma-user"
declare -x ME_TBE_PLUGIN_PATH="/usr/local/Ascend/opp/framework/built-in/tensorflow/"
declare -x MINICONDA_DIR="/opt/conda"
declare -x MODELARTS_SDK_VERSION="1.1.5"
declare -x MS_BUILD_PROCESS_NUM="20"
declare -x MindsporeEnv="MindSpore"
declare -x NOTEBOOK_IMAGE_NAME="notebook2.0-mul-kernel-arm-ascend-cp37"
declare -x NOTEBOOK_IMAGE_VERSION="3.3.3-c79-20220121"
declare -x OLDPWD="/home/ma-user"
declare -x OPTION_EXEC_EXTERN_PLUGIN_PATH="/usr/local/Ascend/fwkacllib/lib64/plugin/opskernel/libfe.so:/usr/local/Ascend/fwkacllib/lib64/plugin/opskernel/libaicpu_plugin.so:/usr/local/Ascend/fwkacllib/lib64/plugin/opskernel/librts_engine.so:/usr/local/Ascend/fwkacllib/lib64/plugin/opskernel/libge_local_engine.so"
declare -x PATH="/home/ma-user/.local/bin:/home/ma-user/bin:/usr/local/Ascend/nnae/latest/fwkacllib/ccec_compiler/bin:/usr/local/Ascend/nnae/latest/fwkacllib/bin:/home/ma-user/anaconda3/envs/MindSpore/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home/ma-user/modelarts/ma-cli/bin"
declare -x PWD="/home/ma-user/work/swin_transformer"
declare -x PYTHONPATH="/usr/local/Ascend/nnae/latest/fwkacllib/python/site-packages:/usr/local/Ascend/nnae/latest/fwkacllib/python/site-packages/auto_tune.egg:/usr/local/Ascend/nnae/latest/fwkacllib/python/site-packages/schedule_search.egg:/usr/local/Ascend/tfplugin/latest/tfplugin/python/site-packages:/usr/local/Ascend/nnae/latest/opp/op_impl/built-in/ai_core/tbe"
declare -x RANK_TABLE_FILE="/user/config/nbstart_hccl.json"
declare -x REGION_NAME="cn-central-231"
declare -x SHELL="/bin/bash"
declare -x SOC_VERSION="Ascend910A"
declare -x TBE_IMPL_PATH="/usr/local/Ascend/nnae/latest/opp/op_impl/built-in/ai_core/tbe"
declare -x TERM="xterm"
declare -x TF_PLUGIN_PKG="/usr/local/Ascend/tfplugin/latest/tfplugin/python/site-packages"
declare -x TensorflowEnv="TensorFlow-1.15.0"
declare -x USER="ma-user"
declare -x USERNAME="ma-user"
declare -x XDG_CACHE_HOME="/home/ma-user/.cache"
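
The dump above is long, and only a few variables (for example `SOC_VERSION` and the Ascend paths) matter for this reproduction. As a minimal sketch, the output of `export -p` can be parsed in Python to pull out individual settings. The helper name `parse_export_dump` and the short sample are illustrative and not part of the original setup.

```python
import shlex


def parse_export_dump(text):
    """Parse `export -p` output (lines like: declare -x NAME="value") into a dict."""
    env = {}
    prefix = "declare -x "
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith(prefix):
            continue
        # shlex removes the shell quoting around the value
        tokens = shlex.split(line[len(prefix):])
        if not tokens:
            continue
        name, sep, value = tokens[0].partition("=")
        # Variables exported without a value have no "=" at all
        env[name] = value if sep else None
    return env


# Illustrative excerpt of the dump; in practice pass the full output
sample = '''declare -x ASCEND_HOME="/usr/local/Ascend"
declare -x SOC_VERSION="Ascend910A"
declare -x GATEWAY_IMAGE_ID=""
'''
env = parse_export_dump(sample)
print(env["SOC_VERSION"])
```

Keeping the parsed values in a dict makes it easy to compare two notebook images when a script runs on one but not the other.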

III. Background

1. Related tutorials

modelzoo.wiki

2. Atlas hardware product compatibility lookup

Computing Product Compatibility Checker

IV. Key steps

swin_transformer Gitee repository

ModelZoo: Swin-Transformer

The author's code: https://gitee.com/lljyoyo1995/swin_transformer

1. Resource usage

CPU usage
[Figure 1: CPU usage]

Resource usage
[Figure 2: resource usage]

NPU usage
[Figure 3: NPU usage]

NPU power
[Figure 4: NPU power]

V. FAQ

Q: API errors caused by a MindSpore version mismatch

[ERROR] MD(67033,fffea5ffb1e0,python):2022-07-20-17:51:48.714.669 [mindspore/ccsrc/minddata/dataset/util/task.cc:67] operator()] Task: MapOp(ID:3) - thread(281469171773920) is terminated with err msg: Unexpected error. map operation: [Decode] failed. Decode: invalid input shape, only support 1D input, got rank: 3
Line of code : 42
File         : /home/jenkins/agent-working-dir/workspace/Compile_Ascend_ARM_CentOS@2/mindspore/mindspore/ccsrc/minddata/dataset/kernels/image/decode_op.cc

    self.saved_output_shapes = runtime_getter[0].GetOutputShapes()
RuntimeError: Unexpected error. map operation: [Decode] failed. Decode: invalid input shape, only support 1D input, got rank: 3
Line of code : 42
File         : /home/jenkins/agent-working-dir/workspace/Compile_Ascend_ARM_CentOS@2/mindspore/mindspore/ccsrc/minddata/dataset/kernels/image/decode_op.cc
Cause:
After a MindSpore version update, some interfaces changed, so code written against one version fails on another.

Solution:
1. Change the imports
import mindspore.dataset.vision as vision
to
import mindspore.dataset.vision.c_transforms as vision
import mindspore.dataset.vision.py_transforms as pvision

2. Change the ToPIL() call
vision.ToPIL()
to
pvision.ToPIL()
# Comment out Decode()
# vision.Decode()
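
Before editing the imports, it can help to check which of the modules named above exist in the installed MindSpore. The sketch below only uses the standard library. `module_available` and `report` are hypothetical helper names, and the candidate list simply repeats the module paths from this FAQ.

```python
import importlib.util


def module_available(name):
    """Return True if the module `name` can be located in this environment."""
    try:
        # find_spec imports parent packages, but not the target module itself
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package is missing or has no usable spec
        return False


# Module paths taken from the import lines in this FAQ
CANDIDATES = [
    "mindspore.dataset.vision.c_transforms",
    "mindspore.dataset.vision.py_transforms",
    "mindspore.dataset.transforms.c_transforms",
]


def report(names=CANDIDATES):
    """Map each module path to whether it can be found."""
    return {name: module_available(name) for name in names}


for name, found in report().items():
    print(f"{name}: {'found' if found else 'missing'}")
```

If the `c_transforms` and `py_transforms` modules are reported as found, the import changes described above should at least resolve. Whether they fix the error still depends on the rest of the pipeline.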
[ERROR] MD(71480,fffe5f7fe1e0,python):2022-07-20-18:01:47.677.517 [mindspore/ccsrc/minddata/dataset/util/task.cc:67] operator()] Task: MapOp(ID:3) - thread(281467988992480) is terminated with err msg: Unexpected error. map operation: [Normalize] failed. Normalize: number of channels does not match the size of mean and std vectors, got channels: 224, size of mean:3
Line of code : 810
File         : /home/jenkins/agent-working-dir/workspace/Compile_Ascend_ARM_CentOS@2/mindspore/mindspore/ccsrc/minddata/dataset/kernels/image/image_utils.cc
Cause:
The Normalize() interface is mismatched.
MindSpore is updated frequently, and inconsistencies between the old and new interfaces keep the source code from running.

Solution:
1. Change the imports
import mindspore.dataset.transforms as C
import mindspore.dataset.vision as vision
to
import mindspore.dataset.transforms.c_transforms as C
import mindspore.dataset.vision.c_transforms as vision
import mindspore.dataset.vision.py_transforms as pvision

2. Change the Normalize() call
vision.Normalize(mean=mean, std=std)
to
pvision.Normalize(mean=mean, std=std)
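
The message `got channels: 224, size of mean:3` is consistent with a layout mismatch. An operation that reads the last axis as the channel axis (HWC) would see 224 channels if it were handed a CHW image of shape (3, 224, 224). This is an interpretation of the log, not something the original post confirms. The pure-Python sketch below models that check. `channels_seen` and `check_normalize` are hypothetical helpers rather than MindSpore APIs, and the mean/std values are illustrative because the original post does not show them.

```python
def channels_seen(shape, layout):
    """Return the channel count an operation would read from a 3-D image shape."""
    if len(shape) != 3:
        raise ValueError(f"expected a 3-D image shape, got rank {len(shape)}")
    if layout == "HWC":
        return shape[2]
    if layout == "CHW":
        return shape[0]
    raise ValueError(f"unknown layout: {layout!r}")


def check_normalize(shape, mean, std, layout):
    """Model the channel/mean/std size check; return an error string or None."""
    channels = channels_seen(shape, layout)
    if not (channels == len(mean) == len(std)):
        return f"got channels: {channels}, size of mean: {len(mean)}"
    return None


# Illustrative values; the original post only writes mean=mean, std=std
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]
chw_image_shape = (3, 224, 224)

# Reading a CHW image as HWC reproduces the numbers from the log
print(check_normalize(chw_image_shape, mean, std, layout="HWC"))
# Reading it as CHW passes the check
print(check_normalize(chw_image_shape, mean, std, layout="CHW"))
```

Under this reading, switching to `pvision.Normalize` works because it expects the CHW layout the image already has at that point in the pipeline.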
