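Problem description:
【Functional module】
On Ascend310, the sample does not run after installing MindSpore 1.1.1.
【Steps & symptoms】
My hardware and software platform: Ascend310, with the matching toolkit 20.2 package (its internal version.info shows 1.75.22.3.220), python3.7.5, ubuntu-x86_64.
1. Installed the toolkit following the official website; the environment variables are also set.
2. Installed MindSpore with pip: pip3 install https://ms-release.obs.cn-north-4.myhuaweicloud.com/1.1.1/MindSpore/ascend/ascend310/ubuntu_x86/mindspore_ascend-1.1.1-cp37-cp37m-linux_x86_64.whl --trusted-host ms-release.obs.cn-north-4.myhuaweicloud.com -i https://pypi.tuna.tsinghua.edu.cn/simple
3. Followed the steps for the ascend310_single_op_sample sample. Using the commands from the official website, cmake and make produced the executable tensor_add_sample.
Running ./tensor_add_sample
prints: Build model failed
Error code: EB0000
【Screenshots】
My environment variables are set as follows:
【Log information】(optional; upload log content or attachments)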
root@cc90a62a1814:/home/pzk/Projects/mindspore-docs-r1.1/docs/tutorials/tutorial_code/ascend310_single_op_sample# cmake . -DMINDSPORE_PATH=`pip3 show mindspore-ascend | grep Location | awk '{print $2"/mindspore"}' | xargs realpath`
-- The C compiler identification is GNU 7.5.0
-- The CXX compiler identification is GNU 7.5.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: /home/pzk/Projects/mindspore-docs-r1.1/docs/tutorials/tutorial_code/ascend310_single_op_sample
root@cc90a62a1814:/home/pzk/Projects/mindspore-docs-r1.1/docs/tutorials/tutorial_code/ascend310_single_op_sample# make
[ 50%] Building CXX object CMakeFiles/tensor_add_sample.dir/main.cc.o
[100%] Linking CXX executable tensor_add_sample
[100%] Built target tensor_add_sample
root@cc90a62a1814:/home/pzk/Projects/mindspore-docs-r1.1/docs/tutorials/tutorial_code/ascend310_single_op_sample# ./tensor_add_sample
Traceback (most recent call last):
  File "/home/HwHiAiUser/Ascend/ascend-toolkit/latest/x86_64-linux/atc/python/site-packages/te/__init__.py", line 107, in <module>
    __import__('topi.cce')
  File "/home/HwHiAiUser/Ascend/ascend-toolkit/latest/x86_64-linux/atc/python/site-packages/topi/cce/__init__.py", line 20, in <module>
    import te.lang.cce
  File "/home/HwHiAiUser/Ascend/ascend-toolkit/latest/x86_64-linux/atc/python/site-packages/te/lang/cce/__init__.py", line 17, in <module>
    from .te_compute.broadcast_compute import broadcast
  File "/home/HwHiAiUser/Ascend/ascend-toolkit/latest/x86_64-linux/atc/python/site-packages/te/lang/cce/te_compute/__init__.py", line 23, in <module>
    from .broadcast_compute import broadcast
  File "/home/HwHiAiUser/Ascend/ascend-toolkit/latest/x86_64-linux/atc/python/site-packages/te/lang/cce/te_compute/broadcast_compute.py", line 19, in <module>
    from .util import dtype_check_decorator
  File "/home/HwHiAiUser/Ascend/ascend-toolkit/latest/x86_64-linux/atc/python/site-packages/te/lang/cce/te_compute/util.py", line 8, in <module>
    from te import platform as cceconf
  File "/home/HwHiAiUser/Ascend/ascend-toolkit/latest/x86_64-linux/atc/python/site-packages/te/platform/__init__.py", line 50, in <module>
    from .cce_build import get_pass_list, build_config
  File "/home/HwHiAiUser/Ascend/ascend-toolkit/latest/x86_64-linux/atc/python/site-packages/te/platform/cce_build.py", line 195, in <module>
    split_intersection=False,
  File "/home/HwHiAiUser/Ascend/ascend-toolkit/20.1.rc1/x86_64-linux/atc/python/site-packages/te/tvm/build_module.py", line 379, in build_config
    config = make.node("BuildConfig", **node_args)
  File "/home/HwHiAiUser/Ascend/ascend-toolkit/20.1.rc1/x86_64-linux/atc/python/site-packages/te/tvm/make.py", line 85, in node
    return _Node(*args)
  File "/home/HwHiAiUser/Ascend/ascend-toolkit/20.1.rc1/x86_64-linux/atc/python/site-packages/te/tvm/_ffi/_ctypes/function.py", line 209, in __call__
    raise get_last_ffi_error()
tvm._ffi.base.TVMError: {'errCode': 'EB0000', 'message': 'require field enable_const_fold', 'traceback': 'Traceback (most recent call last):\n [bt] (6) /usr/local/Ascend/ascend-toolkit/20.2.rc1/x86_64-linux/atc/lib64/libtvm.so(TVMFuncCall+0x5e) [0x7fbddc40834e]\n [bt] (5) /usr/local/Ascend/ascend-toolkit/20.2.rc1/x86_64-linux/atc/lib64/libtvm.so(tvm::MakeNode(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)+0x11d) [0x7fbddbbb8dcd]\n [bt] (4) /usr/local/Ascend/ascend-toolkit/20.2.rc1/x86_64-linux/atc/lib64/libtvm.so(tvm::InitNodeByPackedArgs(tvm::runtime::Object*, tvm::runtime::TVMArgs const&)+0x750) [0x7fbddbbb6e00]\n [bt] (3) /usr/local/Ascend/ascend-toolkit/20.2.rc1/x86_64-linux/atc/lib64/libtvm.so(tvm::BuildConfigNode::VisitAttrs(tvm::AttrVisitor*)+0x1d8) [0x7fbddba966f8]\n [bt] (2) /usr/local/Ascend/ascend-toolkit/20.2.rc1/x86_64-linux/atc/lib64/libtvm.so(tvm::NodeAttrSetter::Visit(char const*, bool*)+0x1d) [0x7fbddbbbaa3d]\n [bt] (1) /usr/local/Ascend/ascend-toolkit/20.2.rc1/x86_64-linux/atc/lib64/libtvm.so(tvm::NodeAttrSetter::GetAttr(char const*)+0x1b7) [0x7fbddbbba537]\n [bt] (0) /usr/local/Ascend/ascend-toolkit/20.2.rc1/x86_64-linux/atc/lib64/libtvm.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x45) [0x7fbddb8ddd05]\n File "../../../../../tensor_engine/src/node/reflection.cc", line 227\nBuildConfig: [EB0000] require field enable_const_fold'}
WARNING: Logging before InitGoogleLogging() is written to STDERR
[ERROR] ME(24718,tensor_add_sample):2021-03-30-18:54:23.228.734 [mindspore/ccsrc/cxx_api/model/acl/model_converter.cc:138] BuildAirModel] Call aclgrphBuildInitialize fail.
[ERROR] ME(24718,tensor_add_sample):2021-03-30-18:54:23.228.824 [mindspore/ccsrc/cxx_api/model/acl/model_converter.cc:204] operator()] Convert model from MindIR to OM failed
[ERROR] ME(24718,tensor_add_sample):2021-03-30-18:54:23.228.835 [mindspore/ccsrc/cxx_api/model/model_converter_utils/multi_process.cc:128] ChildProcess] Child process process failed
WARNING: Logging before InitGoogleLogging() is written to STDERR
[WARNING] ME(24717,tensor_add_sample):2021-03-30-18:54:23.276.069 [mindspore/ccsrc/cxx_api/model/model_converter_utils/multi_process.cc:200] HeartbeatThreadFuncInner] Peer stopped
[ERROR] ME(24717,tensor_add_sample):2021-03-30-18:54:23.276.897 [mindspore/ccsrc/cxx_api/model/acl/model_converter.cc:184] operator()] Receive result model from child process failed
[ERROR] ME(24717,tensor_add_sample):2021-03-30-18:54:23.276.952 [mindspore/ccsrc/cxx_api/model/model_converter_utils/multi_process.cc:107] ParentProcess] Parent process process failed
[ERROR] ME(24717,tensor_add_sample):2021-03-30-18:54:24.277.213 [mindspore/ccsrc/cxx_api/model/acl/model_converter.cc:217] LoadMindIR] Convert MindIR model to OM model failed
[ERROR] ME(24717,tensor_add_sample):2021-03-30-18:54:24.277.266 [mindspore/ccsrc/cxx_api/model/acl/acl_model.cc:54] Build] Load MindIR failed.
Build model failed.
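Solution:
libpython3.7m.so.1.0 is a file that belongs to Python 3.7; how to get it depends on how your Python was installed.
If it was installed with apt install python3.7, running apt install python3.7-dev is enough (this is for Ubuntu; the package may have a different name on other OSes).
If it was built from source, configure it with the --enable-shared option, e.g. mkdir build && cd build && ../configure --enable-shared && make -j8 && make install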
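To see whether the interpreter you are running was built as a shared library at all, something like the following may help. It is only a small sketch using the standard sysconfig module; the helper name shared_lib_info is made up for illustration, and the values it reports depend on how your Python was built.

```python
import sysconfig


def shared_lib_info():
    """Collect the build settings related to --enable-shared (hypothetical helper)."""
    return {
        # 1 when Python was configured with --enable-shared, 0 or None otherwise
        "enable_shared": bool(sysconfig.get_config_var("Py_ENABLE_SHARED")),
        # Name of the Python library produced by the build
        "ldlibrary": sysconfig.get_config_var("LDLIBRARY"),
        # Directory where that library is expected to be installed
        "libdir": sysconfig.get_config_var("LIBDIR"),
    }


if __name__ == "__main__":
    for key, value in shared_lib_info().items():
        print(f"{key}: {value}")
```

If enable_shared comes back False, the shared libpython is most likely missing and one of the two installation routes above applies.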
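From your description the CANN package version is 20.2, but the log shows the te package being loaded from
/home/HwHiAiUser/Ascend/ascend-toolkit/20.1.rc1/x86_64-linux/atc/python/site-packages/te
The machine probably has several CANN packages installed. You can fix this by adjusting the PYTHONPATH environment variable,
or simply install the te and topi whl packages under atc into your own Python environment (this is also what MindSpore recommends; see the installation guide on the official website and select 310):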
pip install /usr/local/Ascend/ascend-toolkit/latest/atc/lib64/topi-{version}-py3-none-any.whl
pip install /usr/local/Ascend/ascend-toolkit/latest/atc/lib64/te-{version}-py3-none-any.whl
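To confirm which installation te and topi are actually picked up from, a rough check like this can be run before and after changing PYTHONPATH or installing the whl packages. It is a sketch, not an official tool: the helper names toolkit_dirs and package_origin are made up here, and the regular expression simply assumes the ascend-toolkit/<version>/ path layout seen in the log above.

```python
import importlib.util
import re

# Captures the directory right after "ascend-toolkit/", e.g. "20.1.rc1" or "latest".
_TOOLKIT_DIR = re.compile(r"ascend-toolkit/([^/\s]+)/")


def toolkit_dirs(text):
    """Return the sorted, de-duplicated toolkit version directories named in text."""
    return sorted(set(_TOOLKIT_DIR.findall(text)))


def package_origin(name):
    """Return the file a top-level package would be loaded from, without importing it."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    return spec.origin if spec else None


if __name__ == "__main__":
    for name in ("te", "topi"):
        origin = package_origin(name)
        print(name, "->", origin, toolkit_dirs(origin or ""))
```

Passing the pasted traceback text to toolkit_dirs lists every toolkit directory it mentions; more than one real version in that list (here 20.1.rc1 next to 20.2.rc1) points to the mixed installation described above.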
Also, as far as I remember, the internal CANN version that MindSpore 1.1.1 is adapted to is 1.76.xxxx...