TX2 + JetPack 4 + CUDA 10, no MKL, Python 2
Enable the TX2's maximum power mode and turn on the fan
sudo nvpmodel -m 0 # switch to the maximum-performance power mode
cd /usr/bin/
sudo ./jetson_clocks # force the fan to maximum speed
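To confirm the settings took effect (assuming the stock JetPack 4.x tools), you can query the current power mode and clock/fan state:
sudo nvpmodel -q            # should report mode 0 (MAXN)
sudo ./jetson_clocks --show # prints current clock and fan settings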
A quick sanity check that PyTorch can see the GPU (the test snippet from the forum thread referenced below):
import torch
print(torch.__version__)
print('CUDA available: ' + str(torch.cuda.is_available()))
a = torch.cuda.FloatTensor(2).zero_()
print('Tensor a = ' + str(a))
b = torch.randn(2).cuda()
print('Tensor b = ' + str(b))
c = a + b
print('Tensor c = ' + str(c))
Reference: PyTorch for Jetson Nano
https://devtalk.nvidia.com/default/topic/1049071/jetson-nano/pytorch-for-jetson-nano/
Environment: PyTorch 1.1, Python 2.7 and Python 3.6, Jetson TX2, JetPack 4.2.
Python 2.7
wget https://nvidia.box.com/shared/static/m6vy0c7rs8t1alrt9dqf7yt1z587d1jk.whl -O torch-1.1.0a0+b457266-cp27-cp27mu-linux_aarch64.whl
pip install torch-1.1.0a0+b457266-cp27-cp27mu-linux_aarch64.whl
Python 3.6
wget https://nvidia.box.com/shared/static/veo87trfaawj5pfwuqvhl6mzc5b55fbj.whl -O torch-1.1.0a0+b457266-cp36-cp36m-linux_aarch64.whl
pip3 install numpy torch-1.1.0a0+b457266-cp36-cp36m-linux_aarch64.whl
# install torchvision (I can no longer find the original source for this; if anyone knows it, please share)
sudo pip install --no-deps torchvision==0.2.0
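To double-check that both packages import cleanly together and that CUDA is still visible after the install:
python -c "import torch, torchvision; print(torch.__version__); print(torch.cuda.is_available())"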
Then, running the project's test code with torch hit this error:
RuntimeError: cuda runtime error (7) : too many resources requested for launch at /home/nvidia/Downloads/pytorch/aten/src/THCUNN/generic/SpatialUpSamplingBilinear.cu:66
Fix:
The suggested fix is to set CUDA_NUM_THREADS = 256 in the PyTorch source.
Two files were changed:
- pytorch/aten/src/THCUNN/common.h, line 12
- pytorch/aten/src/ATen/cuda/detail/KernelUtils.h, line 15
Since that means rebuilding anyway, I switched to installing from source; roll up your sleeves and get to work.
Reference: Installing PyTorch on Jetson TX2 (from source) https://www.jianshu.com/p/9e9c74834283
git clone was painfully slow. Advice online blamed DNS pollution, so I changed the DNS to 114.114.114.114 (or 8.8.8.8), and the speed improved dramatically.
I also edited /etc/hosts and added the following:
151.101.72.249 github.global.ssl.fastly.net
192.30.253.112 github.com
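A quick way to confirm the hosts entries are actually being used (getent consults /etc/hosts before falling back to DNS):
getent hosts github.com github.global.ssl.fastly.net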
sudo apt install libopenblas-dev libatlas-dev liblapack-dev
# Install whatever dependency is missing as you hit it; of the packages below I suspect only cmake really matters, but install them all anyway
sudo pip install scipy pyyaml scikit-build cffi
sudo apt-get -y install cmake
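It is also worth checking the CMake version before building, since the PyTorch 1.1 build expects a reasonably recent CMake; if the apt version turns out to be too old you will need a newer one:
cmake --version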
sudo gedit ~/.bashrc
# append the following at the end of the file
export CUDNN_LIB_DIR=/usr/lib/aarch64-linux-gnu
export CUDNN_INCLUDE_DIR=/usr/include
export CUDA_ROOT="/usr/local/cuda-10.0/"
export LD_LIBRARY_PATH="/usr/local/cuda-10.0/lib64/:$LD_LIBRARY_PATH"
# To be on the safe side, I also added the four lines above to /etc/profile
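The exports only apply to new shells, so reload the file and confirm the variables are visible before building:
source ~/.bashrc
echo $CUDNN_LIB_DIR $CUDNN_INCLUDE_DIR $CUDA_ROOT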
git clone http://github.com/pytorch/pytorch
cd pytorch
sudo pip install -U setuptools
sudo pip install -r requirements.txt
git checkout tags/v1.1.0 -b build
git submodule update --init --recursive
At this point the last command can sit there with no visible progress no matter how long you wait, which is maddening. Changing the DNS again bumped the speed up a lot and restored normal download speed.
The last command must run to completion, otherwise the build will fail later.
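One way to verify that every submodule really came down is to list their status; any line starting with '-' marks a submodule that is still missing:
git submodule status --recursive | grep '^-' || echo "all submodules checked out"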
PS: before compiling, change the 1024 to 256 in the two files listed earlier
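As a minimal sketch of that edit (run from inside the pytorch directory, and assuming the assignments are still spelled exactly `CUDA_NUM_THREADS = 1024`; check first, the spacing may differ):
sed -n '12p' aten/src/THCUNN/common.h
sed -n '15p' aten/src/ATen/cuda/detail/KernelUtils.h
sed -i 's/CUDA_NUM_THREADS = 1024/CUDA_NUM_THREADS = 256/' aten/src/THCUNN/common.h aten/src/ATen/cuda/detail/KernelUtils.h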
sudo python setup.py build
This step failed with:
Failed to run 'bash ../tools/build_pytorch_libs.sh --use-cuda --use-nnpack --use-mkldnn --use-qnnpack caffe2'
Reference: ubuntu 16.04 Caffe2 / PyTorch - CMake Error at third_party/protobuf/cmake/cmake_install.cmake:64 https://blog.csdn.net/chengyq116/article/details/83817726
Fix:
sudo python setup.py install
# This errored out; it turned out that `git submodule update --init --recursive` had missed a file.
# After downloading the corresponding file from GitHub, the build succeeded.
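If it is a submodule file that went missing, re-running the submodule fetch is usually enough to recover it (downloading the single file from GitHub by hand, as above, also works):
git submodule sync
git submodule update --init --recursive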
python -c "import torch"
# which fails with
ImportError: No module named _C
Reference: Notes on installing PyTorch from source https://blog.csdn.net/Draco_mystack/article/details/71191924
A look through the pytorch repo's issues shows plenty of people have hit this: https://github.com/pytorch/pytorch/issues/7
The author calmly replies: don't import torch from inside the pytorch project root directory...
After that it worked.
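Concretely, run the import test from any directory outside the pytorch source tree, for example:
cd ~ && python -c "import torch; print(torch.__version__)"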
Running the project code then produced the same error as before:
RuntimeError: cuda runtime error (7) : too many resources requested for launch at
Which means the earlier from-source build (with the 1024-to-256 change) did not actually solve it. Another round of Google turned up the fix below:
RuntimeError: cuda runtime error (7) : too many resources requested for launch at #8103 https://github.com/pytorch/pytorch/issues/8103
# Edit "aten/src/THCUNN/generic/SpatialUpSamplingBilinear.cu":
# Around line 62:
# comment out the assignment that reads THCState_getCurrentDeviceProperties(state)->maxThreadsPerBlock;
# and replace it with
const int num_threads = 512;
# Around line 97:
# again comment out the THCState_getCurrentDeviceProperties(state)->maxThreadsPerBlock; assignment
# and replace it with
const int num_threads = 512;
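After patching the .cu file, PyTorch has to be rebuilt and reinstalled for the change to take effect (same commands as before, from the pytorch source directory):
sudo python setup.py build
sudo python setup.py install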
Running the project code then hit a different error; recording it here to fix later, since it looks like a bug in the project's own code (an UnboundLocalError usually means pred3 is only assigned inside a branch that never executed):
UnboundLocalError: local variable 'pred3' referenced before assignment