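ROCm + PyTorch quick installation guide
Last updated: 2022-10-10. PyTorch now supports ROCm officially, so you no longer need to build it yourself. See
http://www.pytorch.org for installation instructions.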
---------------------------------------------------------------------------------
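Install on a clean machine.
Original reference: https://github.com/aieater/rocm_pytorch_informations , with modifications.
Tested on Ubuntu 18.04 and Ubuntu 20.04.
The steps below install PyTorch 1.6 + ROCm 3.3 (the versions must match).
1. Update the system and install the required libraries
sudo apt update
sudo apt -y dist-upgrade
sudo apt install -y libnuma-dev
2. Install the matching ROCm
Configure the ROCm repository. PyTorch 1.6 pairs with ROCm 3.3.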
wget -q -O - http://repo.radeon.com/rocm/apt/debian/rocm.gpg.key | sudo apt-key add -
echo 'deb [arch=amd64] http://repo.radeon.com/rocm/apt/3.3/ xenial main' | sudo tee /etc/apt/sources.list.d/rocm.list
Install the required ROCm driver and packages, then reboot. Do not skip this step!
sudo apt update
sudo apt install -y rocm-dkms rocm-libs hipcub miopen-hip rccl
sudo reboot
Confirm that your GPU is a supported model and has been recognized:
# Make sure the GPUs are exposed as device files.
ls /dev/dri/
# >> card0 renderD128
renderD128 is the Radeon VII.
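The same check can be scripted. The sketch below is only an illustration: the helper name find_render_nodes is hypothetical, and it assumes the driver exposes each GPU as a renderD* entry under /dev/dri, as in the listing above.

```python
import os


def find_render_nodes(dri_dir="/dev/dri"):
    """Return the sorted renderD* entries found in dri_dir (empty list if none)."""
    try:
        entries = os.listdir(dri_dir)
    except OSError:
        # Directory missing or unreadable: the driver is probably not loaded.
        return []
    return sorted(name for name in entries if name.startswith("renderD"))


if __name__ == "__main__":
    nodes = find_render_nodes()
    if nodes:
        print("render nodes:", ", ".join(nodes))
    else:
        print("no render nodes found; check the driver installation and reboot")
```

An empty result after the reboot usually means the rocm-dkms driver did not load.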
Download and install Anaconda
Download the installer that matches your system and install it:
https://www.anaconda.com/downloads
Create the Anaconda environment
conda create -n pytorch python=3.7
Note: choose the Python version according to the support table below. For example, I installed PyTorch 1.6, which corresponds to Python 3.7.
Set up the environment path
echo 'export PATH=$PATH:/opt/rocm/bin:/opt/rocm/profiler/bin:/opt/rocm/opencl/bin/x86_64' | sudo tee -a /etc/profile.d/rocm.sh
sudo reboot
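After the reboot it can be worth confirming that the ROCm directories actually reached PATH. This is a minimal sketch; the helper name missing_rocm_dirs is hypothetical and the directory list is copied from the export line above.

```python
import os

# Directories appended to PATH by /etc/profile.d/rocm.sh in this guide.
ROCM_DIRS = [
    "/opt/rocm/bin",
    "/opt/rocm/profiler/bin",
    "/opt/rocm/opencl/bin/x86_64",
]


def missing_rocm_dirs(path_value=None):
    """Return the ROCm directories that are not present in the given PATH string."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    present = {p.rstrip("/") for p in path_value.split(os.pathsep) if p}
    return [d for d in ROCM_DIRS if d not in present]


if __name__ == "__main__":
    missing = missing_rocm_dirs()
    print("missing from PATH:", missing if missing else "nothing")
```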
Install the Intel MKL acceleration library
sudo sysctl -w net.ipv6.conf.all.disable_ipv6=1
sudo sysctl -w net.ipv6.conf.default.disable_ipv6=1
sudo sysctl -w net.ipv6.conf.lo.disable_ipv6=1
cd /tmp
wget https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS-2019.PUB
sudo apt-key add GPG-PUB-KEY-INTEL-SW-PRODUCTS-2019.PUB
sudo sh -c 'echo deb https://apt.repos.intel.com/mkl all main > /etc/apt/sources.list.d/intel-mkl.list'
sudo apt-get update && sudo apt-get install intel-mkl-64bit-2018.2-046
Supported GPUs:
GFX code | Architecture   | Product name
GFX803   | Polaris Series | RX550/RX560/RX570/RX580/RX590 ...
GFX900   | Vega10 Series  | Vega64/Vega56/MI25/WX9100/FrontierEdition ...
GFX906   | Vega20 Series  | RadeonVII/MI50/MI60 ...
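To find out which GFX code your own card reports, the output of the rocminfo tool can be scanned for gfx names. The sketch below assumes rocminfo sits at /opt/rocm/bin/rocminfo (the default ROCm prefix used elsewhere in this guide). The helper names and the regular expression are illustrative, not part of ROCm.

```python
import re
import subprocess

# Matches ISA names such as gfx900 or gfx906 anywhere in the rocminfo output.
GFX_PATTERN = re.compile(r"\bgfx[0-9a-f]{3,4}\b")


def parse_gfx_targets(rocminfo_text):
    """Return the sorted, de-duplicated gfx names found in the given text."""
    return sorted(set(GFX_PATTERN.findall(rocminfo_text)))


def detect_gfx_targets(rocminfo="/opt/rocm/bin/rocminfo"):
    """Run rocminfo and return the gfx names it reports (empty list on failure)."""
    try:
        result = subprocess.run(
            [rocminfo],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return []
    return parse_gfx_targets(result.stdout)


if __name__ == "__main__":
    print("detected:", detect_gfx_targets())
```

Compare the detected name with the GFX code column above before picking a wheel or setting the build architecture.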
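Use either the prebuilt wheel or the source build (choose one). Building from source is recommended.
Method 1: install from a prebuilt wheel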
Support matrix (the S column marks combinations that install successfully):
Python | ROCm | PyTorch | GPU    | S
3.7    | 3.3  | 1.6.0a  | GFX906 | ✓
3.7    | 3.3  | 1.6.0a  | GFX900 | -
3.5    | 2.9  | 1.3.0a  | GFX900 | ✓
3.5    | 2.9  | 1.3.0a  | GFX906 | ✓
3.7    | 2.9  | 1.3.0a  | GFX900 | -
3.7    | 2.9  | 1.3.0a  | GFX906 |
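Choosing the Python version from this matrix can be written as a small lookup. The sketch below simply transcribes the rows above. It treats "-" as "not confirmed to install" and the blank last cell as unknown, which is my reading of the table rather than something the original states. The function name is hypothetical.

```python
# (python, rocm, pytorch, gpu, status) rows copied from the support matrix.
# True = marked as installing correctly, False = "-", None = cell left blank.
SUPPORT = [
    ("3.7", "3.3", "1.6.0a", "GFX906", True),
    ("3.7", "3.3", "1.6.0a", "GFX900", False),
    ("3.5", "2.9", "1.3.0a", "GFX900", True),
    ("3.5", "2.9", "1.3.0a", "GFX906", True),
    ("3.7", "2.9", "1.3.0a", "GFX900", False),
    ("3.7", "2.9", "1.3.0a", "GFX906", None),
]


def python_versions_for(pytorch, gpu):
    """Return the Python versions marked as working for a PyTorch/GPU pair."""
    gpu = gpu.upper()
    return [
        py
        for py, _rocm, torch_ver, gfx, ok in SUPPORT
        if torch_ver == pytorch and gfx == gpu and ok is True
    ]


if __name__ == "__main__":
    print(python_versions_for("1.6.0a", "gfx906"))
```

For PyTorch 1.6 on a Radeon VII (GFX906) this yields Python 3.7, which matches the conda environment created earlier.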
Install, using PyTorch 1.6 on a Radeon VII as the example:
pip install http://install.aieater.com/libs/pytorch/rocm3.3/gfx906/torch-1.6.0a0-cp37-cp37m-linux_x86_64.whl torchvision
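Once the wheel is installed, a short check inside the conda environment shows whether PyTorch can see the GPU. ROCm builds of PyTorch expose the device through the torch.cuda API, so torch.cuda.is_available() is the call to look at. The helper name describe_torch is hypothetical, and torch.version.hip is read defensively because older builds may not set it.

```python
def describe_torch():
    """Return a small dict describing the installed PyTorch, if any."""
    try:
        import torch
    except (ImportError, OSError):
        # Not installed, or its shared libraries failed to load.
        return {"installed": False}
    return {
        "installed": True,
        "version": torch.__version__,
        # Set on ROCm builds of recent releases; may be missing on older ones.
        "hip": getattr(torch.version, "hip", None),
        "gpu_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in describe_torch().items():
        print(key, "=", value)
```

If gpu_available is False, recheck the /dev/dri listing and the ROCm/PyTorch version pairing before reinstalling.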
-----------------------------------------------------------------------------------------
Method 2: build from source
sudo apt install -y libopenblas-dev cmake libnuma-dev autoconf build-essential ca-certificates curl libgoogle-glog-dev libhiredis-dev libleveldb-dev liblmdb-dev libopencv-dev libpthread-stubs0-dev libsnappy-dev libprotobuf-dev protobuf-compiler ninja-build
sudo apt install -y gcc cmake clang ccache llvm ocl-icd-opencl-dev
sudo apt install -y rocrand rocblas miopen-hip miopengemm rocfft rocprim rocsparse rocm-cmake rocm-dev rocm-device-libs rocm-libs rccl hipcub rocthrust hipify-clang
Get the source code
git clone --recursive https://github.com/pytorch/pytorch.git
cd pytorch
python tools/amd_build/build_amd.py
Or download the pre-processed source (strongly recommended; other versions may not compile):
wget http://install.aieater.com/libs/pytorch/sources/pytorch1.6.0.tar.gz
tar zxf pytorch1.6.0.tar.gz
cd pytorch
Install the required Python libraries
pip install -r requirements.txt
Set the environment variables:
export PYTORCH_ROCM_ARCH=gfx906
export HCC_AMDGPU_TARGET=gfx906
export LIBRARY_PATH="/opt/rocm/rccl/lib/"
Note: ninja-build must be installed, otherwise the build fails. Running the install command again confirms that it is present:
sudo apt install -y ninja-build
HIP_PLATFORM=hcc USE_NINJA=1 PYTORCH_ROCM_ARCH=gfx906 USE_CUDA=OFF USE_ROCM=1 USE_LMDB=1 RCCL_DIR=/opt/rocm/rccl USE_RCCL=1 BUILD_CAFFE2_OPS=0 BUILD_TEST=0 USE_OPENCV=1 MAX_JOBS=16 python setup.py install --user
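The variables from the export lines and from the build command can also be collected in one place, which makes it easier to change the target architecture or the job count. This is only a sketch: rocm_build_env is a hypothetical helper, and the values are the ones used in this guide for a gfx906 card.

```python
import os


def rocm_build_env(arch="gfx906", max_jobs=16, base=None):
    """Return a copy of the environment with the build variables from this guide."""
    env = dict(os.environ if base is None else base)
    env.update({
        "HIP_PLATFORM": "hcc",
        "USE_NINJA": "1",
        "PYTORCH_ROCM_ARCH": arch,
        "HCC_AMDGPU_TARGET": arch,
        "LIBRARY_PATH": "/opt/rocm/rccl/lib/",
        "USE_CUDA": "OFF",
        "USE_ROCM": "1",
        "USE_LMDB": "1",
        "RCCL_DIR": "/opt/rocm/rccl",
        "USE_RCCL": "1",
        "BUILD_CAFFE2_OPS": "0",
        "BUILD_TEST": "0",
        "USE_OPENCV": "1",
        "MAX_JOBS": str(max_jobs),
    })
    return env


# Example (run from the pytorch source directory):
#   import subprocess, sys
#   subprocess.run([sys.executable, "setup.py", "install", "--user"],
#                  env=rocm_build_env(), check=True)
```

Lowering max_jobs helps on machines where the 16-job build runs out of memory.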
Build a wheel (saved under /root):
python setup.py bdist_wheel -d /root/
Install torchtext
git clone https://github.com/pytorch/text
cd text
pip install ./
Install torchvision
git clone https://github.com/pytorch/vision
cd vision
USE_CUDA=OFF USE_ROCM=1 USE_LMDB=1 USE_OPENCV=1 MAX_JOBS=16 python setup.py install --user
cd ..