Compiling PyTorch from source

Clone the PyTorch source, including its submodules:

(base) xueruini@nico2:~/onion_rain/pytorch/code$ git clone --recursive https://github.com/pytorch/pytorch
Cloning into 'pytorch'...
remote: Enumerating objects: 37, done.
remote: Counting objects: 100% (37/37), done.
...
Resolving deltas: 100% (151/151), done.
Submodule path 'third_party/tensorpipe/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5'
Submodule path 'third_party/zstd': checked out 'aec56a52fbab207fc639a1937d1e708a282edca8'
(base) xueruini@nico2:~/onion_rain/pytorch/code$ cd pytorch
(base) xueruini@nico2:~/onion_rain/pytorch/code/pytorch$ ls
android  aten.bzl    binaries     c10     CITATION  CMakeLists.txt      CODEOWNERS       docker      docker.Makefile  ios      Makefile  mypy.ini  README.md         scripts   submodules  third_party  torch       version.txt
aten     benchmarks  BUILD.bazel  caffe2  cmake     CODE_OF_CONDUCT.md  CONTRIBUTING.md  Dockerfile  docs             LICENSE  modules   NOTICE    requirements.txt  setup.py  test        tools        ubsan.supp  WORKSPACE

Switch to a release version

(pytorch) xueruini@nico2:~/onion_rain/pytorch/code/pytorch$ git checkout v1.5.1
Checking out files: 100% (3139/3139), done.
Note: checking out 'v1.5.1'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at 3c31d73c87 [ONNX] Fix pow op export [1.5.1] (#39791)

Update the third-party submodules so they match the checked-out release

(base) xueruini@nico2:~/onion_rain/pytorch/code/pytorch$ git submodule sync
(base) xueruini@nico2:~/onion_rain/pytorch/code/pytorch$ git submodule update --init --recursive

Create a new virtual environment

(base) xueruini@nico2:~/onion_rain/pytorch/code/pytorch$ conda create -n pytorch
Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/xueruini/anaconda3/envs/pytorch



Proceed ([y]/n)? y

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate pytorch
#
# To deactivate an active environment, use
#
#     $ conda deactivate

(base) xueruini@nico2:~/onion_rain/pytorch/code/pytorch$ conda activate pytorch
(pytorch) xueruini@nico2:~/onion_rain/pytorch/code/pytorch$

Install dependencies

Common dependencies

(pytorch) xueruini@nico2:~/onion_rain/pytorch/code/pytorch$ conda install numpy ninja pyyaml mkl mkl-include setuptools cmake cffi
Collecting package metadata (current_repodata.json): done
Solving environment: done
...
Preparing transaction: done
Verifying transaction: done
Executing transaction: done

Check the CUDA version

(pytorch) xueruini@nico2:~/onion_rain/pytorch/code/pytorch$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89

Install the CUDA dependency (magma-cuda102, matching CUDA 10.2 above)

(pytorch) xueruini@nico2:~/onion_rain/pytorch/code/pytorch$ conda install -c pytorch magma-cuda102
Collecting package metadata (current_repodata.json): done
Solving environment: done
...
Verifying transaction: done
Executing transaction: done
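
The package name follows from the CUDA release reported by nvcc: release 10.2 pairs with magma-cuda102. A minimal Python sketch of that mapping is below. It assumes the magma-cudaXY naming seen here carries over to other releases; the helper name magma_package is hypothetical, and the sketch does not check whether such a package actually exists on the pytorch channel.

```python
import re


def magma_package(nvcc_output: str) -> str:
    """Derive a magma-cudaXY package name from `nvcc -V` output.

    Assumption: the package name is "magma-cuda" followed by the major
    and minor release digits, as in the 10.2 -> magma-cuda102 case.
    """
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("no CUDA release found in nvcc output")
    major, minor = match.groups()
    return f"magma-cuda{major}{minor}"


# Last line of the nvcc -V output shown in this post.
sample = "Cuda compilation tools, release 10.2, V10.2.89"
print(magma_package(sample))
```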

Install cudatoolkit

(pytorch) xueruini@nico2:~/onion_rain/pytorch/code/pytorch$ conda install cudatoolkit
Collecting package metadata (current_repodata.json): done
Solving environment: done
...
Verifying transaction: done
Executing transaction: done

If cudatoolkit is not installed, import torch fails after the build with the following error:

(pytorch) xueruini@nico2:~/onion_rain/pytorch/code/pytorch$ python
Python 3.8.3 (default, Jul  2 2020, 16:21:59)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/xueruini/onion_rain/pytorch/code/pytorch/torch/__init__.py", line 135, in <module>
    _load_global_deps()
  File "/home/xueruini/onion_rain/pytorch/code/pytorch/torch/__init__.py", line 93, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/home/xueruini/anaconda3/envs/pytorch/lib/python3.8/ctypes/__init__.py", line 373, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcublasLt.so.10: cannot open shared object file: No such file or directory
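
The traceback shows that the failure comes from ctypes.CDLL being unable to open a shared library. The same call can be used to check whether a library is loadable before importing torch. This is only a sketch: the helper name can_load is hypothetical, and the library name is copied from the error message above.

```python
import ctypes


def can_load(lib_name: str) -> bool:
    """Return True if the dynamic loader can open lib_name."""
    try:
        # Same call that fails in the traceback above.
        ctypes.CDLL(lib_name, mode=ctypes.RTLD_GLOBAL)
    except OSError:
        return False
    return True


# The result depends on the machine and on the active conda environment.
print("libcublasLt.so.10 loadable:", can_load("libcublasLt.so.10"))
```

Running it before and after conda install cudatoolkit should show whether the install made the library visible to the loader.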

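Build

Because I want to debug the PyTorch source itself, I set DEBUG to 1, which produces a Debug build (note -DCMAKE_BUILD_TYPE=Debug in the cmake command below).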

(pytorch) xueruini@nico2:~/onion_rain/pytorch/code/pytorch$ export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
(pytorch) xueruini@nico2:~/onion_rain/pytorch/code/pytorch$ DEBUG=1 python setup.py install
Building wheel torch-1.5.0a0+3c31d73
-- Building version 1.5.0a0+3c31d73
cmake -GNinja -DBUILD_PYTHON=True -DBUILD_TEST=True -DCMAKE_BUILD_TYPE=Debug -DCMAKE_INSTALL_PREFIX=/home/xueruini/onion_rain/pytorch/code/pytorch/torch -DCMAKE_PREFIX_PATH=/home/xueruini/anaconda3/envs/pytorch -DNUMPY_INCLUDE_DIR=/home/xueruini/anaconda3/envs/pytorch/lib/python3.8/site-packages/numpy/core/include -DPYTHON_EXECUTABLE=/home/xueruini/anaconda3/envs/pytorch/bin/python -DPYTHON_INCLUDE_DIR=/home/xueruini/anaconda3/envs/pytorch/include/python3.8 -DPYTHON_LIBRARY=/home/xueruini/anaconda3/envs/pytorch/lib/libpython3.8.so.1.0 -DTORCH_BUILD_VERSION=1.5.0a0+3c31d73 -DUSE_NUMPY=True /home/xueruini/onion_rain/pytorch/code/pytorch
...
Copying torch.egg-info to /home/xueruini/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch-1.5.0a0+3c31d73-py3.8.egg-info
running install_scripts
Installing convert-caffe2-to-onnx script to /home/xueruini/anaconda3/envs/pytorch/bin
Installing convert-onnx-to-caffe2 script to /home/xueruini/anaconda3/envs/pytorch/bin
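
The export line at the start of this step relies on the shell's ${VAR:-fallback} expansion: CMAKE_PREFIX_PATH becomes $CONDA_PREFIX when that variable is set and non-empty, and otherwise the parent of the directory containing the conda executable. The same logic in Python, as a sketch; resolve_prefix, env and conda_path are names introduced here for illustration, and /opt/conda/bin/conda is a made-up path.

```python
import os
from typing import Mapping, Optional


def resolve_prefix(env: Mapping[str, str], conda_path: Optional[str]) -> str:
    """Mimic ${CONDA_PREFIX:-"$(dirname $(which conda))/../"}."""
    prefix = env.get("CONDA_PREFIX")
    if prefix:  # ':-' falls back when the variable is unset OR empty
        return prefix
    if conda_path is None:
        raise RuntimeError("conda executable not found")
    return os.path.dirname(conda_path) + "/../"


# Inside an activated environment (value taken from the cmake line above).
print(resolve_prefix({"CONDA_PREFIX": "/home/xueruini/anaconda3/envs/pytorch"}, None))
# Without CONDA_PREFIX, using a hypothetical conda location.
print(resolve_prefix({}, "/opt/conda/bin/conda"))
```

On a real machine, os.environ and shutil.which("conda") would be passed as the two arguments.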

Verify

First leave the pytorch directory, then import torch:

(pytorch) xueruini@nico2:~/onion_rain/pytorch/code/pytorch$ cd ..
(pytorch) xueruini@nico2:~/onion_rain/pytorch/code$ python
Python 3.8.3 (default, Jul  2 2020, 16:21:59)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.__version__
'1.5.0a0+3c31d73'
>>>
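
The version string ends in +3c31d73, which matches the start of the commit reported by git checkout earlier (3c31d73c87). A small sketch of that check follows, assuming the part after the + is the abbreviated git hash the build was made from; the helper name built_from_commit is hypothetical.

```python
def built_from_commit(version: str, commit: str) -> bool:
    """True if the local tag after '+' in version is a prefix of commit."""
    _, sep, local = version.partition("+")
    return bool(sep) and bool(local) and commit.startswith(local)


# Both values are taken from the transcripts in this post.
print(built_from_commit("1.5.0a0+3c31d73", "3c31d73c87"))
```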

If you stay inside the pytorch directory, the import fails with the following error:

(pytorch) xueruini@nico2:~/onion_rain/pytorch/code/pytorch$ python
Python 3.8.3 (default, Jul  2 2020, 16:21:59)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/xueruini/onion_rain/pytorch/code/pytorch/torch/__init__.py", line 136, in <module>
    from torch._C import *
ModuleNotFoundError: No module named 'torch._C'
>>>
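
The path in the traceback points into the source checkout (.../code/pytorch/torch/__init__.py), not into site-packages. A likely explanation, which is an inference rather than something the output states, is that an interactive Python session searches the current directory first, so the torch/ directory of the source tree shadows the installed package, and that directory does not contain the compiled torch._C extension. The sketch below checks for that situation; shadows_installed is a hypothetical helper, and the temporary directories only stand in for the real checkout.

```python
import os
import tempfile


def shadows_installed(package: str, cwd: str) -> bool:
    """True if cwd holds a package directory that would be imported first."""
    return os.path.isfile(os.path.join(cwd, package, "__init__.py"))


with tempfile.TemporaryDirectory() as repo, tempfile.TemporaryDirectory() as other:
    # Stand-in for the source checkout: it contains torch/__init__.py.
    os.makedirs(os.path.join(repo, "torch"))
    open(os.path.join(repo, "torch", "__init__.py"), "w").close()
    in_repo = shadows_installed("torch", repo)
    outside = shadows_installed("torch", other)

print("inside checkout:", in_repo)
print("outside checkout:", outside)
```

Calling shadows_installed("torch", os.getcwd()) before starting Python gives the same check for the real working directory.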

Uninstall

(pytorch) xueruini@nico2:~/onion_rain/pytorch/code/pytorch$ pip uninstall torch
Found existing installation: torch 1.7.0a0+6095808
Uninstalling torch-1.7.0a0+6095808:
  Would remove:
    /home/xueruini/anaconda3/envs/pytorch/bin/convert-caffe2-to-onnx
    /home/xueruini/anaconda3/envs/pytorch/bin/convert-onnx-to-caffe2
    /home/xueruini/anaconda3/envs/pytorch/lib/python3.8/site-packages/caffe2
    /home/xueruini/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch
    /home/xueruini/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch-1.7.0a0+6095808-py3.8.egg-info
Proceed (y/n)? Y
  Successfully uninstalled torch-1.7.0a0+6095808
(pytorch) xueruini@nico2:~/onion_rain/pytorch/code/pytorch$ python setup.py clean
Building wheel torch-1.7.0a0+6095808
running clean

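Miscellaneous

When debugging with F5 in VS Code, import torch raises an error, although the same import runs fine from the terminal command line:
[Figure: screenshot of the import torch error in VS Code]
The fix: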

conda install openmpi

This is odd: if a library were really missing, the import should not have worked from the terminal either.
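
One possible explanation, which is an assumption and not confirmed by anything above, is that the debugger starts Python with different environment variables than the terminal does. A sketch for comparing the two follows: run it once from the terminal and once under F5, then compare the printed values. The helper names and the choice of variables in WATCHED are illustrative only.

```python
import json
import os

# Variables that commonly affect which shared libraries are found (assumption).
WATCHED = ("PATH", "LD_LIBRARY_PATH", "CONDA_PREFIX")


def snapshot(env=None):
    """Collect the watched variables from env (defaults to os.environ)."""
    env = os.environ if env is None else env
    return {name: env.get(name, "") for name in WATCHED}


def changed(a, b):
    """Return the watched variables whose values differ between a and b."""
    return {name: (a[name], b[name]) for name in WATCHED if a[name] != b[name]}


print(json.dumps(snapshot(), indent=2))
```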
