Installing spconv: problems I ran into and how I fixed them

spconv installation steps:

 $ sudo apt-get install libboost-all-dev
 $ git clone https://github.com/traveller59/spconv.git --recursive
 $ cd spconv && git checkout 7342772
 $ python setup.py bdist_wheel
 $ cd ./dist && pip install *

Step 1: run sudo apt-get install libboost-all-dev

The command prints these warnings:

/sbin/ldconfig.real: /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_ops_infer.so.8 is not a symbolic link

/sbin/ldconfig.real: /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_adv_train.so.8 is not a symbolic link

/sbin/ldconfig.real: /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_adv_infer.so.8 is not a symbolic link

/sbin/ldconfig.real: /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_cnn_train.so.8 is not a symbolic link

/sbin/ldconfig.real: /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_cnn_infer.so.8 is not a symbolic link

/sbin/ldconfig.real: /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn.so.8 is not a symbolic link

/sbin/ldconfig.real: /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_ops_train.so.8 is not a symbolic link

Fix:
Run this in a terminal:

sudo ldconfig -v
(cp2) twilight@ROG: ~/project/CenterPoint/apex$ sudo ldconfig -v
/sbin/ldconfig.real: Can't stat /usr/local/lib/x86_64-linux-gnu: No such file or directory
/sbin/ldconfig.real: Path `/lib/x86_64-linux-gnu' given more than once
/sbin/ldconfig.real: Path `/usr/lib/x86_64-linux-gnu' given more than once
/usr/local/cuda-11.0/targets/x86_64-linux/lib:
	libnppim.so.11 -> libnppim.so.11.1.0.218
	libcublasLt.so.11 -> libcublasLt.so.11.1.0.229
	libnvjpeg.so.11 -> libnvjpeg.so.11.1.0.218
	libnvblas.so.11 -> libnvblas.so.11.1.0.229
	libcurand.so.10 -> libcurand.so.10.2.1.218
/sbin/ldconfig.real: /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_ops_infer.so.8 is not a symbolic link

	libcudnn_ops_infer.so.8 -> libcudnn_ops_infer.so.8.0.5
	libnppist.so.11 -> libnppist.so.11.1.0.218
/sbin/ldconfig.real: /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_adv_train.so.8 is not a symbolic link

	libcudnn_adv_train.so.8 -> libcudnn_adv_train.so.8.0.5
	libcusolver.so.10 -> libcusolver.so.10.5.0.218
	libnppial.so.11 -> libnppial.so.11.1.0.218
	libnppidei.so.11 -> libnppidei.so.11.1.0.218
	libnvrtc.so.11.0 -> libnvrtc.so.11.0.194
	libnppicc.so.11 -> libnppicc.so.11.1.0.218
	libaccinj64.so.11.0 -> libaccinj64.so.11.0.194
	libnppisu.so.11 -> libnppisu.so.11.1.0.218
	libnppig.so.11 -> libnppig.so.11.1.0.218
	libnppitc.so.11 -> libnppitc.so.11.1.0.218
	libcublas.so.11 -> libcublas.so.11.1.0.229
	libcufftw.so.10 -> libcufftw.so.10.2.0.218
	libcuinj64.so.11.0 -> libcuinj64.so.11.0.194
	libcudart.so.11.0 -> libcudart.so.11.0.194
/sbin/ldconfig.real: /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_adv_infer.so.8 is not a symbolic link

	libcudnn_adv_infer.so.8 -> libcudnn_adv_infer.so.8.0.5
/sbin/ldconfig.real: /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_cnn_train.so.8 is not a symbolic link

	libcudnn_cnn_train.so.8 -> libcudnn_cnn_train.so.8.0.5
/sbin/ldconfig.real: /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_cnn_infer.so.8 is not a symbolic link

	libcudnn_cnn_infer.so.8 -> libcudnn_cnn_infer.so.8.0.5
	libnpps.so.11 -> libnpps.so.11.1.0.218
	libnppif.so.11 -> libnppif.so.11.1.0.218
	libnvToolsExt.so.1 -> libnvToolsExt.so.1.0.0
	libnvrtc-builtins.so.11.0 -> libnvrtc-builtins.so.11.0.194
/sbin/ldconfig.real: /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn.so.8 is not a symbolic link

	libcudnn.so.8 -> libcudnn.so.8.0.5
	libcusolverMg.so.10 -> libcusolverMg.so.10.5.0.218
/sbin/ldconfig.real: /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_ops_train.so.8 is not a symbolic link

	libcudnn_ops_train.so.8 -> libcudnn_ops_train.so.8.0.5
	libcufft.so.10 -> libcufft.so.10.2.0.218
	libnppc.so.11 -> libnppc.so.11.1.0.218
	libcusparse.so.11 -> libcusparse.so.11.1.0.218
	libOpenCL.so.1 -> libOpenCL.so.1.0.0
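
The warnings are scattered through a long listing, so a small filter can help. The sketch below defines a hypothetical helper, list_non_symlinks (my own name, not part of ldconfig), that prints only the paths ldconfig complains about. It assumes the warning wording shown in the output above.

```shell
# list_non_symlinks: read ldconfig output on stdin and print each path
# reported as "is not a symbolic link", once per path.
# Hypothetical helper; it relies on the exact warning text shown above.
list_non_symlinks() {
    sed -n 's/^.*ldconfig[^:]*: \(.*\) is not a symbolic link.*$/\1/p' | sort -u
}

# Example (the warnings go to stderr, hence 2>&1):
#   sudo ldconfig -v 2>&1 | list_non_symlinks
```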

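In the ldconfig output, find the line libcudnn.so.8 -> libcudnn.so.8.0.5.
The problem is this link: libcudnn.so.8 is a regular file, not a symbolic link to libcudnn.so.8.0.5. Recreate it as a link by running the command below. The other libcudnn_*.so.8 files named in the warnings need the same treatment.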

sudo ln -sf /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn.so.8.0.5 /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn.so.8
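
The ln command above repairs a single library, while the warnings name seven. As a rough sketch, the loop below applies the same ln -sf to every libcudnn*.so.8.* file in a directory. relink_cudnn is a hypothetical helper name, and the .so.8 suffix is taken from the output above; adjust it for other cuDNN major versions.

```shell
# relink_cudnn DIR: for every DIR/libcudnn*.so.8.X.Y, make the matching
# DIR/libcudnn*.so.8 a symbolic link to it (replacing a regular file if present).
# Hypothetical helper; assumes cuDNN 8 naming as seen in the ldconfig output.
relink_cudnn() {
    dir=$1
    for real in "$dir"/libcudnn*.so.8.*; do
        [ -f "$real" ] || continue            # glob matched nothing
        base=$(basename "$real")              # e.g. libcudnn_ops_infer.so.8.0.5
        soname=${base%%.so.8.*}.so.8          # e.g. libcudnn_ops_infer.so.8
        ln -sf "$base" "$dir/$soname"         # relative link inside DIR
    done
}

# Example (needs root for system directories, so run it from a root shell):
#   relink_cudnn /usr/local/cuda-11.0/targets/x86_64-linux/lib
#   ldconfig
```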

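Step 2: run git clone https://github.com/traveller59/spconv.git --recursive

Problem: the third-party libraries fail to download.
Fix: fetch each third-party library with its own git clone and place it in the folder where it is expected.
An empty pybind11 placeholder will already have been created; delete it first, then run the following commands: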

git clone https://github.com.cnpmjs.org/NVIDIA/cutlass
git clone https://github.com.cnpmjs.org/boostorg/mp11
git clone https://github.com.cnpmjs.org/pybind/pybind11.git
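
The delete-then-clone step can be wrapped so that only a genuinely empty placeholder is ever removed. This is a minimal sketch: clone_into is a hypothetical helper, and the third_party/ location in the example is my assumption about where the submodules live (check .gitmodules in your checkout). If the mirror is unreachable, the plain github.com URLs can be used in the same way.

```shell
# clone_into URL DEST: remove DEST if it is an empty placeholder, then clone into it.
# Hypothetical helper. rmdir only succeeds on an empty directory, so a folder
# that already has content is left alone and the function returns non-zero.
clone_into() {
    url=$1
    dest=$2
    if [ -d "$dest" ]; then
        rmdir "$dest" || return 1
    fi
    git clone "$url" "$dest"
}

# Example (assumes the submodules belong under spconv/third_party):
#   cd spconv/third_party
#   clone_into https://github.com/pybind/pybind11.git pybind11
```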

Step 4: run python setup.py bdist_wheel

First go to the folder where cuDNN was unpacked:

 cd cudnn

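Make sure cudnn_version.h from the cuDNN package has been copied into the CUDA directory. Newer cuDNN releases keep their version information in cudnn_version.h rather than cudnn.h, so when installing cuDNN, copy every header whose name starts with cudnn: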

sudo cp cuda/include/cudnn* /usr/local/cuda/include
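
To confirm the header really landed in the CUDA include directory, one option is to read the version macros back out of it. The sketch below uses a hypothetical helper, cudnn_version, and assumes the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL as plain "#define NAME value" lines.

```shell
# cudnn_version FILE: print MAJOR.MINOR.PATCH read from a cudnn_version.h file.
# Hypothetical helper; fails (non-zero exit) if FILE is missing or has no CUDNN_MAJOR.
cudnn_version() {
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END { if (major == "") exit 1; print major "." minor "." patch }' "$1"
}

# Example:
#   cudnn_version /usr/local/cuda/include/cudnn_version.h
```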

Find the cuda.cmake file:

locate  cuda.cmake

On my machine cuda.cmake is at: /home/twilight/.conda/envs/cp2/lib/python3.7/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake

 code /home/twilight/.conda/envs/cp2/lib/python3.7/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake

Replace: file(READ ${CUDNN_INCLUDE_PATH}/cudnn.h CUDNN_HEADER_CONTENTS)
with: file(READ ${CUDNN_INCLUDE_PATH}/cudnn_version.h CUDNN_HEADER_CONTENTS)

Then run the build again:
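
The same replacement can be made without opening an editor. This is only a sketch: patch_cuda_cmake is a hypothetical helper, it assumes GNU sed (for -i), and it keeps a one-time backup next to the file.

```shell
# patch_cuda_cmake FILE: make cuda.cmake read cudnn_version.h instead of cudnn.h.
# Hypothetical helper; assumes GNU sed. A backup is written to FILE.bak on first use.
patch_cuda_cmake() {
    cmake_file=$1
    [ -e "$cmake_file.bak" ] || cp "$cmake_file" "$cmake_file.bak"
    # Single quotes keep ${CUDNN_INCLUDE_PATH} literal; \$ and \. match literally.
    sed -i 's#\${CUDNN_INCLUDE_PATH}/cudnn\.h#${CUDNN_INCLUDE_PATH}/cudnn_version.h#' "$cmake_file"
}

# Example (use the path that locate reports on your machine):
#   patch_cuda_cmake /path/to/torch/share/cmake/Caffe2/public/cuda.cmake
```

Running it a second time changes nothing, because the patched line no longer contains /cudnn.h.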

python setup.py bdist_wheel

The build fails with:

/home/twilight/project/CenterPoint0/spconv/src/spconv/all.cc:20:91: error: no matching function for call to ‘torch::jit::RegisterOperators::RegisterOperators(const char [28], <unresolved overloaded function type>)’
     torch::jit::RegisterOperators("spconv::get_indice_pairs_2d", &spconv::getIndicePair<2>)

Fix: edit the file:

code /home/twilight/project/CenterPoint0/spconv/src/spconv/all.cc

For reference:
Replace: torch::jit::RegisterOperators
with: torch::RegisterOperators

Reference link:
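
If the name occurs many times in all.cc, the substitution can be scripted. The sketch below uses a hypothetical helper, patch_register_operators, and again assumes GNU sed; it rewrites every occurrence in the given file and keeps a one-time backup.

```shell
# patch_register_operators FILE: rewrite torch::jit::RegisterOperators
# to torch::RegisterOperators throughout FILE.
# Hypothetical helper; assumes GNU sed. A backup is written to FILE.bak on first use.
patch_register_operators() {
    src_file=$1
    [ -e "$src_file.bak" ] || cp "$src_file" "$src_file.bak"
    sed -i 's/torch::jit::RegisterOperators/torch::RegisterOperators/g' "$src_file"
}

# Example:
#   patch_register_operators src/spconv/all.cc
```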
– Found cuDNN: v? (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libcudnn.so) CMake Er
