Ubuntu 20.04
CUDA 10.0 (later switched to CUDA 10.2)
Python 3.7.10
PyTorch 1.6.0
gcc 7.5.0
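For the most part, following the reference blog is enough.
Two things to watch out for:
1) The blog linked above uses PyTorch 1.3.0, but later, when I built spconv with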
python setup.py bdist_wheel
I got the following errors:
[ 8%] Built target spconv_nms
[ 12%] Building CXX object src/utils/CMakeFiles/spconv_utils.dir/all.cc.o
Consolidate compiler generated dependencies of target cuhash
make[2]: *** No rule to make target '/usr/local/cuda/lib64/libcudart.so', needed by 'src/cuhash/CMakeFiles/cuhash.dir/cmake_device_link.o'.  Stop.
make[1]: *** [CMakeFiles/Makefile2:154:src/cuhash/CMakeFiles/cuhash.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
In file included from /home/cora/PointCloud/spconv/src/utils/all.cc:15:0:
/home/cora/PointCloud/spconv/include/spconv/box_iou.h:21:10: fatal error: boost/geometry.hpp: No such file or directory
#include <boost/geometry.hpp>
^~~~~~~~~~~~~~~~~~~~
compilation terminated.
make[2]: *** [src/utils/CMakeFiles/spconv_utils.dir/build.make:76:src/utils/CMakeFiles/spconv_utils.dir/all.cc.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:232:src/utils/CMakeFiles/spconv_utils.dir/all] Error 2
make: *** [Makefile:136:all] Error 2
Traceback (most recent call last):
File "setup.py", line 108, in <module>
zip_safe=False,
File "/home/cora/anaconda3/envs/pvrcnn/lib/python3.7/site-packages/setuptools/__init__.py", line 163, in setup
return distutils.core.setup(**attrs)
File "/home/cora/anaconda3/envs/pvrcnn/lib/python3.7/distutils/core.py", line 148, in setup
dist.run_commands()
File "/home/cora/anaconda3/envs/pvrcnn/lib/python3.7/distutils/dist.py", line 966, in run_commands
self.run_command(cmd)
File "/home/cora/anaconda3/envs/pvrcnn/lib/python3.7/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/home/cora/anaconda3/envs/pvrcnn/lib/python3.7/site-packages/wheel/bdist_wheel.py", line 299, in run
self.run_command('build')
File "/home/cora/anaconda3/envs/pvrcnn/lib/python3.7/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/home/cora/anaconda3/envs/pvrcnn/lib/python3.7/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/home/cora/anaconda3/envs/pvrcnn/lib/python3.7/distutils/command/build.py", line 135, in run
self.run_command(cmd_name)
File "/home/cora/anaconda3/envs/pvrcnn/lib/python3.7/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/home/cora/anaconda3/envs/pvrcnn/lib/python3.7/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "setup.py", line 48, in run
self.build_extension(ext)
File "setup.py", line 92, in build_extension
subprocess.check_call(['cmake', '--build', '.'] + build_args, cwd=self.build_temp)
File "/home/cora/anaconda3/envs/pvrcnn/lib/python3.7/subprocess.py", line 363, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '--build', '.', '--config', 'Release', '--', '-j4']' returned non-zero exit status 2.
The first problem is
make[2]: *** No rule to make target '/usr/local/cuda/lib64/libcudart.so', needed by 'src/cuhash/CMakeFiles/cuhash.dir/cmake_device_link.o'.  Stop.
From searching online, some people say that switching PyTorch to version 1.6.0 fixes it. So:
pip uninstall torch
pip uninstall torchvision
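However, the official site has no PyTorch 1.6.0 build for CUDA 10.0. I later found that a build for a CUDA version newer than 10.0 also works, so I installed the one built for CUDA 10.1: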
conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.1 -c pytorch
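For the libcudart.so error above, one quick thing worth checking is whether the file make complained about actually exists. The sketch below is only a sanity check, not the fix I used; the function name `missing_cuda_libs` is made up for illustration, and `/usr/local/cuda` is simply the path that appears in the error message.

```python
from pathlib import Path


def missing_cuda_libs(cuda_home="/usr/local/cuda", names=("libcudart.so",)):
    """Return the library paths under <cuda_home>/lib64 that do not exist.

    Hypothetical helper: the default cuda_home is taken from the make error
    message; adjust it if your toolkit lives somewhere else.
    """
    lib_dir = Path(cuda_home) / "lib64"
    return [str(lib_dir / name) for name in names if not (lib_dir / name).exists()]
```

Calling `missing_cuda_libs()` returns an empty list when the file is in place; if it returns the libcudart.so path, the `/usr/local/cuda` link probably does not point at a complete toolkit install.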
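2) Installing the libboost development files
The second problem is
/home/cora/PointCloud/spconv/include/spconv/box_iou.h:21:10: fatal error: boost/geometry.hpp: No such file or directory
On inspection, the cause was that the libboost development files were either not installed or had failed to install, so reinstalling them is enough: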
sudo apt-get install libboost-filesystem-dev
sudo apt-get install libboost-dev
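To confirm that the header the compiler asked for is now present, a small lookup like the one below can help. It is a sketch under assumptions: `find_header` is a hypothetical helper, and the two default directories are just the usual system include locations, not something read from the spconv build.

```python
from pathlib import Path


def find_header(relative_name, include_dirs=("/usr/include", "/usr/local/include")):
    """Return the first existing path for a header, or None if it is not found.

    The default include_dirs are an assumption (common system locations).
    """
    for directory in include_dirs:
        candidate = Path(directory) / relative_name
        if candidate.is_file():
            return str(candidate)
    return None
```

After the apt-get commands above, `find_header("boost/geometry.hpp")` should return a path instead of `None`.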
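After that, you can happily train PV-RCNN!
……………… Divider …………………………
After installing everything as above, it all looked normal, so I happily went to run the demo and hit the following error: Cuda kernel failed: invalid device function, followed by a segmentation fault
Despair……
I had run into this problem before when running PointRCNN. Many blogs say the compute capability is set incorrectly, but I could not find where to set it at all…… In the end (I forget which blog I read) I switched the virtual environment from Python 3.6 to 3.7 and set everything up again, and that fixed it. Overall it comes down to some versions not matching, and my feeling is that it is related to PyTorch.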
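On the question of where the compute capability gets set: extensions built through `torch.utils.cpp_extension` read the `TORCH_CUDA_ARCH_LIST` environment variable, and `torch.cuda.get_device_capability()` reports the `(major, minor)` pair of the installed GPU. The sketch below only formats such pairs into the string that variable expects; `arch_list` is a hypothetical helper, and I have not confirmed that setting the variable fixes this particular error or that spconv's CMake build honours it.

```python
def arch_list(capabilities):
    """Format (major, minor) compute capabilities as a TORCH_CUDA_ARCH_LIST value.

    Hypothetical helper: duplicates are dropped and entries are sorted, then
    joined with ';' (for example "6.1;7.5").
    """
    unique = sorted(set(capabilities))
    return ";".join(f"{major}.{minor}" for major, minor in unique)
```

With a working torch install, something like `arch_list([torch.cuda.get_device_capability(0)])` gives the value to export as `TORCH_CUDA_ARCH_LIST` before rebuilding the CUDA ops.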
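Then I remembered that the PyTorch I had installed was the CUDA 10.1 build, while the CUDA on my machine is 10.0. With that setup torch passed a quick test and was usable, but maybe something still did not match (crying).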
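This kind of mismatch can be spotted by comparing `torch.version.cuda` (the CUDA version PyTorch was built for) with the release that `nvcc --version` prints. The following is a minimal sketch; the function names are made up for illustration, and `report()` assumes both torch and nvcc are available on the machine.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Extract 'major.minor' from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def same_cuda_version(torch_cuda, nvcc_cuda):
    """True only when both versions are known and identical."""
    return torch_cuda is not None and torch_cuda == nvcc_cuda


def report():
    """Print both versions; needs torch installed and nvcc on PATH (assumption)."""
    import subprocess
    import torch

    out = subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout
    local = parse_nvcc_release(out)
    print("torch built for CUDA:", torch.version.cuda)
    print("local nvcc release:  ", local)
    print("match:", same_cuda_version(torch.version.cuda, local))
```

In my first setup this would have shown 10.1 on the torch side and 10.0 on the nvcc side.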
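So I installed CUDA 10.2 as well; see "How to install multiple CUDA versions on Ubuntu" for reference.
Next I deleted the old virtual environment. To save myself trouble I also deleted the project files downloaded from git and started from scratch, but after that git clone kept failing when I tried to download them again (more bitter tears)…… So it should also be fine to keep the project files, as long as you remember to delete the build folder and the .whl file generated when installing spconv, and then rebuild when you install again.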
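The cleanup described above (remove the build folder and the generated .whl, keep everything else) can be sketched as follows. `clean_build_artifacts` is a hypothetical helper; it assumes the build folder sits directly under the project directory, and it defaults to a dry run so nothing is deleted until you ask for it.

```python
import shutil
from pathlib import Path


def clean_build_artifacts(repo_dir, dry_run=True):
    """List (and optionally delete) <repo_dir>/build and any *.whl files below repo_dir.

    Assumption: the build folder is directly under repo_dir.
    """
    repo = Path(repo_dir)
    build_dir = repo / "build"
    targets = [build_dir] if build_dir.is_dir() else []
    # Wheels inside build/ disappear together with that folder, so skip them here.
    targets += sorted(p for p in repo.rglob("*.whl") if build_dir not in p.parents)
    if not dry_run:
        for path in targets:
            if path.is_dir():
                shutil.rmtree(path)
            else:
                path.unlink()
    return [str(path) for path in targets]
```

Run it once with the default `dry_run=True` to see what would go, then again with `dry_run=False` before rebuilding spconv.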
This time I installed PyTorch 1.6.0 built for CUDA 10.2, with the following command:
conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.2 -c pytorch
Then go through the whole earlier procedure once more.
Quick demo:
python demo.py --cfg_file cfgs/kitti_models/pv_rcnn.yaml --ckpt ../pv_rcnn_8369.pth --data_path ../data/kitti/training/velodyne/000002.bin
Problem encountered:
ImportError: Could not import backend for traitsui. Make sure you
have a suitable UI toolkit like PyQt/PySide or wxPython
installed.
My environment is Python 3.7, so I installed PySide2 with the following command:
pip install pyside2
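To see which of the UI toolkits named in the error message are importable in the current environment, a check along these lines works. It is only a sketch: `available_modules` is a hypothetical helper, and the default module names are my guess at the import names for PyQt, PySide and wxPython.

```python
import importlib.util


def available_modules(candidates=("PyQt5", "PySide2", "wx")):
    """Return the candidate top-level modules that can be found in this environment.

    The default candidates are assumed import names, not a list taken from traitsui.
    """
    return [name for name in candidates if importlib.util.find_spec(name) is not None]
```

An empty result before the install and `["PySide2"]` afterwards would match what happened here.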
Another problem:
qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "" even though it was found.
I don't know why, but the fix is:
sudo apt-get install libxcb-xinerama0
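One likely way to find out why the plugin fails to load is to run `ldd` on the xcb plugin library and look for entries marked "not found". The sketch below parses that output; `missing_shared_libs` is a hypothetical helper, and the location of `libqxcb.so` inside your environment is something you would have to look up yourself.

```python
def missing_shared_libs(ldd_output):
    """Return the library names that `ldd` reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            # ldd prints lines of the form "<name> => not found"
            missing.append(line.split("=>")[0].strip())
    return missing
```

Feed it the text printed by `ldd` for the plugin file; a result such as `["libxcb-xinerama.so.0"]` would explain why installing libxcb-xinerama0 helps.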
This time I really can happily run PV-RCNN!!!