Mixed CUDA/C++ Compilation on Windows 10: the PointNet2 Module in PCT

Source code:

GitHub - Strawberry-Eat-Mango/PCT_Pytorch: Pytorch implementation of PCT: Point Cloud Transformer

Problem:

Worked examples of complex mixed CUDA/C++ builds on Windows 10 are hard to find.

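Solution:

Run python setup.py build and, guided by the error messages it prints (missing header files, missing library files, files that cannot be found, cl that cannot be found, and so on), make the changes described below.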

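1. Build environment

Windows 10, VS2017

Reference:

https://zhuanlan.zhihu.com/p/371279126

Set up the VC environment with vcx64.bat, then run set DISTUTILS_USE_SDK=1.

Install ninja. With ninja present, cpp_extension.py takes its Unix-style build path and invokes the nvcc compiler and the cl compiler separately.

Otherwise the build falls back to the gcc compiler, which in my experience causes even more problems on Windows 10.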
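Before running the build, it can help to confirm that the tools mentioned above are actually reachable. The following is a minimal sketch and not part of the original setup: the helper name check_build_env is hypothetical, and it only checks that cl, nvcc and ninja are on PATH and that DISTUTILS_USE_SDK is set to 1. It does not validate tool versions.

```python
import os
import shutil


def check_build_env(tools=("cl", "nvcc", "ninja")):
    """Report which build tools are on PATH and whether DISTUTILS_USE_SDK is set.

    Hypothetical helper for illustration only; it does not check versions.
    """
    report = {tool: shutil.which(tool) is not None for tool in tools}
    report["DISTUTILS_USE_SDK"] = os.environ.get("DISTUTILS_USE_SDK") == "1"
    return report


if __name__ == "__main__":
    for name, ok in check_build_env().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

Run it from the same command prompt in which the VC environment script was executed, since that is where PATH has been extended.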

2. Modifying cpp_extension.py

The original build fails on this command:

cl /showIncludes -mdll -O -Wall -DMS_WIN64 '-IE:\ZTensor\programs\PCT_Pytorch-main\pointnet2_ops_lib\pointnet2_ops\_ext-src\include' '-ID:\ProgramTensor\Anaconda3\envs\pytorch\lib\site-packages\torch\include' '-ID:\ProgramTensor\Anaconda3\envs\pytorch\lib\site-packages\torch\include\torch\csrc\api\include' '-ID:\ProgramTensor\Anaconda3\envs\pytorch\lib\site-packages\torch\include\TH' '-ID:\ProgramTensor\Anaconda3\envs\pytorch\lib\site-packages\torch\include\THC' '-ID:\ProgramTensor\NVIDIA\CUDA\v10.0\include' '-ID:\ProgramTensor\Anaconda3\envs\pytorch\include' '-ID:\ProgramTensor\Anaconda3\envs\pytorch\include' -c -c E:\ZTensor\programs\PCT_Pytorch-main\pointnet2_ops_lib\pointnet2_ops\_ext-src\src\group_points.cpp /FoE:\ZTensor\programs\PCT_Pytorch-main\pointnet2_ops_lib\build\temp.win-amd64-3.7\Release\pointnet2_ops\_ext-src\src\group_points.o -Zi -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_ext -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
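The -I include options are clearly malformed: -I must not appear inside the quotes that delimit the path.

Solution:
In cpp_extension.py, locate the function _write_ninja_file_and_compile_objects (around line 1225). There, cflags holds the cl compile options and post_cflags the cl options appended afterwards; cuda_cflags and cuda_post_cflags are the corresponding options for nvcc.

Before the call to _write_ninja_file, add these command-line string helpers: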

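def formatcflags(cflags):
    # Split each single-quoted '-Ipath' flag into the option and a
    # double-quoted path, then switch every flag to forward slashes.
    temp = []
    for f in cflags:
        if '-ccbin ' in f:
            # -ccbin is passed in through include_dirs, so it arrives as '-I-ccbin path'
            temp.append('-ccbin ')
            f = f.replace('-I-ccbin ', '', 1)
            f = '"' + f[1:-1] + '"'  # drop the outer single quotes
        elif '-I' in f:
            temp.append('-I')
            f = f.replace('-I', '', 1)
            f = '"' + f[1:-1] + '"'  # drop the outer single quotes
        temp.append(f)
    return [f.replace('\\', '/') for f in temp]

def formatcudapostflags(pflags):
    # Strip the redundant single and double quotes around -fPIC (6 characters on each side)
    return [f[6:-6] if '-fPIC' in f else f for f in pflags]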

Then call these functions on cflags, cuda_cflags and cuda_post_cflags:

cflags = formatcflags(cflags)
cuda_cflags = formatcflags(cuda_cflags)
cuda_post_cflags = formatcudapostflags(cuda_post_cflags)
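To make the effect of the helper concrete, here is a small self-contained sketch of the same idea applied to a single flag. It is a simplified variant, not the code patched into cpp_extension.py: the function name split_quoted_include is hypothetical, and it assumes the flag arrives wrapped in single quotes, as in the failing command line above.

```python
def split_quoted_include(flag):
    """Turn "'-Ipath'" into ['-I', '"path"'] with forward slashes.

    Flags that are not single-quoted -I options are returned unchanged.
    Simplified illustration; the name is hypothetical.
    """
    if flag.startswith("'-I") and flag.endswith("'"):
        path = flag[3:-1].replace("\\", "/")
        return ["-I", '"' + path + '"']
    return [flag]


raw = "'-ID:\\ProgramTensor\\NVIDIA\\CUDA\\v10.0\\include'"
print(split_quoted_include(raw))
# ['-I', '"D:/ProgramTensor/NVIDIA/CUDA/v10.0/include"']
print(split_quoted_include("-O3"))
# ['-O3']
```

The option and the path end up as separate items, so -I no longer sits inside the quotes.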

3. Modifying setup.py

Missing header and library files are a frequent problem, so setup.py needs some extra information. Define:

cmd1 = r'-ccbin D:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.16.27023\bin\Hostx64\x64'
vcdir1 = r'C:/Program Files (x86)/Windows Kits/10/Include/10.0.17763.0/shared'
vcdir2 = r'C:/Program Files (x86)/Windows Kits/10/Include/10.0.17763.0/ucrt'
vcdir3 = r'D:/Program Files (x86)/Microsoft Visual Studio/2017/Professional/VC/Tools/MSVC/14.16.27023/include'

Then add them, crudely, to include_dirs:

include_dirs=[osp.join(this_dir, _ext_src_root, "include"), cmd1, vcdir1, vcdir2, vcdir3]

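All of the entries added here are processed by the string helpers described earlier, which turn them into forms the Windows toolchain accepts, such as -I "directory" and -ccbin "directory".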
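As an alternative to hard-coding the header directories, the search paths could be read from the environment. The sketch below was not part of the original setup.py. It assumes the VC environment script has already populated the INCLUDE variable with a semicolon-separated list of directories; the helper name include_dirs_from_env is hypothetical and the example paths are made up.

```python
import os


def include_dirs_from_env(env=None):
    """Return the directories listed in INCLUDE, normalised to forward slashes.

    Hypothetical helper; assumes INCLUDE was set by the VC environment script.
    """
    env = os.environ if env is None else env
    raw = env.get("INCLUDE", "")
    # Windows separates the entries with ';', so split on that explicitly.
    return [p.strip().replace("\\", "/") for p in raw.split(";") if p.strip()]


# Illustrative paths only, not taken from a real installation.
example = {"INCLUDE": r"C:\Kits\10\Include\ucrt;C:\Kits\10\Include\shared;"}
print(include_dirs_from_env(example))
# ['C:/Kits/10/Include/ucrt', 'C:/Kits/10/Include/shared']
```

The returned list could be appended to include_dirs in place of vcdir1, vcdir2 and vcdir3. The -ccbin entry would still have to be supplied separately.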

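4. Linking

After these changes, a single python setup.py build compiles all 9 source files (.cpp and .cu), producing 9 .o files and one .def file, but then fails with a g++ link error. The changes above only alter the compile options by brute force (which at least lets all the files compile in one pass rather than in 9 separate runs). They have no effect on linking, so the build still follows the g++/gcc link command.

Since this is Windows, linking has to go through the link command instead.

For lack of time I did not work out how to steer both compilation and linking onto the proper Windows path. I simply ran link once by hand: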

link -dll E:\ZTensor\programs\PCT_Pytorch-main\pointnet2_ops_lib\build\temp.win-amd64-3.7\Release\pointnet2_ops\_ext-src\src\ball_query.o E:\ZTensor\programs\PCT_Pytorch-main\pointnet2_ops_lib\build\temp.win-amd64-3.7\Release\pointnet2_ops\_ext-src\src\bindings.o E:\ZTensor\programs\PCT_Pytorch-main\pointnet2_ops_lib\build\temp.win-amd64-3.7\Release\pointnet2_ops\_ext-src\src\group_points.o E:\ZTensor\programs\PCT_Pytorch-main\pointnet2_ops_lib\build\temp.win-amd64-3.7\Release\pointnet2_ops\_ext-src\src\interpolate.o E:\ZTensor\programs\PCT_Pytorch-main\pointnet2_ops_lib\build\temp.win-amd64-3.7\Release\pointnet2_ops\_ext-src\src\sampling.o E:\ZTensor\programs\PCT_Pytorch-main\pointnet2_ops_lib\build\temp.win-amd64-3.7\Release\pointnet2_ops\_ext-src\src\ball_query_gpu.o E:\ZTensor\programs\PCT_Pytorch-main\pointnet2_ops_lib\build\temp.win-amd64-3.7\Release\pointnet2_ops\_ext-src\src\group_points_gpu.o E:\ZTensor\programs\PCT_Pytorch-main\pointnet2_ops_lib\build\temp.win-amd64-3.7\Release\pointnet2_ops\_ext-src\src\interpolate_gpu.o E:\ZTensor\programs\PCT_Pytorch-main\pointnet2_ops_lib\build\temp.win-amd64-3.7\Release\pointnet2_ops\_ext-src\src\sampling_gpu.o /libpath:"D:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.16.27023\lib\onecore\x64" /libpath:"D:\ProgramTensor\Anaconda3\libs" /libpath:"C:\Program Files (x86)\Windows Kits\10\Lib\10.0.17763.0\um\x64" /libpath:"C:\Program Files (x86)\Windows Kits\10\Lib\10.0.17763.0\ucrt\x64" -LIBPATH:D:\ProgramTensor\Anaconda3\envs\pytorch\lib\site-packages\torch\lib -LIBPATH:D:\ProgramTensor\NVIDIA\CUDA\v10.0\lib/x64 -LIBPATH:D:\ProgramTensor\Anaconda3\envs\pytorch\libs -LIBPATH:D:\ProgramTensor\Anaconda3\envs\pytorch\PCbuild\amd64 c10.lib torch.lib torch_cpu.lib torch_python.lib cudart.lib c10_cuda.lib torch_cuda.lib python37.lib -OUT:E:\ZTensor\programs\PCT_Pytorch-main\pointnet2_ops_lib\build\lib.win-amd64-3.7\pointnet2_ops\_ext.cp37-win_amd64.pyd

Result of linking:

_ext.cp37-win_amd64.pyd (the dynamic library) and the accompanying .exp and .lib files are generated under the build directory.
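Typing nine object paths by hand is error-prone, so the link command can be assembled programmatically. The following is a minimal sketch under stated assumptions: the function name build_link_command and its parameters are hypothetical, it collects every .o file under a directory, and it reproduces only the option forms used in the command above (-dll, /libpath: and -OUT:).

```python
from pathlib import Path


def build_link_command(obj_dir, lib_dirs, libs, out_path):
    """Assemble a `link -dll` argument list from every .o file under obj_dir.

    Hypothetical helper; only mirrors the options of the manual command.
    """
    objects = sorted(str(p) for p in Path(obj_dir).rglob("*.o"))
    cmd = ["link", "-dll", *objects]
    cmd += [f"/libpath:{d}" for d in lib_dirs]
    cmd += list(libs)
    cmd.append(f"-OUT:{out_path}")
    return cmd
```

The resulting list could be passed to subprocess.run(cmd, check=True) from a prompt where the VC environment is active. Passing a list rather than one long string leaves the quoting of paths that contain spaces to subprocess.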

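5. Running

Copy the whole pointnet2_ops folder from build to the same level as main.py. Python can then find pointnet2_utils under pointnet2_ops at run time, and that module imports the dynamic library as follows: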

import pointnet2_ops._ext as _ext
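The copy step can also be scripted. This is a minimal sketch, assuming the built package sits in a pointnet2_ops folder under the build output directory; the function name install_built_package and its parameters are hypothetical.

```python
import shutil
from pathlib import Path


def install_built_package(build_lib_dir, project_dir, package="pointnet2_ops"):
    """Copy the built package folder next to main.py, replacing any older copy.

    Hypothetical helper; build_lib_dir is the folder that contains `package`.
    """
    src = Path(build_lib_dir) / package
    dst = Path(project_dir) / package
    if dst.exists():
        shutil.rmtree(dst)  # remove a stale copy from an earlier build
    shutil.copytree(src, dst)
    return dst
```

After the copy, running python -c "import pointnet2_ops._ext" from the project directory is a quick way to confirm that the .pyd loads.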

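6. Appendix: compiling file by file

As noted above, you can also compile each file individually and then link them all together. The advantage is that setup.py is bypassed entirely and cpp_extension.py does not need to be touched. The drawback is that every file must be compiled with its own command, although a bat file can run them all in one go.

bindings.cpp packages and exports the function list of the whole extension. After compilation, the exported module entry point is named PyInit_ followed by the extension name.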

[1/9]sampling_gpu.cu
[2/9]group_points_gpu.cu
[3/9]ball_query_gpu.cu
[4/9]interpolate_gpu.cu

[5/9]bindings.cpp
[6/9]group_points.cpp
[7/9]ball_query.cpp
[8/9]sampling.cpp
[9/9]interpolate.cpp

The .cu files are compiled with nvcc:

D:\ProgramTensor\NVIDIA\CUDA\v10.0\bin\nvcc      -ccbin "D:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.16.27023\bin\Hostx64\x64" -IE:\ZTensor\programs\PCT_Pytorch-main\pointnet2_ops_lib\pointnet2_ops\_ext-src\include         -ID:\ProgramTensor\Anaconda3\envs\pytorch\lib\site-packages\torch\include         -ID:\ProgramTensor\Anaconda3\envs\pytorch\lib\site-packages\torch\include\torch\csrc\api\include         -ID:\ProgramTensor\Anaconda3\envs\pytorch\lib\site-packages\torch\include\TH         -ID:\ProgramTensor\Anaconda3\envs\pytorch\lib\site-packages\torch\include\THC         -ID:\ProgramTensor\NVIDIA\CUDA\v10.0\include         -ID:\ProgramTensor\Anaconda3\envs\pytorch\include         -ID:\ProgramTensor\Anaconda3\envs\pytorch\include     -c -c E:\ZTensor\programs\PCT_Pytorch-main\pointnet2_ops_lib\pointnet2_ops\_ext-src\src\sampling_gpu.cu -o E:\ZTensor\programs\PCT_Pytorch-main\pointnet2_ops_lib\build\temp.win-amd64-3.7\Release\pointnet2_ops\_ext-src\src\sampling_gpu.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options         "    "    -fPIC    "    "         -O3 -Xfatbin -compress-all -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_37,code=compute_37 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_62,code=sm_62 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_37,code=sm_37 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_50,code=sm_50 -std=c++14

The .cpp files are compiled with cl:

cl /showIncludes -Wall -DMS_WIN64 -I "C:\Program Files (x86)\Windows Kits\10\Include\10.0.17763.0\shared" -I"C:\Program Files (x86)\Windows Kits\10\Include\10.0.17763.0\ucrt" -I"D:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.16.27023\include" -IE:\ZTensor\programs\PCT_Pytorch-main\pointnet2_ops_lib\pointnet2_ops\_ext-src\include -ID:\ProgramTensor\Anaconda3\envs\pytorch\lib\site-packages\torch\include -I"D:\ProgramTensor\Anaconda3\envs\pytorch\lib\site-packages\torch\include\torch\csrc\api\include" -I"D:\ProgramTensor\Anaconda3\envs\pytorch\lib\site-packages\torch\include\TH" -I"D:\ProgramTensor\Anaconda3\envs\pytorch\lib\site-packages\torch\include\THC" -I"D:\ProgramTensor\NVIDIA\CUDA\v10.0\include" -I"D:\ProgramTensor\Anaconda3\envs\pytorch\include" -I"D:\ProgramTensor\Anaconda3\envs\pytorch\include" -c -c E:\ZTensor\programs\PCT_Pytorch-main\pointnet2_ops_lib\pointnet2_ops\_ext-src\src\sampling.cpp /FoE:\ZTensor\programs\PCT_Pytorch-main\pointnet2_ops_lib\build\temp.win-amd64-3.7\Release\pointnet2_ops\_ext-src\src\sampling.o -Zi -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_ext -D_GLIBCXX_USE_CXX11_ABI=0
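Instead of a bat file, the per-file commands can be generated from a short script. The sketch below is illustrative only: the function name compile_command and its parameters are hypothetical, and it reproduces just the core options of the two commands above (-ccbin, -I, -c, -o for nvcc and -I, -c, /Fo for cl), leaving out the defines and -gencode flags.

```python
from pathlib import Path


def compile_command(src, obj_dir, include_dirs, ccbin=None):
    """Return the argument list for compiling one source file.

    .cu files go to nvcc, everything else to cl. Hypothetical helper that
    omits the -D defines and -gencode options of the full commands.
    """
    src = Path(src)
    obj = Path(obj_dir) / (src.stem + ".o")
    includes = [f"-I{d}" for d in include_dirs]
    if src.suffix == ".cu":
        cmd = ["nvcc"]
        if ccbin:
            cmd += ["-ccbin", ccbin]
        return cmd + includes + ["-c", str(src), "-o", str(obj)]
    return ["cl", *includes, "-c", str(src), f"/Fo{obj}"]
```

Looping over the nine source files and passing each list to subprocess.run would give the same one-shot behaviour as a bat file. The omitted options would need to be appended for a real build.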

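Of course, installing Ubuntu or another Linux system should avoid many of these mixed compile and link problems.

Incidentally, the source program downloads the ModelNet40 dataset in H5 format using commands such as wget, unzip, mv and rm, which a stock Windows installation does not provide: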

#os.system('wget %s; unzip %s' % (www, zipfile))

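You can install wget and the other tools to emulate a Linux environment, or simply switch to commands that Windows already has:

os.system('curl -k -O %s' % www)  # replaces wget
os.system('tar -xf %s' % zipfile)  # replaces unzip
os.system('move %s %s' % (zipfile[:-4], DATA_DIR))  # replaces mv
os.system('del %s' % zipfile)  # replaces rm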
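A third option is to avoid shell commands altogether and use the Python standard library, which behaves the same on Windows and Linux. This is a minimal sketch rather than the project's code: the function names download and unpack_and_move are hypothetical, and it assumes, as the original commands do, that the archive unpacks into a folder named after the zip file.

```python
import os
import shutil
import urllib.request
import zipfile as zf  # aliased because the original code uses `zipfile` as a variable name


def download(url, zip_path):
    """Fetch url to zip_path (replaces wget / curl)."""
    urllib.request.urlretrieve(url, zip_path)


def unpack_and_move(zip_path, data_dir):
    """Extract the archive, move its top-level folder into data_dir, delete the zip."""
    extract_root = os.path.dirname(os.path.abspath(zip_path))
    with zf.ZipFile(zip_path) as archive:
        archive.extractall(extract_root)    # replaces unzip / tar -xf
    folder = os.path.splitext(zip_path)[0]  # same idea as zipfile[:-4]
    os.makedirs(data_dir, exist_ok=True)
    shutil.move(folder, data_dir)           # replaces mv / move
    os.remove(zip_path)                     # replaces rm / del
```

With the variables from the original snippet, the call sequence would be download(www, zipfile) followed by unpack_and_move(zipfile, DATA_DIR).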

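Once VS and CUDA are configured correctly, everything can be compiled in one pass; see

Mixed CUDA/C++ Compilation on Windows 10: Update

Disclaimer: I am a Python beginner, so please go easy on me!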
