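These notes follow a YouTube tutorial series by AI葵.
To make later study easier, I uploaded the series to my Bilibili account, which readers in mainland China can access.
Code on GitHub
If GitHub is unreachable, the Gitee mirror can be used instead.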
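PyTorch parallelizes computation automatically. For example, every input in a batch goes through the same computation in parallel, producing multiple outputs. This only works when the computation is identical, i.e. every example runs through the same model with the same parameters. When the computation differs between elements, ordinary Python code cannot be parallelized this way, so PyTorch has to be extended with C++ + CUDA. This matches the parallel scenario in my own research, so the C++ extension route is the right choice, and this tutorial happens to use exactly such a scenario (trilinear interpolation) as its example. C++ + CUDA also allows fusion of operations, which speeds things up by cutting redundant work.
The relationship between PyTorch, C++ and CUDA:
PyTorch -> C++ (bridge) -> CUDA (parallel computation, the main focus)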
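For a detailed introduction to trilinear interpolation, see: https://blog.csdn.net/weixin_42546737/article/details/110850247
I did not fully follow it, but put simply: a cube has eight vertices, and using them as reference points, any point inside the cube can be expressed in terms of them. Qualitatively, the closer the point is to a reference vertex, the larger that vertex's coefficient.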
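The rule described above can be sketched in a few lines of plain Python. This is only an illustration, not the tutorial's CUDA kernel: it assumes the point is given in coordinates normalized to the unit cube (u, v, w in [0, 1]), and the names `trilinear_weights`, `trilinear_interpolate` and `corner_values` are made up for this example.

```python
from itertools import product


def trilinear_weights(u, v, w):
    """Return the weight of each of the 8 corners for a point in [0, 1]^3.

    Corner (i, j, k), with i, j, k in {0, 1}, gets the product of three
    per-axis weights; each per-axis weight is 1 minus the distance to the
    corner along that axis, so closer corners get larger weights.
    """
    weights = {}
    for i, j, k in product((0, 1), repeat=3):
        wu = u if i == 1 else 1.0 - u
        wv = v if j == 1 else 1.0 - v
        ww = w if k == 1 else 1.0 - w
        weights[(i, j, k)] = wu * wv * ww
    return weights


def trilinear_interpolate(corner_values, u, v, w):
    """Blend the 8 corner values using the trilinear weights."""
    weights = trilinear_weights(u, v, w)
    return sum(weights[c] * corner_values[c] for c in weights)


# Hypothetical corner features: the value at corner (i, j, k) is i + j + k.
corner_values = {c: float(sum(c)) for c in product((0, 1), repeat=3)}
center = trilinear_interpolate(corner_values, 0.5, 0.5, 0.5)
```

The eight weights always sum to 1, and at a corner the interpolation returns that corner's value exactly.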
Setting up the environment with Anaconda. Include paths:
"/home/dell/anaconda3/envs/python36_deep_learning_mcmc/include/python3.6m",
"/home/dell/anaconda3/envs/python36_deep_learning_mcmc/lib/python3.6/site-packages/torch/include",
"/home/dell/anaconda3/envs/python36_deep_learning_mcmc/lib/python3.6/site-packages/torch/include/torch/csrc/api/include"
],
Tutorial version:
// C++ wrapper that forwards to the CUDA implementation
#include "utils.h"
#include <torch/extension.h>

torch::Tensor trilinear_interpolation(
    torch::Tensor feats,
    torch::Tensor points
){
    // CHECK_INPUT comes from utils.h
    CHECK_INPUT(feats);
    CHECK_INPUT(points);
    return trilinear_fw_cu(feats, points);
}

// Expose the C++ function to Python
PYBIND11_MODULE(TORCH_EXTENSION_NAME, m){
    m.def("trilinear_interpolation", &trilinear_interpolation);
}
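My simplified version (C++ only; it returns feats unchanged, just to test the build):
#include <torch/extension.h>

torch::Tensor trilinear_interpolation(
    torch::Tensor feats,
    torch::Tensor points
){
    return feats;
}

// Exposed to Python under the name "trilinear", with docstring "test"
PYBIND11_MODULE(TORCH_EXTENSION_NAME, m){
    m.def("trilinear", &trilinear_interpolation, "test");
}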
Create setup.py
Tutorial version:
import glob
import os.path as osp
from setuptools import setup
from torch.utils.cpp_extension import CUDAExtension, BuildExtension

ROOT_DIR = osp.dirname(osp.abspath(__file__))
include_dirs = [osp.join(ROOT_DIR, "include")]
# Collect every C++ and CUDA source file in the current directory
sources = glob.glob('*.cpp') + glob.glob('*.cu')

setup(
    name='cppcuda_tutorial',
    version='1.0',
    author='kwea123',
    author_email='[email protected]',
    description='cppcuda_tutorial',
    long_description='cppcuda_tutorial',
    ext_modules=[
        CUDAExtension(
            name='cppcuda_tutorial',
            sources=sources,
            include_dirs=include_dirs,
            extra_compile_args={'cxx': ['-O2'],
                                'nvcc': ['-O2']}
        )
    ],
    cmdclass={
        'build_ext': BuildExtension
    }
)
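One detail worth noting in the tutorial's setup.py is that `glob.glob('*.cpp')` searches the current working directory, not necessarily the directory containing setup.py. The sketch below is one possible way to make the source collection independent of where the script is launched from; `collect_sources` is a hypothetical helper, and the file names in the demo are placeholders created in a temporary directory.

```python
import glob
import os.path as osp
import tempfile


def collect_sources(root_dir):
    """Return the .cpp and .cu files directly under root_dir.

    Paths are returned relative to root_dir and sorted, so the result does
    not depend on the current working directory.
    """
    sources = []
    for pattern in ("*.cpp", "*.cu"):
        for path in glob.glob(osp.join(root_dir, pattern)):
            sources.append(osp.relpath(path, root_dir))
    return sorted(sources)


# Demo on a throwaway directory with placeholder file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("interpolation.cpp", "interpolation_kernel.cu", "README.md"):
        open(osp.join(tmp, name), "w").close()
    found = collect_sources(tmp)
```

In a setup.py this could be called as `collect_sources(ROOT_DIR)`. Whether that is preferable to the tutorial's shorter form depends on how the build is invoked.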
My version (C++ only, so it uses CppExtension instead of CUDAExtension):
from setuptools import setup
from torch.utils.cpp_extension import CppExtension, BuildExtension

setup(
    name="cppcuda_tutorial",
    version="1.0",
    author="guifengye",
    author_email="[email protected]",
    description="cpp pytorch example",
    long_description="cpp pytorch example",
    ext_modules=[
        CppExtension(
            name="cppcuda_tutorial",
            sources=['interpolation.cpp']
        )
    ],
    cmdclass={
        'build_ext': BuildExtension
    }
)
Installing the C++ extension
Open a terminal (Ctrl + `) and run:
pip install .
(python36_deep_learning_mcmc) dell@dell-Precision-7920-Tower:~/pytorch_c++_cuda/example_1$ pip install .
Processing /home/dell/pytorch_c++_cuda/example_1
Preparing metadata (setup.py) ... done
Building wheels for collected packages: cppcuda-tutorial
Building wheel for cppcuda-tutorial (setup.py) ... done
Created wheel for cppcuda-tutorial: filename=cppcuda_tutorial-1.0-cp36-cp36m-linux_x86_64.whl size=2396295 sha256=1f552e7008febae29806ec521c630e15f80922c051149f6a5681964c8f8f2203
Stored in directory: /tmp/pip-ephem-wheel-cache-abyozr1_/wheels/c7/42/69/6d65223c40179208b30427d8c9811191045997975887fc8e13
Successfully built cppcuda-tutorial
Installing collected packages: cppcuda-tutorial
Attempting uninstall: cppcuda-tutorial
Found existing installation: cppcuda-tutorial 1.0
Uninstalling cppcuda-tutorial-1.0:
Successfully uninstalled cppcuda-tutorial-1.0
Successfully installed cppcuda-tutorial-1.0
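After an install like the one logged above, a quick check from Python can confirm that the package is visible to the interpreter. The sketch below uses only the standard library; `is_importable` is a hypothetical helper name. It only locates the module without loading it, so it would not catch errors that occur while the shared library is actually imported.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module with this name can be located."""
    return importlib.util.find_spec(module_name) is not None


# A standard-library module is always found; a made-up name is not.
found_json = is_importable("json")
found_missing = is_importable("no_such_module_abc123")
```

In the environment used here, `is_importable("cppcuda_tutorial")` would be the corresponding check.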
There are two ways to build and use the extension.
Option 1: import the package installed with pip install .
import torch  # import torch before the extension module
import cppcuda_tutorial
feats = torch.ones(2)
point = torch.ones(2)
out = cppcuda_tutorial.trilinear(feats, point)
print(out)
Option 2: JIT-compile at runtime with torch.utils.cpp_extension.load
import torch
from torch.utils.cpp_extension import load
cppcuda = load(name="test", sources=['interpolation.cpp'], verbose=False,extra_cflags=["-O2"])
feats = torch.ones(2)
point = torch.ones(2)
out = cppcuda.trilinear(feats, point)
print(out)
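Error I ran into:
ImportError: dynamic module does not define module export function (PyInit_cppcuda_tutorial)
This is most likely a naming problem. Python looks for an init function named after the module it is importing, so the name the extension was built with has to match the name being imported. To be safe, give the builds made in different places different names.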
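To make the naming rule concrete: when CPython imports an extension module, it looks in the shared library for a function called `PyInit_` followed by the module name. The sketch below derives that expected symbol from a file name. `expected_init_symbol` and `names_match` are hypothetical helpers, and the file names are examples rather than files taken from this build.

```python
import os.path as osp


def expected_init_symbol(extension_path):
    """Return the init symbol CPython would look for in this extension file.

    The module name is the file name up to the first dot, e.g.
    'foo.cpython-36m-x86_64-linux-gnu.so' -> module 'foo'.
    """
    module_name = osp.basename(extension_path).split(".", 1)[0]
    return "PyInit_" + module_name


def names_match(extension_path, built_module_name):
    """Check whether a module built under built_module_name fits this file."""
    return expected_init_symbol(extension_path) == "PyInit_" + built_module_name


symbol = expected_init_symbol("cppcuda_tutorial.cpython-36m-x86_64-linux-gnu.so")
ok = names_match("cppcuda_tutorial.cpython-36m-x86_64-linux-gnu.so", "cppcuda_tutorial")
mismatch = names_match("cppcuda_tutorial.cpython-36m-x86_64-linux-gnu.so", "test")
```

With `PYBIND11_MODULE(TORCH_EXTENSION_NAME, m)`, the built name comes from the `name` passed to the extension in setup.py or to `load`, which is presumably why a library built under one name but imported under another produces the error above.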