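With the standard, officially recommended exe installer, inference through the cv::dnn module is extremely slow, and the following message is printed: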
setUpNet DNN module was not built with CUDA backend; switching to CPU
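Even with the acceleration code below it is still extremely slow: a segmentation network needs about 1 s for a 512x512 image on a 2080 Ti (compute capability 7.5):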
this->loc_net.setPreferableBackend(cv::dnn::DNN_BACKEND_CUDA);
this->loc_net.setPreferableTarget(cv::dnn::DNN_TARGET_CUDA);
So OpenCV has to be built from source with the CUDA acceleration module enabled.
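Before rebuilding, it can be worth confirming that the installed binaries really lack CUDA. The sketch below scans an OpenCV build summary for the CUDA line. The function name buildInfoReportsCuda is hypothetical, and the exact line format ("NVIDIA CUDA: YES (ver ...)") is an assumption about what the summary looks like, so treat this as an illustration rather than a robust parser.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Returns true if the build summary reports CUDA as enabled.
// Assumption: a CUDA-enabled build prints a line such as
//   "  NVIDIA CUDA:                   YES (ver 11.4, CUFFT CUBLAS FAST_MATH)"
bool buildInfoReportsCuda(const std::string& buildInfo) {
    std::istringstream lines(buildInfo);
    std::string line;
    while (std::getline(lines, line)) {
        const std::size_t key = line.find("NVIDIA CUDA:");
        if (key == std::string::npos) continue;  // not the CUDA line
        // Only inspect the value that follows the key on the same line.
        return line.find("YES", key) != std::string::npos;
    }
    return false;  // no CUDA line at all: treat as a CPU-only build
}
```

In a program linked against OpenCV, the summary text comes from cv::getBuildInformation() (declared in opencv2/core/utility.hpp), so the call would be buildInfoReportsCuda(cv::getBuildInformation()). cv::cuda::getCudaEnabledDeviceCount() is another quick check: it returns 0 when OpenCV was compiled without CUDA support.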
(1) Windows 10, Visual Studio 2019 (the build failed with 2015), CUDA 10.2, cuDNN 8.0.3, OpenCV 4.5.0, opencv-contrib 4.5.0.
(2) Or: Visual Studio 2019 + CUDA 11.4 + cuDNN 8.2.2 (8.0.3) + OpenCV 4.5.0 + opencv-contrib 4.5.0
(3) Visual Studio 2019 + CUDA 11.1 + cuDNN 8.0.3 + OpenCV 4.5.0 + opencv-contrib 4.5.0
Download OpenCV 4.5.0:
https://opencv.org/releases/page/2/
Download opencv-contrib 4.5.0:
https://github.com/opencv/opencv_contrib/archive/refs/tags/4.5.0.zip
Download the .cache folder (the archive is complete and contains ade, data, ffmpeg, ippicv, nvidia_optica):
https://download.csdn.net/download/jizhidexiaoming/82456363
Download and install CMake 3.23:
https://github.com/Kitware/CMake/releases/download/v3.23.0-rc2/cmake-3.23.0-rc2-windows-x86_64.msi
In CMake, set the source path and the build (destination) path.
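Click Configure.
Wait a few minutes.
(1) Check the CUDA option (WITH_CUDA).
(2) Set the opencv_contrib path (OPENCV_EXTRA_MODULES_PATH).
(3) Check BUILD_opencv_world.
(4) Uncheck the python, test and java entries to speed up the later build.
Click Configure again.
(5) Set CUDA_ARCH_BIN to the compute capability of the GPU.
Compute capability lookup:
https://developer.nvidia.com/zh-cn/cuda-gpus
Alternatively, set CUDA_GENERATION to Auto.
(6) Check CUDA_FAST_MATH.
(7) Uncheck setupvars.
Click Configure, then click Generate.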
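The GUI choices above can also be written down as a command line, which makes the configuration easier to repeat. The helper below is only a sketch that assembles such a command as a string; the function name opencvCudaCmakeCommand and all paths are hypothetical. The mapping of "python, test, java" to BUILD_opencv_python3, BUILD_TESTS, BUILD_PERF_TESTS and BUILD_JAVA, and of "setupvars" to OPENCV_GENERATE_SETUPVARS, is my reading of the steps, not something the steps state.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Assembles a cmake command line mirroring the GUI steps (1) to (7).
// All directory arguments are placeholders chosen by the caller.
std::string opencvCudaCmakeCommand(const std::string& sourceDir,
                                   const std::string& buildDir,
                                   const std::string& contribModulesDir,
                                   const std::string& archBin) {
    assert(!archBin.empty() && "pass a value such as 7.5");
    const std::vector<std::pair<std::string, std::string>> options = {
        {"WITH_CUDA", "ON"},                              // step (1)
        {"OPENCV_EXTRA_MODULES_PATH", contribModulesDir}, // step (2)
        {"BUILD_opencv_world", "ON"},                     // step (3)
        {"BUILD_opencv_python3", "OFF"},                  // step (4), assumed mapping
        {"BUILD_TESTS", "OFF"},                           // step (4), assumed mapping
        {"BUILD_PERF_TESTS", "OFF"},                      // step (4), assumed mapping
        {"BUILD_JAVA", "OFF"},                            // step (4), assumed mapping
        {"CUDA_ARCH_BIN", archBin},                       // step (5)
        {"CUDA_FAST_MATH", "ON"},                         // step (6)
        {"OPENCV_GENERATE_SETUPVARS", "OFF"},             // step (7), assumed mapping
    };
    std::string cmd = "cmake -G \"Visual Studio 16 2019\" -A x64";
    cmd += " -S \"" + sourceDir + "\" -B \"" + buildDir + "\"";
    for (const auto& opt : options) {
        // Quote each definition so paths with spaces and ';' lists survive the shell.
        cmd += " \"-D" + opt.first + "=" + opt.second + "\"";
    }
    return cmd;
}
```

Printing the result of opencvCudaCmakeCommand("C:/opencv-4.5.0", "C:/opencv-4.5.0/build", "C:/opencv_contrib-4.5.0/modules", "7.5") gives a command that can be pasted into a developer prompt. The steps above do not mention OPENCV_DNN_CUDA; that option controls the CUDA backend of the dnn module, so it is worth checking that it is enabled in the CMake cache as well.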
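At this point, searching the generated folder for opencv_world450d.lib finds nothing; the solution still has to be built in Visual Studio.
Open opencv.sln in VS2019 and build it.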
Menu bar -> Build -> Configuration Manager: check INSTALL.
Click Build Solution. After about two hours the build succeeds.
(1) The 2080 Ti has compute capability 7.5. An OpenCV build compiled on the 2080 Ti fails when run on a 3090 (compute capability 8.6):
OpenCV(4.5.0) Error: No CUDA support (OpenCV was not built to work with the selected device.
Noted for the record; the cause is still unknown. On to step 2.
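One plausible explanation, which the notes above do not confirm, is that a build with CUDA_ARCH_BIN=7.5 and no PTX simply contains no code that a compute capability 8.6 device can run. The sketch below models NVIDIA's general compatibility rule: binary code runs on devices with the same major version and an equal or higher minor version, while PTX can be JIT-compiled for any device of equal or higher compute capability. The type and function names are hypothetical, and this is a simplified model, not OpenCV's own check.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

struct ComputeCapability {
    int majorVer;
    int minorVer;
};

// Parses a CUDA_ARCH_BIN-style list such as "7.5;8.6".
// Semicolons, commas and spaces are all accepted as separators.
std::vector<ComputeCapability> parseArchList(const std::string& list) {
    std::string normalized = list;
    for (char& c : normalized) {
        if (c == ';' || c == ',') c = ' ';
    }
    std::vector<ComputeCapability> out;
    std::istringstream in(normalized);
    std::string token;
    while (in >> token) {
        const std::size_t dot = token.find('.');
        assert(dot != std::string::npos && "expected entries like 7.5");
        out.push_back({std::stoi(token.substr(0, dot)),
                       std::stoi(token.substr(dot + 1))});
    }
    return out;
}

// Simplified compatibility model (an assumption, see the text above).
bool buildCoversDevice(const std::vector<ComputeCapability>& bin,
                       const std::vector<ComputeCapability>& ptx,
                       ComputeCapability device) {
    for (const ComputeCapability& b : bin) {
        // Binary code: same major version, minor not newer than the device.
        if (b.majorVer == device.majorVer && b.minorVer <= device.minorVer) return true;
    }
    for (const ComputeCapability& p : ptx) {
        // PTX: any version not newer than the device can be JIT-compiled.
        if (p.majorVer < device.majorVer ||
            (p.majorVer == device.majorVer && p.minorVer <= device.minorVer)) return true;
    }
    return false;
}
```

Under this model, buildCoversDevice(parseArchList("7.5"), {}, {8, 6}) is false, which matches the error on the 3090, while adding 8.6 to the binary list or 7.5 to the PTX list makes it true.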
(2) Building OpenCV on the 3090 with CUDA_ARCH_BIN = 7.5;8.6 fails during the VS2019 build:
CMake Error at cuda_compile_1_generated_gpu_mat.cu.obj.Debug.cmake:22.
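Noted for the record; the cause is still unknown. On to step 3.
(3) Building OpenCV on the 3090 with CUDA_GENERATION = Auto: after clicking Configure, CUDA_ARCH_BIN is detected automatically as 8.6.
Continuing.
(4) Building OpenCV on the 3090 (with CUDA_ARCH_BIN = 8.6 at this point) fails with: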
nvcc fatal : Unsupported gpu architecture 'compute_86'
LINK : fatal error LNK1104: cannot open file '..\..\lib\Debug\opencv_world450d.lib'
The link below suggests upgrading CUDA (currently 10.2; 11.4 is suggested):
https://stackoverflow.com/questions/69865825/nvcc-fatal-unsupported-gpu-architecture-compute-86
After the upgrade, the build succeeded.
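The upgrade helps because each nvcc release only knows the architectures that existed when it shipped: compute_86 is accepted from CUDA 11.1 on, so CUDA 10.2 rejects it. The sketch below encodes that rule for the three architectures relevant here. The names ToolkitVersion and nvccSupportsArch are hypothetical, and the table is deliberately incomplete. The CMake error in step (2), which also requested 8.6 under CUDA 10.2, may have the same cause, although the notes do not confirm it.

```cpp
#include <cassert>
#include <map>
#include <utility>

struct ToolkitVersion {
    int majorVer;
    int minorVer;
};

// Returns true if an nvcc of the given toolkit version accepts the architecture.
// Only compute capabilities 7.5, 8.0 and 8.6 are listed; any other value
// makes std::map::at throw std::out_of_range.
bool nvccSupportsArch(ToolkitVersion toolkit, int archMajor, int archMinor) {
    assert(archMajor >= 0 && archMinor >= 0);
    // Key: compute capability. Value: first CUDA toolkit that supports it.
    static const std::map<std::pair<int, int>, std::pair<int, int>> firstToolkit = {
        {{7, 5}, {10, 0}},  // Turing, e.g. RTX 2080 Ti
        {{8, 0}, {11, 0}},  // Ampere, e.g. A100
        {{8, 6}, {11, 1}},  // Ampere, e.g. RTX 3090
    };
    const std::pair<int, int> required = firstToolkit.at({archMajor, archMinor});
    return std::make_pair(toolkit.majorVer, toolkit.minorVer) >= required;
}
```

With this table, nvccSupportsArch({10, 2}, 8, 6) is false and nvccSupportsArch({11, 4}, 8, 6) is true, which is consistent with the build failing on CUDA 10.2 and succeeding after the upgrade.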
(5) Continuing from step 4: testing after the successful build shows this problem:
initCUDABackend CUDA backend will fallback to the CPU implementation for
Cause unknown. Still to be resolved.
(6)cuda\vc140.pdb, has an obsolete format, delete it and recompile
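This error appeared when building the debug configuration right after the release configuration had built successfully. A likely cause is a conflict with files generated by the release build. Fix: delete all files generated by the release build and rebuild. A more thorough option is to repeat the whole procedure above, that is, regenerate opencv.sln with CMake and then build the debug configuration again.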