CUDA compile error: ptxas fatal : Unresolved extern function xxxx

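I ran into this error. The cause turned out to be that a __global__ function called a __device__ function, but the two functions were not defined in the same source file.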

http://stackoverflow.com/questions/31006581/cuda-device-unresolved-extern-function


The issue is that you defined a __device__ function in a separate compilation unit from the __global__ function that calls it. You need to either explicitly enable relocatable device code mode by adding the -dc flag or move your definition to the same unit.

From nvcc documentation:

--device-c|-dc Compile each .c/.cc/.cpp/.cxx/.cu input file into an object file that contains relocatable device code. It is equivalent to --relocatable-device-code=true --compile.

See Separate Compilation and Linking of CUDA C++ Device Code for more information.
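
The situation described in the answer can be sketched roughly as follows. This is only a minimal illustration, not the code from the original question: the file names device_funcs.cu and kernel.cu and the functions square, square_kernel and square_on_gpu are all hypothetical. The listing is shown as one block so it can be compiled as a single file; the comments mark where it would be split to reproduce the error.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// ---- device_funcs.cu (hypothetical file name) ----
// Definition of the __device__ function.
__device__ float square(float x) { return x * x; }

// ---- kernel.cu (hypothetical file name) ----
// When the code is split, only this declaration is visible to the kernel;
// each file would then also need its own #include lines.
__device__ float square(float x);

__global__ void square_kernel(float *v) { *v = square(*v); }

// Host helper: runs the kernel on one value and returns the result.
float square_on_gpu(float x) {
    float *d = nullptr;
    cudaMalloc(&d, sizeof(float));
    cudaMemcpy(d, &x, sizeof(float), cudaMemcpyHostToDevice);
    square_kernel<<<1, 1>>>(d);
    cudaMemcpy(&x, d, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d);
    return x;
}

int main() {
    printf("%f\n", square_on_gpu(3.0f));
    return 0;
}

// Build as two files with relocatable device code (Linux object names assumed):
//   nvcc -dc device_funcs.cu kernel.cu
//   nvcc device_funcs.o kernel.o -o app
// Compiling the split kernel.cu without -dc / -rdc=true is what triggers
// "ptxas fatal : Unresolved extern function".
```

Kept as a single file, the example builds with a plain nvcc invocation, which corresponds to the "move your definition to the same unit" option.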



http://stackoverflow.com/questions/17188527/cuda-external-class-linkage-and-unresolved-extern-function-in-ptxas-file


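So there are two ways to fix it.

The first is to put both functions in the same .cu file.

The second is to open the property page of the .cu file and set CUDA C/C++ -> Common -> Generate Relocatable Device Code to -rdc=true, which enables relocatable device code compilation. Alternatively, set -rdc=true in the CUDA C/C++ settings of the whole project.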




With that, the problem is solved.


Other references

https://devtalk.nvidia.com/default/topic/524436/how-to-deal-with-ptxas-fatal-error-unresolved-extern-function-39-cudagetparameterbuffer-39-/

1) View -> Property Pages
2) Configuration Properties -> CUDA C/C++ -> Common -> Generate Relocatable Device Code -> Yes (-rdc=true)
3) Configuration Properties -> CUDA C/C++ -> Code Generation -> compute_35,sm_35
4) Configuration Properties -> Linker -> Input -> Additional Dependencies -> cudadevrt.lib
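
The steps above are for the case in the linked thread, where the unresolved symbol (cudaGetParameterBuffer) comes from launching a kernel from inside another kernel. Below is a small sketch of that kind of code with an assumed command-line equivalent of the four settings. The names child, parent and sum_from_child and the file name dynpar.cu are hypothetical, and recent toolkits may no longer accept sm_35, in which case substitute the architecture of your own GPU.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Child kernel: each thread writes twice its index.
__global__ void child(int *out) { out[threadIdx.x] = threadIdx.x * 2; }

// Parent kernel: launches the child from device code. This device-side
// launch is what requires -rdc=true and the cudadevrt library.
__global__ void parent(int *out) { child<<<1, 4>>>(out); }

// Host helper: runs the parent kernel and sums the four values it produced.
int sum_from_child() {
    int h[4] = {0, 0, 0, 0};
    int *d = nullptr;
    cudaMalloc(&d, sizeof(h));
    cudaMemset(d, 0, sizeof(h));
    parent<<<1, 1>>>(d);
    // The copy waits for the parent grid, which in turn finishes only
    // after its child grid has finished.
    cudaMemcpy(h, d, sizeof(h), cudaMemcpyDeviceToHost);
    cudaFree(d);
    return h[0] + h[1] + h[2] + h[3];
}

int main() {
    printf("%d\n", sum_from_child());
    return 0;
}

// Assumed command-line equivalent of steps 2) to 4):
//   nvcc -arch=sm_35 -rdc=true dynpar.cu -o dynpar -lcudadevrt
```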
