CUDNN_STATUS_BAD_PARAM error when integrating the darknet dynamic library into our company's software

For more articles, see: Building your own darknet classification-prediction dynamic library

The error:

[5956]  MPTLOG 12256 cuDNN status Error in: file: convolutional_layer.c : cudnn_convolutional_setup() : line: 237 : build time: Dec 13 2019 - 11:54:32 status:3

status=3 is CUDNN_STATUS_BAD_PARAM.
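The numeric status in the log can be mapped back to a name with a small lookup. The sketch below assumes the cuDNN 7.x numbering of cudnnStatus_t (later major versions renumber the codes) and covers only codes 0 to 9; the names cudnn7_status_name and kCudnn7StatusNames are hypothetical helpers, not part of darknet or cuDNN. When the cuDNN headers are available, the library's own cudnnGetErrorString() does the same job.

```c
#include <stddef.h>

/* Names of the cuDNN 7.x status codes, indexed by numeric value.
 * Assumption: values follow the cudnnStatus_t enum in the cuDNN 7 cudnn.h.
 * Codes above 9 are omitted from this sketch. */
static const char *const kCudnn7StatusNames[] = {
    "CUDNN_STATUS_SUCCESS",          /* 0 */
    "CUDNN_STATUS_NOT_INITIALIZED",  /* 1 */
    "CUDNN_STATUS_ALLOC_FAILED",     /* 2 */
    "CUDNN_STATUS_BAD_PARAM",        /* 3 */
    "CUDNN_STATUS_INTERNAL_ERROR",   /* 4 */
    "CUDNN_STATUS_INVALID_VALUE",    /* 5 */
    "CUDNN_STATUS_ARCH_MISMATCH",    /* 6 */
    "CUDNN_STATUS_MAPPING_ERROR",    /* 7 */
    "CUDNN_STATUS_EXECUTION_FAILED", /* 8 */
    "CUDNN_STATUS_NOT_SUPPORTED",    /* 9 */
};

/* Return the name for a status code, or "UNKNOWN_STATUS" if out of range. */
const char *cudnn7_status_name(int status)
{
    size_t count = sizeof(kCudnn7StatusNames) / sizeof(kCudnn7StatusNames[0]);
    if (status < 0 || (size_t)status >= count) {
        return "UNKNOWN_STATUS";
    }
    return kCudnn7StatusNames[status];
}
```

Calling cudnn7_status_name(3) returns the string "CUDNN_STATUS_BAD_PARAM", which matches the status:3 in the log line.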

The code that raises the error:

#if((CUDNN_MAJOR*10 + CUDNN_MINOR) >= 72)   // cuDNN >= 7.2
    CHECK_CUDNN(cudnnSetConvolutionMathType(l->convDesc, CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION));
#endif
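The guard is evaluated by the preprocessor, so whether the call exists at all depends on the cuDNN headers used at build time. As a small illustration of the arithmetic, the sketch below repeats the same expression as ordinary functions; cudnn_version_code and math_type_call_compiled_in are hypothetical names used only here.

```c
/* Same arithmetic as the darknet guard:
 *   (CUDNN_MAJOR*10 + CUDNN_MINOR) >= 72
 * written as functions so the result can be inspected for any version. */
int cudnn_version_code(int major, int minor)
{
    return major * 10 + minor;
}

/* Returns 1 if the cudnnSetConvolutionMathType() call would be compiled in
 * for headers of the given version, 0 otherwise. */
int math_type_call_compiled_in(int major, int minor)
{
    return cudnn_version_code(major, minor) >= 72;
}
```

For cuDNN 7.1 the code is 71 and the call is left out; for 7.2 and later it is 72 or more and the call is compiled in.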

Understanding the code:

For the supported GPUs, the Tensor Core operations will be triggered for convolution functions only when cudnnSetConvolutionMathType() is called on the appropriate convolution descriptor by setting the mathType to CUDNN_TENSOR_OP_MATH or CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION.

3.180. cudnnSetConvolutionMathType()
cudnnStatus_t cudnnSetConvolutionMathType(
    cudnnConvolutionDescriptor_t    convDesc,
    cudnnMathType_t                 mathType)
This function allows the user to specify whether or not the use of tensor op is permitted in the library routines associated with a given convolution descriptor.

Returns
CUDNN_STATUS_SUCCESS
The math type was set successfully.

CUDNN_STATUS_BAD_PARAM
Either an invalid convolution descriptor was provided or an invalid math type was specified.

  • A new mode CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION is added to cudnnMathType_t. The computation time for FP32 tensors can be reduced by selecting this mode.
  • The functions cudnnRNNForwardInference(), cudnnRNNForwardTraining(), cudnnRNNBackwardData(), and cudnnRNNBackwardWeights() will now perform down conversion of FP32 input/output only when CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION is set.
  • Improved the heuristics for cudnnGet*Algorithm() functions.

The following issues and limitations exist in this release:

  • When tensor cores are enabled in cuDNN 7.3.0, the wgrad calculations will perform an illegal memory access when K and C values are both non-integral multiples of 8. This will not likely produce incorrect results, but may corrupt other memory depending on the user buffer locations. This issue is present on Volta & Turing architectures.
  • Using cudnnGetConvolution*_v7 routines with cudnnConvolutionDescriptor_t set to CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION leads to incorrect outputs. These incorrect outputs will consist only of CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION cases, instead of also returning the performance results for both DEFAULT_MATH and CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION cases.

 

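In Visual Studio, removing the CUDNN macro from the project's C/C++ preprocessor definitions makes the run-time error disappear;

but with CUDNN removed, the predictions come out as nan.

The failing call only enables an accelerated math mode for the convolutions, so commenting it out solves the problem.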

#if((CUDNN_MAJOR*10 + CUDNN_MINOR) >= 72)   // cuDNN >= 7.2
    //CHECK_CUDNN(cudnnSetConvolutionMathType(l->convDesc, CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION));
#endif
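An alternative to commenting the call out, which is not what this post did, is to treat a failure of this one call as non-fatal: try to enable the fast math mode and continue without it if the call is rejected. The sketch below uses stand-in types so that it compiles without the cuDNN SDK. In darknet the setter would be cudnnSetConvolutionMathType() called directly instead of through CHECK_CUDNN; every name here (status_t, try_enable_fast_math, the two fake setters) is hypothetical.

```c
#include <stdio.h>

/* Stand-ins so the sketch builds without cuDNN. In real code these would be
 * cudnnStatus_t, cudnnConvolutionDescriptor_t and cudnnMathType_t. */
typedef int   status_t;     /* 0 means success, like CUDNN_STATUS_SUCCESS */
typedef void *conv_desc_t;
typedef int   math_type_t;
typedef status_t (*set_math_fn)(conv_desc_t, math_type_t);

/* Try to enable the faster math mode. On failure, print a warning and let
 * the caller continue without it instead of aborting the program.
 * Returns 1 if the mode was accepted, 0 otherwise. */
int try_enable_fast_math(set_math_fn set_math, conv_desc_t desc,
                         math_type_t fast_mode)
{
    status_t status = set_math(desc, fast_mode);
    if (status != 0) {
        fprintf(stderr,
                "warning: setting the math type failed (status %d); "
                "continuing without it\n", status);
        return 0;
    }
    return 1;
}

/* Fake setters for trying the helper without a GPU. */
status_t fake_set_math_ok(conv_desc_t d, math_type_t m)
{
    (void)d; (void)m;
    return 0;   /* success */
}

status_t fake_set_math_bad_param(conv_desc_t d, math_type_t m)
{
    (void)d; (void)m;
    return 3;   /* same value as CUDNN_STATUS_BAD_PARAM in cuDNN 7.x */
}
```

With this shape, machines where the call succeeds keep the acceleration, while the environment that returns status 3 only logs a warning.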

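Possible causes:

1. Something else is using the GPU at the same time, which triggers the error

2. A parameter is wrong and invalid data is accessed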

Official documentation: https://docs.nvidia.com/deeplearning/sdk/cudnn-archived/cudnn_701/cudnn-user-guide/index.html

CUDNN_STATUS_BAD_PARAM

An incorrect value or parameter was passed to the function.

To correct: ensure that all the parameters being passed have valid values.

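The library does not fail when tested on its own, and it does not fail when tested alongside the company software; it fails only when called from inside the company software. The parameters are therefore probably fine, and my guess is that something else using the GPU inside that process causes the error.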
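The documentation quoted earlier also lists an invalid math type as a cause of CUDNN_STATUS_BAD_PARAM. One further thing that may be worth checking, offered as an untested assumption and not something verified in this post, is whether the host application loads an older cuDNN library than the headers darknet was compiled against. With cuDNN available, the two values to compare would be the CUDNN_VERSION macro (build time) and the return value of cudnnGetVersion() (run time). The sketch below only shows the comparison and assumes the cuDNN 7.x encoding MAJOR*1000 + MINOR*100 + PATCHLEVEL; cudnn7_version, decode_cudnn7_version and runtime_older_than_build are hypothetical names.

```c
#include <stddef.h>

/* A cuDNN 7.x version split into its parts, e.g. 7605 -> 7.6.5.
 * Assumption: the 7.x encoding MAJOR*1000 + MINOR*100 + PATCHLEVEL. */
typedef struct {
    int major;
    int minor;
    int patch;
} cudnn7_version;

cudnn7_version decode_cudnn7_version(size_t v)
{
    cudnn7_version out;
    out.major = (int)(v / 1000);
    out.minor = (int)((v % 1000) / 100);
    out.patch = (int)(v % 100);
    return out;
}

/* Returns 1 when the library loaded at run time is older than the headers
 * used at build time, 0 otherwise. In a real build the arguments would be
 * CUDNN_VERSION and cudnnGetVersion(). */
int runtime_older_than_build(size_t build_version, size_t runtime_version)
{
    return runtime_version < build_version;
}
```

If the run-time version turned out to be lower than the build-time version, a math type that the headers define might be unknown to the loaded library.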

References:

https://devblogs.nvidia.com/tensor-ops-made-easier-in-cudnn/

Compatibility problem between Caffe and cuDNN 6.0: CUDNN_STATUS_BAD_PARAM

Cause of the Caffe error: cudnn.hpp:86] Check failed: status == CUDNN_STATUS_SUCCESS (3 vs. 0) CUDNN_STATUS_BAD_PARAM

Check failed: status == CUDNN_STATUS_SUCCESS (3 vs. 0) CUDNN_STATUS_BAD_PARAM
