Scenario: paddlepaddle FatalError: Segmentation fault is detected by the operating system

With paddlepaddle, infer.py runs normally on CPU but crashes with the error above when run on GPU.


# Problem description:

Environment

paddlepaddle-gpu 2.1.0.post101

python 3.8.5

cuda 10.1

cudnn 8.0.5
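The environment facts listed above can also be collected from Python. The following is a minimal sketch, not part of the original post: `collect_env` is a hypothetical helper name, and it assumes a Paddle 2.x install where `paddle.version.cuda()` and `paddle.version.cudnn()` report the versions the wheel was built against (which may differ from the cuDNN actually installed on the machine).

```python
import platform


def collect_env() -> dict:
    """Gather Python and Paddle version facts (hypothetical helper)."""
    env = {"python": platform.python_version()}
    try:
        import paddle  # only available where PaddlePaddle is installed
    except ImportError:
        env["paddle"] = None
        return env
    env["paddle"] = paddle.__version__
    # Versions the wheel was built against, not necessarily what is installed.
    env["built_with_cuda"] = paddle.version.cuda()
    env["built_with_cudnn"] = paddle.version.cudnn()
    return env


env = collect_env()
for key, value in env.items():
    print(f"{key}: {value}")
```

Comparing the "built with" cuDNN value against the version installed on the system is one quick way to spot the kind of mismatch described in this post.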

C++ Traceback (most recent call last):
--------------------------------------
0   paddle::framework::SignalHandle(char const*, int)
1   paddle::platform::GetCurrentTraceBackString[abi:cxx11]()

----------------------
Error Message Summary:
----------------------
FatalError: `Segmentation fault` is detected by the operating system.
  [TimeInfo: *** Aborted at 1623290314 (unix time) try "date -d @1623290314" if you are using GNU date ***]
  [SignalInfo: *** SIGSEGV (@0x0) received by PID 23335 (TID 0x7f2ee0a14700) from PID 0 ***]

Segmentation fault (core dumped)
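The summary itself carries little beyond a timestamp, a signal name, and a PID. As a small illustration, the sketch below pulls those fields out of the two bracketed lines; `parse_crash_summary` is a hypothetical helper, and the regular expressions assume exactly the format shown above.

```python
import re
from datetime import datetime, timezone

# Sample lines copied from the crash summary above.
TIME_INFO = '[TimeInfo: *** Aborted at 1623290314 (unix time) try "date -d @1623290314" if you are using GNU date ***]'
SIGNAL_INFO = "[SignalInfo: *** SIGSEGV (@0x0) received by PID 23335 (TID 0x7f2ee0a14700) from PID 0 ***]"


def parse_crash_summary(time_line: str, signal_line: str) -> dict:
    """Extract abort time, signal name, reported address and PID (hypothetical helper)."""
    ts = re.search(r"Aborted at (\d+) \(unix time\)", time_line)
    sig = re.search(r"\*\*\* (\w+) \(@(0x[0-9a-fA-F]+)\) received by PID (\d+)", signal_line)
    if ts is None or sig is None:
        raise ValueError("unrecognized crash summary format")
    return {
        "aborted_at_utc": datetime.fromtimestamp(int(ts.group(1)), tz=timezone.utc),
        "signal": sig.group(1),
        "fault_address": int(sig.group(2), 16),
        "pid": int(sig.group(3)),
    }


info = parse_crash_summary(TIME_INFO, SIGNAL_INFO)
print(info["aborted_at_utc"].isoformat())  # 2021-06-10T01:58:34+00:00
print(info["signal"], hex(info["fault_address"]), info["pid"])
```

The decoded time is the Python equivalent of the `date -d @1623290314` hint in the message; the summary does not say which native library crashed, which is why more logging is needed.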

Running infer on its own works fine; the error only appears once it is called from within the project.


Cause analysis:

1. First, enable logging in infer.py

Open PaddleDetection/deploy/python/infer.py

Comment out config.disable_glog_info()

2. Run it again

W0610 09:58:33.832181 23452 device_context.cc:404] Please NOTE: device: 0, GPU Compute Capability: 7.5, Driver API Version: 10.1, Runtime API Version: 10.1
W0610 09:58:33.833010 23452 device_context.cc:422] device: 0, cuDNN Version: 8.0.


--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
0   paddle::framework::SignalHandle(char const*, int)
1   paddle::platform::GetCurrentTraceBackString[abi:cxx11]()

----------------------
Error Message Summary:
----------------------
FatalError: `Segmentation fault` is detected by the operating system.
  [TimeInfo: *** Aborted at 1623290314 (unix time) try "date -d @1623290314" if you are using GNU date ***]
  [SignalInfo: *** SIGSEGV (@0x0) received by PID 23335 (TID 0x7f2ee0a14700) from PID 0 ***]

Segmentation fault (core dumped)
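Instead of commenting the call in and out by hand, the logging switch from step 1 could be made conditional. This is only a sketch: `apply_log_setting` and `FakeConfig` are hypothetical names introduced for illustration, and in infer.py the `config` object would be the `paddle.inference.Config` instance on which `disable_glog_info()` is called.

```python
def apply_log_setting(config, verbose: bool):
    """Silence glog only when verbose output is not wanted (hypothetical helper).

    `config` is expected to be a paddle.inference.Config, which provides
    disable_glog_info(); any object with that method works here.
    """
    if not verbose:
        config.disable_glog_info()
    return config


class FakeConfig:
    """Stand-in for paddle.inference.Config so the sketch runs without Paddle."""

    def __init__(self):
        self.glog_disabled = False

    def disable_glog_info(self):
        self.glog_disabled = True


quiet = apply_log_setting(FakeConfig(), verbose=False)
debug = apply_log_setting(FakeConfig(), verbose=True)
print(quiet.glog_disabled, debug.glog_disabled)  # True False
```

With `verbose=True` the device warnings (compute capability, driver and runtime API versions, cuDNN version) stay visible, which is what exposes the version problem here.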

Conclusion: the log reports cuDNN 8.0, which is incompatible with this setup; install cuDNN 7.6.5 instead.
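The check behind this diagnosis can be sketched as a small log parser. `detected_versions` is a hypothetical helper, the regular expressions assume the exact wording of the two warning lines in the log, and the expected major version 7 is an assumption taken from this post's fix rather than from any official compatibility table.

```python
import re

# Warning lines copied from the log above.
LOG = """\
W0610 09:58:33.832181 23452 device_context.cc:404] Please NOTE: device: 0, GPU Compute Capability: 7.5, Driver API Version: 10.1, Runtime API Version: 10.1
W0610 09:58:33.833010 23452 device_context.cc:422] device: 0, cuDNN Version: 8.0.
"""


def detected_versions(log_text: str) -> dict:
    """Return the CUDA runtime and cuDNN versions found in glog output."""
    found = {}
    runtime = re.search(r"Runtime API Version: (\d+)\.(\d+)", log_text)
    cudnn = re.search(r"cuDNN Version: (\d+)\.(\d+)", log_text)
    if runtime:
        found["cuda_runtime"] = (int(runtime.group(1)), int(runtime.group(2)))
    if cudnn:
        found["cudnn"] = (int(cudnn.group(1)), int(cudnn.group(2)))
    return found


EXPECTED_CUDNN_MAJOR = 7  # assumption taken from this post's fix (cuDNN 7.6.5)

versions = detected_versions(LOG)
mismatch = versions["cudnn"][0] != EXPECTED_CUDNN_MAJOR
print(versions, "cuDNN mismatch:", mismatch)
```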


Solution:

1. Download cuDNN from the official NVIDIA site

https://developer.nvidia.com/rdp/cudnn-archive

Download these three packages (the runtime library, the developer library, and the doc/code samples .deb files), choosing the builds that match your CUDA version and server OS.

2. Install

# Install the packages in order
sudo dpkg -i libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb
sudo dpkg -i libcudnn7-dev_7.6.5.32-1+cuda10.1_amd64.deb
sudo dpkg -i libcudnn7-doc_7.6.5.32-1+cuda10.1_amd64.deb

# From the official guide: to verify that cuDNN is installed and is running properly, compile the mnistCUDNN sample located in the /usr/src/cudnn_samples_v7 directory installed by the Debian package.
# 1. Copy the cuDNN sample to a writable path.
cp -r /usr/src/cudnn_samples_v7/ $HOME

# 2. Go to the writable path.
cd ~/cudnn_samples_v7/mnistCUDNN

# 3. Compile the mnistCUDNN sample.
sudo make clean
sudo make

# 4. Run the mnistCUDNN sample.
sudo ./mnistCUDNN

# If cuDNN is properly installed and running on your Linux system, you will see a message similar to the following:
# Test passed!

# Check the installed cuDNN version
cat /usr/include/cudnn.h | grep CUDNN_MAJOR -A 2
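The same version lookup can be done from Python, for instance inside a startup check. This is a minimal sketch under the assumption that the header defines `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL` the way cuDNN 7's cudnn.h does; `cudnn_version_from_header` is a hypothetical helper, and the sample text stands in for the real file so the block runs anywhere.

```python
import re

# Stand-in for the relevant lines of /usr/include/cudnn.h after installing 7.6.5.
SAMPLE_HEADER = """\
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
#define CUDNN_PATCHLEVEL 5
"""


def cudnn_version_from_header(text: str) -> tuple:
    """Return (major, minor, patch) parsed from cudnn.h contents."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(rf"^#define\s+{name}\s+(\d+)", text, re.MULTILINE)
        if match is None:
            raise ValueError(f"{name} not found in header text")
        parts.append(int(match.group(1)))
    return tuple(parts)


version = cudnn_version_from_header(SAMPLE_HEADER)
print(version)  # (7, 6, 5)
```

On a real machine the text would come from the header itself, for example `pathlib.Path("/usr/include/cudnn.h").read_text()`.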

Run infer.py again: it now works normally.

