PaddlePaddle PreconditionNotMetError: The third-party dynamic library(libnccl.so) that Paddle depend

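Today I am recording a major pitfall I hit while running a Paddle project.
My project is a Paddle OCR project. It had been running fine, but when I ran it again today it failed with the following error: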

paddle::framework::ParallelExecutor::ParallelExecutor(std::vector<paddle::platform::Place, std::allocator<paddle::platform::Place>> const&, std::vector<std::string, std::allocator<std::string>> const&, std::string const&, paddle::framework::Scope*, std::vector<paddle::framework::Scope*, std::allocator<paddle::framework::Scope*>> const&, paddle::framework::details::ExecutionStrategy const&, paddle::framework::details::BuildStrategy const&, paddle::framework::ir::Graph*)
----------------------
Error Message Summary:
----------------------
PreconditionNotMetError: The third-party dynamic library (libnccl.so) that Paddle depends on is not configured correctly. (error code is libnccl.so: cannot open shared object file: No such file or directory)
  Suggestions:
  1. Check if the third-party dynamic library (e.g. CUDA, CUDNN) is installed correctly and its version is matched with paddlepaddle you installed.
  2. Configure third-party dynamic library environment variables as follows:
  - Linux: set LD_LIBRARY_PATH by `export LD_LIBRARY_PATH=...`
  - Windows: set PATH by `set PATH=xxx; at (/paddle/paddle/fluid/platform/dynload/dynamic_loader.cc:194)
Your Paddle Fluid is installed successfully ONLY for SINGLE GPU or CPU! Let's start deep Learning with Paddle Fluid now
Process finished with exit code 0
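
The error boils down to the dynamic loader being unable to open libnccl.so. One way to check this outside of Paddle is to attempt the same load from Python with ctypes. This is only a diagnostic sketch: the helper name can_load is made up for illustration and is not part of Paddle.

```python
import ctypes


def can_load(lib_name):
    """Try to open a shared library the way the dynamic loader would.

    Returns (True, None) on success, otherwise (False, loader error text).
    """
    try:
        ctypes.CDLL(lib_name)
        return True, None
    except OSError as exc:
        return False, str(exc)


if __name__ == "__main__":
    ok, err = can_load("libnccl.so")
    print("libnccl.so loadable:", ok)
    if not ok:
        # Typically the same "cannot open shared object file" text Paddle reports.
        print("loader error:", err)
```

If this reports that the library cannot be loaded, the problem lies with the NCCL installation or the library search path rather than with the Paddle project itself.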

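At first I read many blog posts that blamed a misconfigured dynamic library path, but the error persisted after I followed their instructions. Next I suspected a mismatch between the CUDA and cuDNN versions. (Note: my CUDA is 10.0 and my cuDNN is 7.6.5. The two must be compatible, otherwise running a Paddle project fails with a different error.) After reinstalling, TensorFlow 1.13.1 could use the GPU normally, yet Paddle still produced the error shown above. I finally found the solution on GitHub: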
The cause was a broken NCCL library installation!
The NCCL installation procedure is as follows.
Download:
From the NVIDIA website: https://developer.nvidia.com/nccl/nccl-legacy-downloads
My CUDA version is 10.0, so I chose the matching build. Download link: https://download.csdn.net/download/guoqingru0311/86396273
After clicking through, choose the option to download the package for local installation.
The download is a compressed archive: nccl_2.6.4-1+cuda10.0_ppc64le.txz
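
The archive name encodes the NCCL version, the CUDA version, and the CPU architecture. Note that ppc64le is the IBM POWER architecture, while an ordinary PC or server reports x86_64, so it is worth confirming that the archive matches the machine before installing. The sketch below splits the name into its parts and compares the architecture with the local one. The regular expression is inferred from this single file name and is an assumption, not an official naming specification, and parse_nccl_archive is a hypothetical helper.

```python
import platform
import re

# Pattern inferred from the file name in this post (an assumption, not a spec).
ARCHIVE_RE = re.compile(
    r"^nccl_(?P<nccl>[\d.]+-\d+)\+cuda(?P<cuda>[\d.]+)_(?P<arch>\w+)\.txz$"
)


def parse_nccl_archive(filename):
    """Split an NCCL archive name into NCCL version, CUDA version, and arch."""
    match = ARCHIVE_RE.match(filename)
    if match is None:
        raise ValueError("unrecognised NCCL archive name: " + filename)
    return match.groupdict()


if __name__ == "__main__":
    info = parse_nccl_archive("nccl_2.6.4-1+cuda10.0_ppc64le.txz")
    print(info)
    # platform.machine() gives the local architecture, e.g. x86_64.
    print("matches this machine:", info["arch"] == platform.machine())
```

If the architecture does not match, the copied libnccl.so cannot be loaded on that machine even when every file is in the right place.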

tar -xvf nccl_2.6.4-1+cuda10.0_ppc64le.txz

Extraction produces a directory that contains, among other things, an include folder and a lib folder.
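Copy everything in the include folder into the include folder of the CUDA installation, and likewise copy everything in the lib folder into the CUDA lib64 folder (run these commands from inside the extracted directory):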

sudo cp include/* /usr/local/cuda-10.0/include
sudo cp lib/* /usr/local/cuda-10.0/lib64

Having confirmed that the installed version is cuda-10.0, change into /usr/local/cuda-10.0/lib64 and list its contents:

cd /usr/local/cuda-10.0/lib64
ls

The existing libnccl.so and libnccl.so.2 files then need to be deleted and recreated as symbolic links.

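# cd /usr/local/cuda-10.0/lib64
# Delete the existing files
sudo rm libnccl.so libnccl.so.2
# Create the symbolic links
sudo ln -s libnccl.so.2.6.4 libnccl.so.2
sudo ln -s libnccl.so.2 libnccl.so
# Check that the symbolic links were created (-l shows each link target)
ls -l libnccl.so*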
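
As an optional cross-check of the links created above, the chain libnccl.so -> libnccl.so.2 -> libnccl.so.2.6.4 can also be followed programmatically. This is a minimal sketch using only the Python standard library. The function name resolve_chain and the hop limit are illustrative choices, and the path assumes the cuda-10.0 location used in this post.

```python
import os


def resolve_chain(path, max_hops=10):
    """Follow symbolic links starting at `path` and return every hop in order."""
    chain = [path]
    # The hop limit guards against links that point back at each other.
    while os.path.islink(path) and len(chain) <= max_hops:
        target = os.readlink(path)
        # A relative target is resolved against the directory holding the link;
        # an absolute target replaces it (os.path.join handles both cases).
        path = os.path.join(os.path.dirname(path), target)
        chain.append(path)
    return chain


if __name__ == "__main__":
    # Location used in this post; adjust for a different CUDA install path.
    link = "/usr/local/cuda-10.0/lib64/libnccl.so"
    chain = resolve_chain(link)
    print(" -> ".join(chain))
    print("final target exists:", os.path.isfile(chain[-1]))
```

A healthy installation should print three hops ending at libnccl.so.2.6.4, with the final target reported as existing.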

That is basically all there is to it!
