Pitfalls when compiling tf_ops with gcc

Error 1:

tf_nndistance_g.cu:3:10: fatal error: third_party/eigen3/unsupported/Eigen/CXX11/Tensor: No such file or directory
 #include "third_party/eigen3/unsupported/Eigen/CXX11/Tensor"
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
g++: error: tf_nndistance_g.cu.o: No such file or directory

Solution:

In the tf_nndistance compile script, add -I $TF/include to the first compile command (the nvcc line):

#!/bin/bash

CUDA=/home/lyl/cuda/cuda-10.2
TF=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')

$CUDA/bin/nvcc tf_nndistance_g.cu -o tf_nndistance_g.cu.o -c -O2 -DGOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -I $TF/include
g++ -std=c++11 tf_nndistance.cpp tf_nndistance_g.cu.o -o tf_nndistance_so.so -shared -fPIC -I $TF/include -lcudart -L $CUDA/lib64 -O2 -I $TF/include/external/nsync/public -L $TF -ltensorflow_framework -D_GLIBCXX_USE_CXX11_ABI=0
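The flag works because the compiler resolves a quoted include such as third_party/eigen3/unsupported/Eigen/CXX11/Tensor relative to each -I directory in turn. The snippet below is a minimal Python sketch of that lookup, useful for checking a candidate directory before recompiling. It is a simplified model, not the compiler's actual search algorithm, and the function name find_header is hypothetical.

```python
import os


def find_header(include_dirs, header):
    """Return the first include dir that contains `header`, or None.

    Simplified model of the -I search: the real compiler also looks in
    the including file's own directory and in its system include paths.
    """
    for directory in include_dirs:
        if os.path.isfile(os.path.join(directory, header)):
            return directory
    return None
```

Calling find_header with [TF + "/include"], where TF is the path printed by tf.sysconfig.get_lib(), and the header path from the error message should return that directory if the flag is going to help; a None result suggests the headers live somewhere else in this TensorFlow installation.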


Error 2:

Traceback (most recent call last):
  File "main.py", line 16, in <module>
    import utils.model_loss as model_loss
  File "/home/lyl/pu/Flex-PU/utils/model_loss.py", line 6, in <module>
    from tf_ops.sampling.tf_sampling import gather_point, farthest_point_sample
  File "/home/lyl/pu/Flex-PU/tf_ops/sampling/tf_sampling.py", line 12, in <module>
    sampling_module=tf.load_op_library(os.path.join(BASE_DIR, 'tf_sampling_so.so'))
  File "/home/lyl/.conda/envs/pugcn/lib/python3.6/site-packages/tensorflow/python/framework/load_library.py", line 61, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: /home/lyl/pu/Flex-PU/tf_ops/sampling/tf_sampling_so.so: undefined symbol: _ZN10tensorflow12OpDefBuilder4AttrENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE


Solution:

Append -D_GLIBCXX_USE_CXX11_ABI=0 to the end of the second compile command (the g++ line).

Reference: undefined symbol: _ZN10tensorflow12OpDefBuilder4AttrENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE · Issue #87 · sampepose/flownet2-tf · GitHub
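The mangled name in the error carries a hint about the cause. GCC places the new-ABI std::string in the inline namespace std::__cxx11, which appears in mangled symbols as St7__cxx11. A symbol containing that marker suggests the op was compiled with the new ABI while the installed TensorFlow library exports the old-ABI variant, which is what -D_GLIBCXX_USE_CXX11_ABI=0 addresses. The following Python sketch pulls the symbol out of an error message and checks for the marker; the helper names are hypothetical.

```python
import re

# GCC's new-ABI std::string lives in the inline namespace std::__cxx11,
# which shows up in mangled names as "St7__cxx11".
NEW_ABI_MARKER = "St7__cxx11"


def extract_undefined_symbol(message):
    """Pull the mangled symbol out of an 'undefined symbol: ...' message."""
    match = re.search(r"undefined symbol:\s*(\S+)", message)
    return match.group(1) if match else None


def references_new_abi(symbol):
    """True if the mangled name refers to a type in std::__cxx11."""
    return NEW_ABI_MARKER in symbol
```

If references_new_abi returns True for the symbol in a NotFoundError like the one above, rebuilding the op with the ABI flag set to 0 is a reasonable first thing to try.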

Error 3: No module named 'tensorflow'

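After editing the environment variables in ~/.bashrc from inside the virtual environment, the shell drops back to the base environment (the one without tensorflow installed). Running source activate on the new environment again at that point still leaves you in the base environment.

Solution: close the current server session, open a new connection to the server, and then run source activate on the new environment.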
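A quick way to confirm which environment is actually active is to ask the interpreter itself. The following is a minimal Python sketch; the function name describe_environment is hypothetical, and it only reports what the running interpreter can see.

```python
import importlib.util
import sys


def describe_environment(module_name="tensorflow"):
    """Report which interpreter is running and whether a module is visible."""
    spec = importlib.util.find_spec(module_name)
    return {
        "python": sys.executable,   # path of the interpreter in use
        "prefix": sys.prefix,       # root of the active environment
        "module_found": spec is not None,
    }


print(describe_environment())
```

If the printed path points at the base installation rather than the new environment, the activation did not take effect in this session.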

Error 4: libcudart.so.10.2: cannot open shared object file: No such file or directory


Solution:

The CUDA and TensorFlow versions did not match; switching to CUDA 10.2 resolved it.

Reference: libcudart.so.10.2: cannot open shared object file: No such file or directory (执道者's blog, CSDN)
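Before launching the training script, it can help to check whether the dynamic loader is able to open the CUDA runtime named in the error. The sketch below uses ctypes from the standard library; the function name can_load is hypothetical.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open `library_name`."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


# The library name is taken from the error message above.
print("libcudart.so.10.2 loadable:", can_load("libcudart.so.10.2"))
```

A False result means the loader cannot find that exact version, either because CUDA 10.2 is not installed or, on Linux, because its lib64 directory is not on the loader's search path (for example via LD_LIBRARY_PATH).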

Error 5: sklearn is already installed, but No module named 'sklearn' is still raised

Solution:

First install sklearn together with its dependencies:

pip install numpy scipy matplotlib scikit-learn

Reference: "sklearn is clearly installed, so why does ModuleNotFoundError: No module named 'sklearn' still appear" (雪喻's blog, CSDN)
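One possible cause, which is an assumption here and not something the original error confirms, is that pip installed the package into a different interpreter than the one running the code. The sketch below builds the install command through the current interpreter so the two cannot diverge; the function name pip_install_command is hypothetical.

```python
import importlib.util
import sys


def pip_install_command(packages):
    """Build a pip command bound to the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", *packages]


if importlib.util.find_spec("sklearn") is None:
    # The PyPI distribution is named scikit-learn; the import name is sklearn.
    packages = ["numpy", "scipy", "matplotlib", "scikit-learn"]
    print(" ".join(pip_install_command(packages)))
```

The script only prints the command when sklearn is not importable; running the printed command installs into the same environment that reported the error.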
