Installing warp-ctc after installing TensorFlow with Anaconda

warp-ctc: https://github.com/baidu-research/warp-ctc

To install TensorFlow under Anaconda, just follow the official instructions.

Environment: Ubuntu 14.04, TensorFlow 1.4, GPU

1: Pick a location and clone warp-ctc

git clone https://github.com/baidu-research/warp-ctc.git
cd warp-ctc

2: Create the build directory (where the compiled files will go)

mkdir build
cd build

3: Compile

cmake ../
make

4: Enter tensorflow_binding (it sits next to build, so go up one level first)

cd ../tensorflow_binding

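5: Set WARP_CTC_PATH (the directory containing libwarpctc.so, which you will find in the build directory after compiling), TENSORFLOW_SRC_PATH (where TensorFlow is installed), and CUDA_HOME (the CUDA root directory). Adjust each path to your own setup. Here I edit ~/.bashrc, which keeps the settings separate for each user. If you are on a shared machine such as a lab server and only need this temporarily, you can run the export commands directly and skip the rest; they then apply only to the current shell and do not affect anyone else.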

vim ~/.bashrc
export TENSORFLOW_SRC_PATH="$HOME/tools/anaconda3/lib/python3.6/site-packages:$TENSORFLOW_SRC_PATH"
export CUDA_HOME="/usr/local/cuda:$CUDA_HOME"
export WARP_CTC_PATH=$HOME/warp-ctc/build:$WARP_CTC_PATH
source ~/.bashrc
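Before moving on, it can help to confirm that all three variables are actually visible to the process that will run setup.py. This is a minimal sketch; the helper name `missing_build_vars` is my own and not part of warp-ctc.

```python
import os

# The three variables named in step 5.
REQUIRED_VARS = ("WARP_CTC_PATH", "TENSORFLOW_SRC_PATH", "CUDA_HOME")


def missing_build_vars(environ=None):
    """Return the required variable names that are unset or blank."""
    if environ is None:
        environ = os.environ
    return [name for name in REQUIRED_VARS if not environ.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_build_vars()
    if missing:
        print("Not set in this shell:", ", ".join(missing))
    else:
        print("All three variables are set.")
```

Run it from the same shell in which you intend to run setup.py; a fresh terminal picks up ~/.bashrc automatically, an old one needs `source ~/.bashrc` first.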

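6: Entries in a path-style environment variable are separated by ':', but setup.py does not split the value after reading it, so the files it looks for cannot be found. There are two ways to deal with this; I recommend the second.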
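To see why the lookup fails, consider what ends up being checked when the variable holds a ':'-joined value. A small sketch with a made-up directory (`/home/me/warp-ctc/build` is only an illustration, and the path handling shown is for Linux):

```python
import os

# A value as produced by `export WARP_CTC_PATH=<dir>:$WARP_CTC_PATH`
# when WARP_CTC_PATH was previously empty: note the trailing ':'.
raw_value = "/home/me/warp-ctc/build:"

# Joining the unsplit value yields a path with a stray ':' in it,
# which does not exist on disk.
unsplit = os.path.join(raw_value, "libwarpctc.so")
print(unsplit)

# Splitting first and dropping empty entries recovers the real directory.
entries = [p for p in raw_value.split(":") if p]
fixed = os.path.join(entries[0], "libwarpctc.so")
print(fixed)
```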

1) Change how the environment variables are written

# one path per variable: no ':' and no reference to the previous value
export TENSORFLOW_SRC_PATH="$HOME/tools/anaconda3/lib/python3.6/site-packages"
export CUDA_HOME="/usr/local/cuda"
export WARP_CTC_PATH="$HOME/warp-ctc/build"

2) Modify the relevant lines in setup.py

    ① Change the following code

warp_ctc_path = "../build"
if "WARP_CTC_PATH" in os.environ:
    warp_ctc_path = os.environ["WARP_CTC_PATH"]
if not os.path.exists(os.path.join(warp_ctc_path, "libwarpctc"+lib_ext)):
    print(("Could not find libwarpctc.so in {}.\n"
           "Build warp-ctc and set WARP_CTC_PATH to the location of"
           " libwarpctc.so (default is '../build')").format(warp_ctc_path),
          file=sys.stderr)
    sys.exit(1)
    to


warp_ctc_path = "../build"
if "WARP_CTC_PATH" in os.environ:
    # WARP_CTC_PATH may hold several ':'-separated entries; drop empty ones
    # (a trailing ':' produces one) and keep the first directory that
    # actually contains libwarpctc
    candidates = [p for p in os.environ["WARP_CTC_PATH"].strip().split(':') if p]
    for candidate in candidates:
        if os.path.exists(os.path.join(candidate, "libwarpctc"+lib_ext)):
            warp_ctc_path = candidate
            break
if not os.path.exists(os.path.join(warp_ctc_path, "libwarpctc"+lib_ext)):
    print(("Could not find libwarpctc.so in {}.\n"
           "Build warp-ctc and set WARP_CTC_PATH to the location of"
           " libwarpctc.so (default is '../build')").format(warp_ctc_path),
          file=sys.stderr)
    sys.exit(1)
    In other words, the value is split on ':' and each path is searched for libwarpctc.so.
    ② Change the following code
tf_src_dir = os.environ["TENSORFLOW_SRC_PATH"]

    to

tf_src_dir = os.environ["TENSORFLOW_SRC_PATH"].strip().split(':')[0]

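    This takes only the first entry of the variable, so make sure the first entry is the one you need. The same applies to CUDA_HOME below, where the first entry must point at the CUDA version you want. You could also select a different CUDA version depending on the TF version, but I found that too much trouble and left it as it is for now.

③ Change the following code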

if (enable_gpu):
    extra_compile_args += ['-DWARPCTC_ENABLE_GPU']
    include_dirs += [os.path.join(os.environ["CUDA_HOME"], 'include')]

    # mimic tensorflow cuda include setup so that their include command work
    if not os.path.exists(os.path.join(root_path, "include")):
        os.mkdir(os.path.join(root_path, "include"))

    cuda_inc_path = os.path.join(root_path, "include/cuda")
    if not os.path.exists(cuda_inc_path) or os.readlink(cuda_inc_path) != os.environ["CUDA_HOME"]:
        if os.path.exists(cuda_inc_path):
            os.remove(cuda_inc_path)
        os.symlink(os.environ["CUDA_HOME"], cuda_inc_path)
    include_dirs += [os.path.join(root_path, 'include')]

to

if (enable_gpu):
    extra_compile_args += ['-DWARPCTC_ENABLE_GPU']
    # use only the first ':'-separated entry of CUDA_HOME everywhere below
    cuda_home = os.environ["CUDA_HOME"].strip().split(':')[0]
    include_dirs += [os.path.join(cuda_home, 'include')]
    # mimic tensorflow cuda include setup so that their include command work
    if not os.path.exists(os.path.join(root_path, "include")):
        os.mkdir(os.path.join(root_path, "include"))

    cuda_inc_path = os.path.join(root_path, "include/cuda")
    if not os.path.exists(cuda_inc_path) or os.readlink(cuda_inc_path) != cuda_home:
        if os.path.exists(cuda_inc_path):
            os.remove(cuda_inc_path)
        os.symlink(cuda_home, cuda_inc_path)
    include_dirs += [os.path.join(root_path, 'include')]
The reason is the same as above.
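The three edits above all do the same thing: split a ':'-separated value and pick one entry. If you prefer, that logic can live in one place. This is a sketch under my own naming; `split_env_paths` and `first_dir_containing` are hypothetical helpers, not part of warp-ctc's setup.py.

```python
import os


def split_env_paths(value):
    """Split a ':'-separated value, dropping empty entries."""
    return [p for p in value.strip().split(":") if p]


def first_dir_containing(value, filename):
    """Return the first listed directory that contains filename, else None."""
    for directory in split_env_paths(value):
        if os.path.exists(os.path.join(directory, filename)):
            return directory
    return None
```

In setup.py this could be used as `first_dir_containing(os.environ["WARP_CTC_PATH"], "libwarpctc" + lib_ext)`, with a None result treated as the "could not find libwarpctc.so" case.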


7: Run setup.py

python setup.py install

The following error appears while it runs:

fatal error: nsync_cv.h: No such file or directory
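That is, this header cannot be found. The error traces back to the following file (the home directory is redacted):
[redacted]/tools/anaconda3/lib/python3.6/site-packages/tensorflow/include/tensorflow/core/platform/default/mutex.h

In that file, change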

#include "nsync_cv.h"

#include "nsync_mu.h"

to

#include "external/nsync/public/nsync_cv.h"
#include "external/nsync/public/nsync_mu.h"

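Save the file and run setup.py again. (As for why this works, look under your tensorflow/include/ directory and you will see where the nsync headers actually live.)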
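If you would rather locate the headers than browse by hand, a short search does it. This is a sketch that assumes only that the headers sit somewhere under the TensorFlow include directory; `find_header` is a name I made up.

```python
import os


def find_header(include_root, header_name):
    """Return paths, relative to include_root, of files named header_name."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(include_root):
        if header_name in filenames:
            full = os.path.join(dirpath, header_name)
            matches.append(os.path.relpath(full, include_root))
    return sorted(matches)


# Example use (requires TensorFlow to be importable):
#   import tensorflow as tf
#   print(find_header(tf.sysconfig.get_include(), "nsync_cv.h"))
```

On the setup described here, the result should include external/nsync/public/nsync_cv.h, which is the path used in the modified #include lines.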

8: Test

python tests/test_ctc*.py
This fails with:

s/warpctc_tensorflow-0.1-py3.6-linux-x86_64.egg/warpctc_tensorflow/kernels.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZTIN10tensorflow8OpKernelE

or

warpctc_tensorflow/kernels.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN10tensorflow7strings9StrAppendEPNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKNS0_8AlphaNumESA_

The fix was inspired by the official documentation:

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/docs_src/extend/adding_an_op.md

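Specifically, the passage quoted below. In the official TF repository, branches other than master only provide get_include() and get_lib() in sysconfig.py. However, what get_compile_flags() and get_link_flags() return is still built from get_include() and get_lib(), with a few extra options added.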

$ python
>>> import tensorflow as tf
>>> tf.sysconfig.get_include()
'/usr/local/lib/python2.7/site-packages/tensorflow/include'
>>> tf.sysconfig.get_lib()
'/usr/local/lib/python2.7/site-packages/tensorflow'
Assuming you have g++ installed, here is the sequence of commands you can use to compile your op into a dynamic library.

TF_CFLAGS=( $(python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_compile_flags()))') )
TF_LFLAGS=( $(python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_link_flags()))') )
g++ -std=c++11 -shared zero_out.cc -o zero_out.so -fPIC ${TF_CFLAGS[@]} ${TF_LFLAGS[@]} -O2
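Put differently, on a TF version that lacks the two newer helpers you can assemble equivalent flags from get_include() and get_lib() yourself. The sketch below is my approximation of what those helpers produce, not a copy of TensorFlow's code; the function names and the `cxx11_abi` parameter are assumptions, and the directories in the usage lines are the ones from the quoted documentation.

```python
def compile_flags(include_dir, cxx11_abi=0):
    """Approximate compile flags for building an op against TensorFlow."""
    return ["-I" + include_dir, "-D_GLIBCXX_USE_CXX11_ABI=%d" % cxx11_abi]


def link_flags(lib_dir):
    """Approximate link flags: search lib_dir and link libtensorflow_framework."""
    return ["-L" + lib_dir, "-ltensorflow_framework"]


if __name__ == "__main__":
    print(" ".join(compile_flags("/usr/local/lib/python2.7/site-packages/tensorflow/include")))
    print(" ".join(link_flags("/usr/local/lib/python2.7/site-packages/tensorflow")))
```

These are the same two ingredients that the setup.py changes in this step add by hand: the ABI define on the compile side and the framework library on the link side.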

The changes are as follows:

Modify setup.py

1. Corresponding to tf.sysconfig.get_compile_flags(), add the last line below

extra_compile_args = ['-std=c++11', '-fPIC']
# current tensorflow code triggers return type errors, silence those for now
extra_compile_args += ['-Wno-return-type']
extra_compile_args += ['-D_GLIBCXX_USE_CXX11_ABI=0']
2. Corresponding to tf.sysconfig.get_link_flags(), add the second line below and pass it to the ext variable
lib_srcs = ['src/ctc_op_kernel.cc', 'src/warpctc_op.cc']
extra_link_args=['-L' + tf.sysconfig.get_lib(), '-ltensorflow_framework']

ext = setuptools.Extension('warpctc_tensorflow.kernels',
                           sources = lib_srcs,
                           language = 'c++',
                           include_dirs = include_dirs,
                           library_dirs = [warp_ctc_path],
                           runtime_library_dirs = [os.path.realpath(warp_ctc_path)],
                           libraries = ['warpctc'],
                           extra_compile_args = extra_compile_args,
                           extra_link_args=extra_link_args)
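As far as I can tell, the two changes map onto the two error messages like this: the second undefined symbol contains `__cxx11`, which marks a std::string from the newer libstdc++ ABI, so it points at the ABI define; the first has no such marker and points at the missing link against libtensorflow_framework. A rough sketch of that reading follows. The function name and the returned hints are mine, and this is a heuristic rather than a diagnosis tool.

```python
def likely_fix(mangled_symbol):
    """Guess which setup.py change addresses an undefined TensorFlow symbol."""
    if "__cxx11" in mangled_symbol:
        # The extension was compiled with the new std::string ABI.
        return "add -D_GLIBCXX_USE_CXX11_ABI=0 to extra_compile_args"
    if "tensorflow" in mangled_symbol:
        # A TensorFlow symbol that the extension was never linked against.
        return "add -ltensorflow_framework to extra_link_args"
    return "unknown"


if __name__ == "__main__":
    for symbol in (
        "_ZTIN10tensorflow8OpKernelE",
        "_ZN10tensorflow7strings9StrAppendEPNSt7__cxx1112basic_string"
        "IcSt11char_traitsIcESaIcEEERKNS0_8AlphaNumESA_",
    ):
        print(symbol[:30], "->", likely_fix(symbol))
```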


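Save the changes, then delete the file /.../warp-ctc/tensorflow_binding/build/lib.linux-x86_64-3.6/warpctc_tensorflow/kernels.cpython-36m-x86_64-linux-gnu.so. If you skip this, the stale build output is reused and the changes have no effect, so do not skip it.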


Then run it again.
