Installing tesseract with yum on CentOS 7, and the Python 3 tesserocr package with pip

Install the EPEL repository:

yum -y install epel-release

Install tesseract:

yum -y install tesseract

Check which languages tesseract currently supports:

tesseract --list-langs

List of available languages (1):
eng
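
If the language list is needed inside a script, the output above can be parsed. The following is only a minimal sketch: the helper names parse_langs and installed_langs are made up for illustration, and it assumes the output keeps the "List of available languages" header followed by one language code per line, as shown above. Both output streams are captured in case a given tesseract version prints the list to stderr.

```python
import subprocess


def parse_langs(output):
    """Extract language codes from `tesseract --list-langs` output."""
    langs = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the "List of available languages (N):" header.
        if not line or line.lower().startswith("list of available languages"):
            continue
        langs.append(line)
    return langs


def installed_langs():
    """Run tesseract and return its language list, or [] if it is not installed."""
    try:
        proc = subprocess.run(
            ["tesseract", "--list-langs"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr in case the list is printed there
            universal_newlines=True,   # decode bytes to str (works on Python 3.6)
        )
    except FileNotFoundError:
        return []
    return parse_langs(proc.stdout)


print(parse_langs("List of available languages (1):\neng\n"))  # ['eng']
```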

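Only English is available at this point. More language packs can be fetched with git. Note that the default branch of the tessdata repository targets tesseract 4 and later, so for the 3.04 build installed here, the repository's 3.04.00 tag may be the better match (git checkout 3.04.00 inside the clone):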

git clone https://github.com/tesseract-ocr/tessdata.git
mv tessdata/* /usr/share/tesseract/tessdata
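
The full clone is large, and often only a few languages are needed. As an alternative to moving everything, the sketch below copies just the requested .traineddata files. The function name copy_languages is hypothetical, and it assumes each language is a single <lang>.traineddata file at the top level of the clone; languages that ship extra files are not handled. Writing to /usr/share/tesseract/tessdata requires root.

```python
import shutil
from pathlib import Path


def copy_languages(clone_dir, tessdata_dir, langs):
    """Copy <lang>.traineddata for each requested language.

    Returns the file names that were actually copied; languages with no
    matching file in the clone are silently skipped.
    """
    src, dst = Path(clone_dir), Path(tessdata_dir)
    copied = []
    for lang in langs:
        candidate = src / (lang + ".traineddata")
        if candidate.is_file():
            shutil.copy2(str(candidate), str(dst / candidate.name))
            copied.append(candidate.name)
    return copied


# Example call (run as root; the language codes are only examples):
# copy_languages("tessdata", "/usr/share/tesseract/tessdata", ["chi_sim", "jpn"])
```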

Install tesserocr with pip:

pip3 install tesserocr

The tesserocr installation fails with the following error:

Installing collected packages: tesserocr
  Running setup.py install for tesserocr ... error
    Complete output from command /usr/local/python3/bin/python3.6 -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-i48iarbe/tesserocr/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-p27b42h9/install-record.txt --single-version-externally-managed --compile:
    pkg-config failed to find tesseract/lept libraries: b"Package tesseract was not found in the pkg-config search path.\nPerhaps you should add the directory containing `tesseract.pc'\nto the PKG_CONFIG_PATH environment variable\nNo package 'tesseract' found\n"
    Supporting tesseract v3.04.00
    Building with configs: {'libraries': ['tesseract', 'lept'], 'cython_compile_time_env': {'TESSERACT_VERSION': 197632}}
    /usr/local/python3/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'long_description_content_type'
      warnings.warn(msg)
    running install
    running build
    running build_ext
    building 'tesserocr' extension
    creating build
    creating build/temp.linux-x86_64-3.6
    gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/usr/local/python3/include/python3.6m -c tesserocr.cpp -o build/temp.linux-x86_64-3.6/tesserocr.o
    tesserocr.cpp:597:34: fatal error: leptonica/allheaders.h: No such file or directory
     #include "leptonica/allheaders.h"
                                      ^
    compilation terminated.
    error: command 'gcc' failed with exit status 1
    ----------------------------------------
Command "/usr/local/python3/bin/python3.6 -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-i48iarbe/tesserocr/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-p27b42h9/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-install-i48iarbe/tesserocr/

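The log shows two missing pieces: pkg-config cannot find tesseract.pc, and gcc cannot find the header leptonica/allheaders.h. Both belong to development packages, which the plain tesseract package does not include. The fix is to install the tesseract-devel package:

yum -y install tesseract-devel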
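
Before retrying pip, it can be worth confirming that pkg-config now sees the two libraries named in the error log (tesseract and lept). This is a small sketch, not part of the original steps; pkg_config_has is a hypothetical helper that wraps the standard pkg-config --exists check and treats a missing pkg-config binary as "not found".

```python
import subprocess


def pkg_config_has(package):
    """Return True if pkg-config can locate the named package."""
    try:
        result = subprocess.run(
            ["pkg-config", "--exists", package],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        return False  # pkg-config itself is not installed
    return result.returncode == 0


# Library names taken from the "pkg-config failed to find tesseract/lept" line.
for name in ("tesseract", "lept"):
    print(name, "found" if pkg_config_has(name) else "missing")
```

If either one is still reported as missing, the pip build will most likely fail in the same way as before.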

Then install tesserocr with pip again:

pip3 install tesserocr

No errors this time. Done!
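
A quick way to confirm that the module actually loads is to ask it for the tesseract version and the languages it can see. The sketch below uses tesserocr.tesseract_version() and tesserocr.get_languages(); the wrapper function tesserocr_summary is an illustrative name, and it returns None when the import fails instead of raising.

```python
def tesserocr_summary():
    """Return (tesseract version string, language list), or None if the import fails."""
    try:
        import tesserocr
    except ImportError:
        return None
    # get_languages() returns a (tessdata path, list of language codes) pair.
    _path, langs = tesserocr.get_languages()
    return tesserocr.tesseract_version(), langs


print(tesserocr_summary())
```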
