yum install -y autoconf automake libtool libjpeg libpng libtiff zlib libjpeg-devel libpng-devel libtiff-devel zlib-devel
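After the packages are installed, a quick check that the build tools are actually on PATH can save a failed build later. This is a minimal sketch: the helper name missing_tools is made up for illustration, and the list of tools checked is an assumption about what the later ./autogen.sh and ./configure steps need.

```shell
# Print each named command that is not found on PATH.
# Prints nothing when every command is available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
    return 0
}

# Assumed tool list: aclocal ships with automake, libtoolize with libtool.
missing_tools autoconf automake aclocal libtoolize make
```

Any name printed here points at a package that still needs to be installed.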
Install Leptonica, using the latest release at the time of writing (1.76.0). Download page: http://www.leptonica.org/download.html
Download it directly on the server:
cd /usr/local/src
wget http://www.leptonica.org/source/leptonica-1.76.0.tar.gz
Extract:
tar -zxvf leptonica-1.76.0.tar.gz
Install:
cd leptonica-1.76.0
./configure
make
make install
ldconfig
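The Tesseract build requires Leptonica 1.74 or higher, so it can be worth confirming which version the system now reports. The sketch below assumes pkg-config is installed and can find Leptonica's lept.pc file; version_ge is a hypothetical helper, and it relies on the version sort (sort -V) from GNU coreutils.

```shell
# Return success when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: lept.pc is on pkg-config's search path; if not, the query is empty.
installed="$(pkg-config --modversion lept 2>/dev/null || true)"
if [ -n "$installed" ] && version_ge "$installed" 1.74; then
    echo "Leptonica $installed is new enough"
else
    echo "Leptonica 1.74+ is not visible to pkg-config"
fi
```

If the second message appears even though the install succeeded, pkg-config is probably not searching the directory where lept.pc was installed.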
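Install Tesseract, using the latest release at the time of writing (4.0.0-beta.3). Releases page: https://github.com/tesseract-ocr/tesseract/releases
Download it directly on the server: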
wget -O tesseract-4.0.0-beta.3.tar.gz https://github.com/tesseract-ocr/tesseract/archive/4.0.0-beta.3.tar.gz
Extract:
tar -zxvf tesseract-4.0.0-beta.3.tar.gz
Install:
cd tesseract-4.0.0-beta.3
./configure
This fails with:
Missing autoconf-archive. Check the build requirements
The autoconf-archive package is missing. Install it:
yum install autoconf-archive
Then run: ./autogen.sh
With that error resolved, run: ./configure
This fails with:
error: Leptonica 1.74 or higher is required. Try to install libleptonica-dev package
Solution:
Reference: https://blog.csdn.net/xjmxym/article/details/79040514
After following the steps in that post, run:
./configure --with-extra-includes=/usr/local/include --with-extra-libraries=/usr/local/lib
make && make install
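The linked post is not reproduced here, so the following is a sketch of one common cause rather than a summary of that post. When Leptonica is built from source with the default /usr/local prefix, its lept.pc file typically lands in /usr/local/lib/pkgconfig, which pkg-config may not search by default. The helper name add_pkg_config_dir is made up for illustration.

```shell
# Prepend a directory to PKG_CONFIG_PATH unless it is already listed.
add_pkg_config_dir() {
    case ":${PKG_CONFIG_PATH:-}:" in
        *":$1:"*) ;;  # already present, nothing to do
        *) PKG_CONFIG_PATH="$1${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}" ;;
    esac
    export PKG_CONFIG_PATH
}

# Assumption: Leptonica was installed with the default /usr/local prefix.
add_pkg_config_dir /usr/local/lib/pkgconfig
echo "PKG_CONFIG_PATH=$PKG_CONFIG_PATH"
```

Run this in the same shell session before ./configure, since an exported variable does not carry over to a new login.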
If you hit the following error:
./configure: line 4250: syntax error near unexpected token `-mavx,'
./configure: line 4250: `AX_CHECK_COMPILE_FLAG(-mavx, avx=true, avx=false)'
Fix:
Reference:
https://github.com/tesseract-ocr/tesseract/issues/777#issuecomment-288116640
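This error means the generated configure script still contains a raw AX_CHECK_COMPILE_FLAG call, that is, the macro was not expanded when the script was generated. The macro comes from autoconf-archive. The check below is a sketch for detecting that state and is not taken from the linked comment; has_unexpanded_ax_macros is a hypothetical helper name.

```shell
# Succeed when a generated configure script still contains raw AX_* macro calls,
# meaning the macro definitions were not found when the script was generated.
has_unexpanded_ax_macros() {
    grep -Eq '^[[:space:]]*AX_[A-Z0-9_]+\(' "$1"
}

# Only inspect ./configure when it exists in the current directory.
if [ -f ./configure ] && has_unexpanded_ax_macros ./configure; then
    echo "configure has unexpanded AX_ macros: install autoconf-archive, then rerun ./autogen.sh"
fi
```

If the message appears, regenerating configure after installing autoconf-archive is the likely remedy, since a stale configure keeps the unexpanded line.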
tesseract --list-langs
This fails with:
Error opening data file /usr/local/share/tessdata/eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
Failed loading language 'eng'
Tesseract couldn't load any languages!
Solution:
Download the complete tessdata_fast repository from GitHub, upload it to /usr/local/share/, rename tessdata_fast to tessdata, then run:
/usr/local/bin/tesseract /usr/local/apache/htdocs/uploads/images/test.jpg /usr/local/apache/htdocs/uploads/images/test -l chi_sim
or:
tesseract /usr/local/apache/htdocs/uploads/images/test.jpg /usr/local/apache/htdocs/uploads/images/test -l chi_sim
Either command generates test.txt.
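If a language still fails to load, it may help to confirm that its .traineddata file really is in the tessdata directory. The sketch below is illustrative only: missing_langs is a made-up helper, and the directory is the one named in the error message above.

```shell
# Print each language whose <lang>.traineddata file is absent from a tessdata directory.
# Prints nothing when every requested language file is present.
missing_langs() {
    tessdata_dir="$1"
    shift
    for lang in "$@"; do
        [ -f "$tessdata_dir/$lang.traineddata" ] || printf '%s\n' "$lang"
    done
    return 0
}

# eng is what tesseract loads by default; chi_sim is the language used in the commands above.
missing_langs /usr/local/share/tessdata eng chi_sim
```

An empty result means both files are in place and tesseract --list-langs should list them.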
The following output can be ignored:
Tesseract Open Source OCR Engine v4.0.0-beta.1 with Leptonica
Warning. Invalid resolution 0 dpi. Using 70 instead.
Estimating resolution as 251
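tesseract takes an output base name and appends .txt to it, which is why the command above passes .../test rather than .../test.txt. For more than one image, the base can be derived from each image path. This is a rough sketch: ocr_output_base is a hypothetical helper, the loop assumes .jpg files in the upload directory used above, and it assumes every file name has an extension.

```shell
# Derive tesseract's output base from an image path by dropping the final extension.
# tesseract appends ".txt" to this base itself.
ocr_output_base() {
    printf '%s\n' "${1%.*}"
}

# Hypothetical batch loop over the upload directory used above.
for image in /usr/local/apache/htdocs/uploads/images/*.jpg; do
    [ -f "$image" ] || continue   # skip when the glob matches nothing
    tesseract "$image" "$(ocr_output_base "$image")" -l chi_sim
done
```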
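Looking back over the whole process, the hardest part was not the installation and configuration themselves, but working out how to solve problems when they came up, where to look for answers, and what to do when no answer could be found.
Step 4 really did cost me a full day. Baidu, Google, and the GitHub issues gave me no solution, and just as I was about to give up, I decided to try one last thing.
I replaced the tessdata directory under /usr/local/share with the entire tessdata_fast folder downloaded from GitHub, renamed it to tessdata, and to my surprise it worked. I was nearly moved to tears.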