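1. Follow the link to download Tesseract, install it, and configure the environment variables: add D:\Program\Tesseract-OCR to Path under the user variables, and add D:\Program\Tesseract-OCR to Path under the system variables;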
2. At the command line, enter tesseract test.jpg result to recognize test.jpg and save the result in result.txt;
3. Call the OCR API from Python: first run pip install pytesseract, then run the following code to perform OCR:
import pytesseract
from PIL import Image
text = pytesseract.image_to_string(Image.open('test.jpg'))
print(text)
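
Steps 1 and 2 above can also be checked from Python. The following is a minimal sketch using only the standard library: it looks for the tesseract executable on the Path and, if it is found, runs the same command as step 2. The helper names find_tesseract, build_command and run_tesseract are hypothetical and introduced only for illustration.

```python
import shutil
import subprocess


def find_tesseract():
    """Return the full path of the tesseract executable, or None if it is not on PATH."""
    return shutil.which("tesseract")


def build_command(image_path, output_base):
    """Build the same command as step 2: tesseract <image> <output base>."""
    return ["tesseract", image_path, output_base]


def run_tesseract(image_path, output_base):
    """Run tesseract on image_path and return the name of the text file it writes."""
    exe = find_tesseract()
    if exe is None:
        raise RuntimeError(
            "tesseract was not found on PATH; check the environment variable from step 1"
        )
    cmd = build_command(image_path, output_base)
    cmd[0] = exe  # use the resolved executable path
    subprocess.run(cmd, check=True)  # raises CalledProcessError if tesseract fails
    return output_base + ".txt"


# Example call (requires Tesseract to be installed and test.jpg to exist):
# run_tesseract("test.jpg", "result")
```

If find_tesseract() returns None right after installation, opening a new command-line window usually helps, since a window that was already open does not pick up the changed Path.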
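Python can also call an OCR API through paddleocr: first run pip install paddleocr and pip install paddlepaddle, then run the following code to perform OCR. The recognition result contains the position of the text, the text content, and the confidence.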
from paddleocr import PaddleOCR
ocr = PaddleOCR(use_angle_cls=True,use_gpu=False)
result = ocr.ocr('C:\\Users\\86177\\Desktop\\ocr\\1.jpg')
print(result)
[[[[537.0, 208.0], [717.0, 208.0], [717.0, 249.0], [537.0, 249.0]], ('URT7141', 0.92392796)],
[[[570.0, 259.0], [687.0, 261.0], [686.0, 314.0], [569.0, 312.0]], ('GB/T', 0.91816545)],
[[[560.0, 313.0], [699.0, 319.0], [697.0, 375.0], [558.0, 369.0]], ('23901', 0.9946063)]]
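
Each entry in the printed result is a pair: the four corner points of the text box, then a (text, confidence) tuple. The sketch below shows one way to unpack that structure, using the sample data copied from the output above so that it runs without paddleocr installed. The function name parse_lines and the min_confidence parameter are hypothetical. Some PaddleOCR versions may wrap the result in one more list (one list per image); in that case result[0] would be passed instead, which is an assumption to check against the installed version.

```python
# Sample data copied from the printed result above.
result = [
    [[[537.0, 208.0], [717.0, 208.0], [717.0, 249.0], [537.0, 249.0]], ('URT7141', 0.92392796)],
    [[[570.0, 259.0], [687.0, 261.0], [686.0, 314.0], [569.0, 312.0]], ('GB/T', 0.91816545)],
    [[[560.0, 313.0], [699.0, 319.0], [697.0, 375.0], [558.0, 369.0]], ('23901', 0.9946063)],
]


def parse_lines(lines, min_confidence=0.0):
    """Turn [box, (text, confidence)] entries into dictionaries.

    Entries whose confidence is below min_confidence are skipped.
    """
    parsed = []
    for box, (text, confidence) in lines:
        if confidence < min_confidence:
            continue
        xs = [point[0] for point in box]
        ys = [point[1] for point in box]
        parsed.append({
            "text": text,
            "confidence": confidence,
            # Axis-aligned rectangle (left, top, right, bottom) enclosing the four corners.
            "rect": (min(xs), min(ys), max(xs), max(ys)),
        })
    return parsed


for item in parse_lines(result, min_confidence=0.92):
    print(item["text"], item["confidence"], item["rect"])
```

The enclosing rectangle is convenient for cropping or drawing, while the original four points keep the slight tilt of the text box if that is needed.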