The project page for this handy tool is: https://github.com/breezedeus/cnstd
It is currently built on PyTorch.
First, install the required dependencies:
pip install cnstd
pip install cnocr
We use the following image, named lining.png, for testing:
Run the sample test program:
from cnstd import CnStd
from cnocr import CnOcr

std = CnStd()      # text detector
cn_ocr = CnOcr()   # text recognizer

# Detect text boxes; the image is resized to resized_shape before detection.
box_infos = std.detect('../examples/lining.png', resized_shape=(768, 1024))
print(len(box_infos['detected_texts']))

# Recognize each detected text line from its cropped image.
for box_info in box_infos['detected_texts']:
    cropped_img = box_info['cropped_img']
    ocr_res = cn_ocr.ocr_for_single_line(cropped_img)
    print('ocr result: %s' % str(ocr_res))
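The resized_shape passed to std.detect must consist of multiples of 32. As a minimal sketch, the helper below snaps an image's size to the nearest valid values. The names snap_to_multiple and suggest_resized_shape are hypothetical and not part of cnstd, and the (height, width) ordering is an assumption about how cnstd interprets the tuple.

```python
def snap_to_multiple(value: int, base: int = 32) -> int:
    """Round value to the nearest multiple of base (ties round up), never below base."""
    return max(base, (value + base // 2) // base * base)


def suggest_resized_shape(height: int, width: int, base: int = 32) -> tuple:
    """Return a (height, width) pair whose entries are both multiples of base.

    Hypothetical helper for illustration; the (height, width) order is assumed.
    """
    return snap_to_multiple(height, base), snap_to_multiple(width, base)


# A size that is already valid is left unchanged.
print(suggest_resized_shape(768, 1024))
# Arbitrary sizes are snapped to the closest multiples of 32.
print(suggest_resized_shape(750, 1000))
```

The result could then be passed as resized_shape to std.detect instead of hard-coding the value.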
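The model has to resize the image before it can run prediction. The resized_shape argument of std.detect can be chosen freely, but its values must be multiples of 32; here it is set to (768, 1024), which is close to the image's own resolution.
Running the program produces the following error: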
Traceback (most recent call last):
  File "D:/dev/cnstd-master/tests/test_example.py", line 10, in <module>
    from cnstd import CnStd
  File "D:\dev\cnstd-master\cnstd\__init__.py", line 20, in <module>
    from .cn_std import CnStd
  File "D:\dev\cnstd-master\cnstd\cn_std.py", line 32, in <module>
    from .model import gen_model
  File "D:\dev\cnstd-master\cnstd\model\__init__.py", line 22, in <module>
    from .dbnet import gen_dbnet, DBNet
  File "D:\dev\cnstd-master\cnstd\model\dbnet.py", line 31, in <module>
    from .base import DBPostProcessor, _DBNet
  File "D:\dev\cnstd-master\cnstd\model\base.py", line 23, in <module>
    from shapely.geometry import Polygon
  File "D:\ProgramData\Anaconda3\envs\cnocr\lib\site-packages\shapely\geometry\__init__.py", line 4, in <module>
    from .base import CAP_STYLE, JOIN_STYLE
  File "D:\ProgramData\Anaconda3\envs\cnocr\lib\site-packages\shapely\geometry\base.py", line 19, in <module>
    from shapely.coords import CoordinateSequence
  File "D:\ProgramData\Anaconda3\envs\cnocr\lib\site-packages\shapely\coords.py", line 8, in <module>
    from shapely.geos import lgeos
  File "D:\ProgramData\Anaconda3\envs\cnocr\lib\site-packages\shapely\geos.py", line 154, in <module>
    _lgeos = CDLL(os.path.join(sys.prefix, 'Library', 'bin', 'geos_c.dll'))
  File "D:\ProgramData\Anaconda3\envs\cnocr\lib\ctypes\__init__.py", line 364, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 126] 找不到指定的模块。

Process finished with exit code 1
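WinError 126 means "The specified module could not be found." The traceback shows that shapely is present in the environment but cannot load the GEOS library it depends on (geos_c.dll), so the shapely installation is effectively unusable. Installing shapely from conda-forge, which provides the GEOS library along with it, fixes this: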
conda install -c conda-forge shapely
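Before rerunning the full example, it can help to confirm that the packages now import cleanly. The following is a minimal sketch; check_import is a hypothetical helper, not part of cnstd or shapely.

```python
import importlib


def check_import(module_name: str):
    """Try to import a module and report whether it loaded.

    Returns (True, "") on success, or (False, "<ErrorType>: <message>") on failure.
    OSError is caught as well because a missing DLL surfaces as OSError
    rather than ImportError, as in the traceback above.
    """
    try:
        importlib.import_module(module_name)
    except (ImportError, OSError) as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, ""


# Check the packages used in this walkthrough.
for name in ("shapely.geometry", "cnstd", "cnocr"):
    ok, error = check_import(name)
    status = "OK" if ok else f"FAILED ({error})"
    print(f"{name}: {status}")
```

If shapely.geometry still reports an OSError, the GEOS library is most likely still not being found in the active environment.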
Once the installation succeeds, run the program again. It produces the following output:
ocr result: (['阿', '国'], 0.28529179096221924)
ocr result: (['彰', '华', '荟', '业'], 0.2765953242778778)
ocr result: (['伸', '展', '自', '由'], 0.7053608894348145)
ocr result: (['4', '1', '创', '鑫', 'Y', '平', 'Y'], 0.2013929933309555)
ocr result: (['D', 'N', 'N', '1', 'T'], 0.2209479957818985)
ocr result: (['甲', '目', '遥', '申', 'l', '兵', '煜', '冰', '孚'], 0.23983563482761383)
ocr result: (['S', 'P', '7', 'R', 'T', '5'], 0.5269754528999329)
ocr result: (['5', 'R', 'E', '5'], 0.17251747846603394)
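As you can see, the recognition accuracy is not very high. This is related to the following parameter:
rotated_bbox: whether to detect rotated (angled) text boxes; defaults to True.
Since this image mainly contains text in horizontal boxes, we set rotated_bbox to False here, which also reduces the model's computation. The code becomes: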
from cnstd import CnStd
from cnocr import CnOcr

# Only detect horizontal text boxes.
std = CnStd(rotated_bbox=False)
cn_ocr = CnOcr()

box_infos = std.detect('../examples/lining.png', resized_shape=(768, 1024))
print(len(box_infos['detected_texts']))

# Recognize each detected text line from its cropped image.
for box_info in box_infos['detected_texts']:
    cropped_img = box_info['cropped_img']
    ocr_res = cn_ocr.ocr_for_single_line(cropped_img)
    print('ocr result: %s' % str(ocr_res))
This time the output is:
ocr result: (['中', '国'], 0.2543279230594635)
ocr result: (['不', '受', '束', '缚'], 0.4928607642650604)
ocr result: (['伸', '展', '自', '中'], 0.4091264307498932)
ocr result: (['宽', '松', '版', '型', '设', '计'], 0.5022301077842712)
ocr result: (['L', 'I', 'F', 'N', 'I', 'N', 'G'], 0.22028349339962006)
ocr result: (['宽', '松', '舒', '适', '伸', '展', '自', '由'], 0.7746936082839966)
ocr result: (['G', 'P', 'R', 'T', '令'], 0.2853994071483612)
ocr result: (['R', '分'], 0.20073305070400238)
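As you can see, recognition is now considerably better: lines such as 不受束缚, 宽松版型设计 and 宽松舒适伸展自由 come out as coherent phrases, although several lines still have low confidence scores. This model deserves a thumbs up!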
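Each result printed above is a tuple of a character list and a confidence score. As a minimal sketch for making such output easier to read, the hypothetical helper join_lines below joins the characters into strings and drops low-confidence lines. The 0.5 threshold is an arbitrary value chosen for illustration, not a recommendation from cnocr.

```python
def join_lines(results, min_score=0.0):
    """Join per-character OCR output into strings.

    results: iterable of (chars, score) tuples, the shape printed above.
    min_score: lines scoring below this threshold are dropped.
    Returns a list of (text, score) pairs.
    """
    lines = []
    for chars, score in results:
        if score >= min_score:
            lines.append(("".join(chars), score))
    return lines


# Two entries copied from the second run's output above.
sample = [
    (['中', '国'], 0.2543279230594635),
    (['宽', '松', '舒', '适', '伸', '展', '自', '由'], 0.7746936082839966),
]

# Keep only lines at or above the illustrative 0.5 threshold.
for text, score in join_lines(sample, min_score=0.5):
    print(f"{score:.2f}  {text}")
```

In the test program, the values returned by cn_ocr.ocr_for_single_line could be collected into a list and passed to this helper in the same way.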