手把手教你,一个案例学会用Matlab App Designer设计文字识别工具(附源码)

一、前言

有时候在读电子文档的过程中,往往会遇到图片形式的文本,想要复制下来,记个笔记甚是不便,需要对照着打字输入,活生生被逼成键盘侠啊......

image

被逼无奈,何不自己造个轮子,开发一款自己专属的文字识别工具呢,于是我们找到了Matlab App Designer。

玩过 Matlab 的朋友们都知道,构建图形用户界面,Matlab提供了两种工具,一是用guide构建,俗称GUI,在未来版本中会移除;二是用App Designer,俗称App,这是官方推荐的,也是以后主流的框架。

今天我们就通过一个简单案例来介绍如何利用App设计一个图片文字识别工具。

搭建的方式主要有两种:

  1. App设计器:灵活、方便、简单,现代化方法;

  2. 基于uifigure的编程方式:灵活、重构方便,适合构建复杂、大型的图形用户界面,原始社会方法。

这里我们就以编程方式进行创建。

二、预备

1. API接口

文字识别涉及到光学字符识别(Optical Character Recognition,OCR)技术,如果我们自己造这种底层的轮子,要有高精度的识别率,那估计累得够呛。

幸运的是市场上已经有成熟的工具了,如百度智能云、阿里云、科大讯飞等均提供了API接口,只需借过来用就完事。这里主要以百度智能云提供的文字识别API为例。

免费申请文字识别功能后,在控制台可以查看到API KeySecret Key,由这两个参数可以获得access_token,它是调用API接口的必需参数(如下图红色方框所示)。

image

通过查看文字识别的技术文档,我们可以得到通用文字识别(标准版)的请求接口,如下:

HTTP 方法POST

请求URLhttps://aip.baidubce.com/rest/2.0/ocr/v1/general_basic

URL参数:属性名:access_token,值:通过API KeySecret Key获取的access_token,参考“Access Token获取”

Header:属性名:Content-Type,值:application/x-www-form-urlencoded

请求参数:属性名:image,值:图像数据,base64编码后进行urlencode,要求base64编码和urlencode后大小不超过4M,最短边至少15px,最长边最大4096px,支持jpg/jpeg/png/bmp格式

返回参数:属性名:words_result,值:识别结果数组

关于具体的HTTP请求过程接下来会细聊。

2. 图像的Base64编码

Base64是网络上最常见的用于传输8Bit字节码的编码方式之一,它是包括小写字母a-z、大写字母A-Z、数字0-9、符号+/共64个字符的字符集,等号=用来作为后缀用途。任何符号都可以转换成这个字符集中的字符,该转换过程就叫做Base64编码。Base64编码具有不可读性,需要解码后才能阅读。

许多编程语言都提供了现成的Base64编码库函数,Matlab也不例外,大家不妨 help matlab.net.base64encode查看细节。

下面提供三种Matlab中的实现方式:

  • Java类---org.apache.commons.codec.binary.Base64 和 matlab.net.base64encode

 function base64string = img2base64(fileName)
%IMG2BASE64 Coding an image to base64 file
% INPUTS:
% fileName string, an image file name
% OUTPUTS:
% base64string string, the input image's base64 code
% USAGE:
% >>base64string = img2base64('1.jpg')
% >>base64string = 'xxx'
%
try
fid = fopen(fileName, 'rb');
bytes = fread(fid);
fclose(fid);
% -------------------------------------------
% First method
% -------------------------------------------
encoder = org.apache.commons.codec.binary.Base64;
base64string = char(encoder.encode(bytes))';
% -------------------------------------------
% Second method
% -------------------------------------------
% base64string = matlab.net.base64encode(bytes);
catch
disp('The file does not exist!');
base64string = '';
end % end try
end % end function

  • 使用Python base64模块

Matlab中可以直接使用Python,那Python中提供的模块base64就可以直接使用了,源代码如下:

 function base64string = img2base64_(fileName)
%IMG2BASE64 Coding an image to base64 file
% INPUTS:
% fileName string, an image file name
% OUTPUTS:
% base64string string, the input image's base64 code
% USAGE:
% >>base64string = img2base64('1.jpg')
% >>base64string = 'xxx'
%
try
f = py.open(fileName, 'rb');
bytes = f.read();
f.close();
temp = char(py.base64.b64encode(bytes));
temp = regexp(temp, '(?<=b'').+(?='')', 'match');
base64string = temp{1};
catch
disp('The file does not exist!');
base64string = '';
end % end try
end % end function

我们可以对如下所示的同一张图片(500 x 500)进行base64编码,比较一下编码速度:

image

结果:'/9j/4AAQSkZ...AAAAAAD/9k='

  • Java类---org.apache.commons.codec.binary.Base64 ⏲ 0.000783 秒
  • matlab.net.base64encode ⏲ 0.017589 秒
  • Python base64模块 ⏲ 0.000709 秒

可以发现使用Java类和Python base64模块的方法,速度相当,而使用matlab.net.base64encode速度要慢20多倍,但编码一张大小为500 x 500的图像耗时0.02秒左右,其速度是非常之快了。

综合一下,我们推荐使用org.apache.commons.codec.binary.Base64类进行base64编码。

3. 屏幕截图

识别扫描版pdf文档、视频教程等中的文字时,我们需要对待识别文字所在区域截个图,保存为图像再进行后续识别操作。要实现上述过程,首先需要对屏幕进行截图,Matlab通过借助java.awt.Robot这个Java类来实现,截屏源代码如下所示:

 function imgData = screenSnipping
%screenSnipping Capturel full-screen to an image
% Output:
% imgData, uint8, image data.
% Source code from: https://www.mathworks.com/support/search.html/answers/362358-how-do-i-take-a-screenshot-using-matlab.html?fq=asset_type_name:answer%20category:matlab/audio-and-video&page=1
% Modified: Qingpinwangzi
% Date: Apr 14, 2021.

% Take screen capture
robo = java.awt.Robot;
tk = java.awt.Toolkit.getDefaultToolkit();
rectSize = java.awt.Rectangle(tk.getScreenSize());
cap = robo.createScreenCapture(rectSize);

% Convert to an RGB image
rgb = typecast(cap.getRGB(0, 0, cap.getWidth, cap.getHeight, [], 0, cap.getWidth), 'uint8');
imgData = zeros(cap.getHeight, cap.getWidth, 3, 'uint8');
imgData(:, :, 1) = reshape(rgb(3:4:end), cap.getWidth, [])';
imgData(:, :, 2) = reshape(rgb(2:4:end), cap.getWidth, [])';
imgData(:, :, 3) = reshape(rgb(1:4:end), cap.getWidth, [])';
end

4. 调用百度API识别文字

上述第1节中我们提到过,access_token是调用API接口的必需参数。通过阅读技术文档得知,需要API KeySecret Key进行http请求就可以获得,核心代码如下:

 url = ['https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials&client_id=', apiKey, '&client_secret=', secretKey];
res = webread(url, options);
access_token = res.access_token;

有了access_token我们就可以调用文字识别API进行文字识别了,这里再分享下识别文字的源代码:

 function result = getWordsByBaiduOCR(fileName, apiKey, secretKey, accessToken, apiURL, outType)
%GETWORDSBYBAIDUOCR return recognition words
% INPUTS:
% fileName string, an image file name
% apiKey string, the API Key of the application
% secretKey string, The Secret Key of the application
% accessToken string, default is '', get the Access Token by API
% Key and Secret Key.
% apiURL string, such as:
% 'https://aip.baidubce.com/rest/2.0/ocr/v1/accurate'
% 'https://aip.baidubce.com/rest/2.0/ocr/v1/accurate_basic'
% 'https://aip.baidubce.com/rest/2.0/ocr/v1/general_basic'
% outType, 'MultiLine|SingleLine'
% OUTPUTS:
% result []|struct
% USAGE:
% >>result = getWordsByBaiduOCR(fileName, apiKey, secretKey, accessToken, apiURL)
% Date: Mar 18, 2021.
% Author: 清贫王子
%

options = weboptions('RequestMethod', 'post');
if isempty(outType)
outType = 'MultiLine';
end

if isempty(accessToken)
url = ['https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials&client_id=', apiKey, '&client_secret=', secretKey];
res = webread(url, options);
access_token = res.access_token;
else
access_token = accessToken;
end % end if

url = [apiURL, '?access_token=', access_token];
options.HeaderFields = { 'Content-Type', 'application/x-www-form-urlencoded'};
imgBase64String = img2base64(fileName);
if isempty(imgBase64String)
result = '';
return
end % end if
res = webwrite(url, 'image', imgBase64String, options);
wordsRsult = res.words_result;

data.ocrResultChar = '';

if strcmp(outType, 'SingleLine')
for ii = 1 : size(wordsRsult, 1)
data.ocrResultChar = [data.ocrResultChar, wordsRsult(ii,1).words];
end % end for

elseif strcmp(outType, 'MultiLine')
for ii = 1 : size(wordsRsult, 1)
data.ocrResultChar{ii} = wordsRsult(ii,1).words;
end % end for
end
result = data.ocrResultChar;
end % end function

简单测试下这个函数,输入下面所示的图片,我们进行图片(截图地址:https://ww2.mathworks.cn/products/matlab/app-designer.html)中的文字识别。

image

 >> result =

1×7 cell 数组

列 1 至 4

{'App设计工具帮助您…'} {'开发专业背景。您只…'} {'面(GUI)设计布局,…'} {'编程。'}

列 5 至 7

{'要共享App,您可以使…'} {' MATLAB Compile…'} {'桌面App或 Web App'}

result{1}

ans =

'App设计工具帮助您创建专业的App,同时并不要求软件'

识别结果中共有7个cell,代表识别了图片中的7行文字,即1个cell对应1行识别的文字,如result{1}的结果。

三、工具搭建

以基于uifigure的编程方式创建APP,我们推荐面向对象(OOP)方法编程,简单起见,这里主要封装一个类来实现所需的功能。当然更标准的做法是利用MVC等设计模式将界面和逻辑分离,能达到对扩展开放,对修改封闭的软件设计原则。

1. 功能需求

我们的功能需求非常简单,主要有以下两个功能:

  • 识别已经存在的图像中的文字

  • 识别扫描版pdf文档、视频教程等中的文字

实现第1个功能,我们只需要加载图像,然后调用识别函数进行识别,将识别结果显示到文本区域就可以了;而实现第2个功能,首先需要屏幕截图,选取待识别文字所在的区域,存储为图像,后续处理和实现第1个功能的一样。

根据上述描述,我们需要的控件有:加载图像按钮,截图按钮,图像显示器,识别结果显示文本域。另外,需要一个清理按钮,用于清除显示的图像和识别结果;还需要一个设置按钮,用于配置API KeySecret Key

便于叙述,我们先展示下最终设计的结果,如下图所示:

image
image
文字识别工具主界面 设置界面

在设置界面中,需要两个标签和两个文本框,两外需要两个按钮。据此,我们需要的控件都清楚了,接下来让我们一起来创建他们吧!

2. 实现细节

主要封装一个类来实现所需的功能,我们给这个类起个名:ReadWords,这个类需要继承matlab.apps.AppBase,它的属性就是界面中的所有控件,那么这个类看上去应该是这样的:

 classdef ReadWords < matlab.apps.AppBase
%%
properties
UIFig matlab.ui.Figure

ContainerForMain matlab.ui.container.GridLayout
ThisTB matlab.ui.container.Toolbar
SnippingToolBtn matlab.ui.container.toolbar.PushTool
ImgLoadToolBtn matlab.ui.container.toolbar.PushTool
SetupToolBtn matlab.ui.container.toolbar.PushTool
CleanToolBtn matlab.ui.container.toolbar.PushTool
ImgShow matlab.ui.control.Image
WordsShowTA matlab.ui.control.TextArea

ContainerForSetup matlab.ui.container.GridLayout
APIKeyText matlab.ui.control.EditField
SecrectKeyText matlab.ui.control.EditField
ResetBtn matlab.ui.control.Button
SaveBtn matlab.ui.control.Button
end % end properties

%%
properties(Hidden, Dependent)
APIKeyVal
SecrectKeyVal
end % end properties

%%
properties(Access = protected)
HasSetup = false
end % end properties

end % end classdef

下面说明下一些重要的属性

公有属性:

  • UIFig 必须是matlab.ui.Figure类的属性,通过uifigure构造,这是整个工具的主窗口

  • ContainerForMain 必须是matlab.ui.container.GridLayout类的属性,通过uigridlayout构造,这是主窗口的布局容器

  • ThisTB 必须是matlab.ui.container.Toolbar类的属性,通过uitoolbar构造,这是工具栏的容器,用于放置SnippingToolBtnImgLoadToolBtnSetupToolBtnCleanToolBtn这4个工具按钮

  • ImgShow 必须是matlab.ui.control.Image类的属性,通过uiimage构造,用于显示加载或者截图后的图像

  • WordsShowTA 必须是matlab.ui.control.TextArea类的属性,通过uitextarea构造,用于显示文字识别结果

  • ContainerForSetup 设置界面中的网格容器

  • APIKeyTextSecrectKeyText 主要用于输入APIKeySecrectKey

  • ResetBtnSaveBtn两个按钮分别用来实现重置和保存APIKeySecrectKey

从属、隐藏属性:

  • APIKeyVal 用于接收APIKeyText中输入的APIKey的值

  • SecrectKeyVal 用于接收SecrectKeyText中输入的SecrectKey的值

受保护属性:

  • HasSetup 用于标识是否配置了APIKeySecrectKey,默认为false

至此,我们设置好了所有的属性,然后进行构造方法、析构方法以及类方法的编写。

加上构造方法、析构方法以及从属属性APIKeyValSecrectKeyValget方法的代码后看上去是这样的:

 classdef ReadWords < matlab.apps.AppBase
%%
properties
UIFig matlab.ui.Figure

ContainerForMain matlab.ui.container.GridLayout
ThisTB matlab.ui.container.Toolbar
SnippingToolBtn matlab.ui.container.toolbar.PushTool
ImgLoadToolBtn matlab.ui.container.toolbar.PushTool
SetupToolBtn matlab.ui.container.toolbar.PushTool
CleanToolBtn matlab.ui.container.toolbar.PushTool
ImgShow matlab.ui.control.Image
WordsShowTA matlab.ui.control.TextArea

ContainerForSetup matlab.ui.container.GridLayout
APIKeyText matlab.ui.control.EditField
SecrectKeyText matlab.ui.control.EditField
ResetBtn matlab.ui.control.Button
SaveBtn matlab.ui.control.Button
end % end properties

%%
properties(Hidden, Dependent)
APIKeyVal
SecrectKeyVal
end % end properties

%%
properties(Access = protected)
HasSetup = false
end % end properties

%%
methods
% --------------------------------------
% % Constructor
% --------------------------------------
function app = ReadWords
% Create UIFigure and components
app.buildApp();
% Register the app with App Designer
registerApp(app, app.UIFig)

if nargout == 0
clear app
end
end % end Constructor

% --------------------------------------
% % Destructor
% --------------------------------------
% Code that executes before app deletion
function delete(app)
% Delete UIFigure when app is deleted
delete(app.UIFig)
end % end Constructor

% --------------------------------------
% % Get/Set methods
% --------------------------------------
% get.APIKeyVal
function apiKeyVal = get.APIKeyVal(app)
apiKeyVal = app.APIKeyText.Value;
end

% get.SecrectKeyVal
function secrectKeyVal = get.SecrectKeyVal(app)
secrectKeyVal = app.SecrectKeyText.Value;
end
end % end methods
end % end classdef

析构方法(Destructor)的写法是固定的,构造方法中的registerApp(app, app.UIFig)也是固定的,另外的buildApp()方法就用来创建界面、注册各个控件。

我们将后续的方法都创建为私有方法,添加了buildApp()方法后的整个ReadWords类是下面这样的:

 classdef ReadWords < matlab.apps.AppBase
%%
properties
UIFig matlab.ui.Figure

ContainerForMain matlab.ui.container.GridLayout
ThisTB matlab.ui.container.Toolbar
SnippingToolBtn matlab.ui.container.toolbar.PushTool
ImgLoadToolBtn matlab.ui.container.toolbar.PushTool
SetupToolBtn matlab.ui.container.toolbar.PushTool
CleanToolBtn matlab.ui.container.toolbar.PushTool
ImgShow matlab.ui.control.Image
WordsShowTA matlab.ui.control.TextArea

ContainerForSetup matlab.ui.container.GridLayout
APIKeyText matlab.ui.control.EditField
SecrectKeyText matlab.ui.control.EditField
ResetBtn matlab.ui.control.Button
SaveBtn matlab.ui.control.Button
end % end properties

%%
properties(Hidden, Dependent)
APIKeyVal
SecrectKeyVal
end % end properties

%%
properties(Access = protected)
HasSetup = false
end % end properties

%%
methods
% --------------------------------------
% % Constructor
% --------------------------------------
function app = ReadWords
% Create UIFigure and components
app.buildApp();
% Register the app with App Designer
registerApp(app, app.UIFig)

if nargout == 0
clear app
end
end % end Constructor

% --------------------------------------
% % Destructor
% --------------------------------------
% Code that executes before app deletion
function delete(app)
% Delete UIFigure when app is deleted
delete(app.UIFig)
end % end Constructor

% --------------------------------------
% % Get/Set methods
% --------------------------------------
% get.APIKeyVal
function apiKeyVal = get.APIKeyVal(app)
apiKeyVal = app.APIKeyText.Value;
end

% get.SecrectKeyVal
function secrectKeyVal = get.SecrectKeyVal(app)
secrectKeyVal = app.SecrectKeyText.Value;
end
end % end methods

%%
methods(Access = private)
% buildApp
function buildApp(app)
%
% --------------------------------------
% % Main Figure
% --------------------------------------
app.UIFig = uifigure();
app.UIFig.Icon = 'icons/img2text.png';
app.UIFig.Name = 'ReadWords';
app.UIFig.Visible = 'off';
app.UIFig.Position = [app.UIFig.Position(1), app.UIFig.Position(2), 745, 420];
app.UIFig.AutoResizeChildren = 'on';
app.UIFig.Units = 'Normalized';
app.setAutoResize(app.UIFig, true);

% --------------------------------------
% % Toolbar
% --------------------------------------
app.ThisTB = uitoolbar(app.UIFig);
% SetupToolBtn
app.SetupToolBtn = uipushtool(app.ThisTB);
app.SetupToolBtn.Icon = 'icons/setup.png';
app.SetupToolBtn.Tooltip = 'Setup';

% SnippingToolBtn
app.SnippingToolBtn = uipushtool(app.ThisTB);
app.SnippingToolBtn.Icon = 'icons/snip.png';
app.SnippingToolBtn.Tooltip = 'Screenshot';

% ImgLoadToolBtn
app.ImgLoadToolBtn = uipushtool(app.ThisTB);
app.ImgLoadToolBtn.Icon = 'icons/load.png';
app.ImgLoadToolBtn.Tooltip = 'Load image';

% CleanToolBtn
app.CleanToolBtn = uipushtool(app.ThisTB);
app.CleanToolBtn.Icon = 'icons/clean.png';
app.CleanToolBtn.Tooltip = 'Clean';

% --------------------------------------
% % ContainerForMain
% --------------------------------------
app.ContainerForMain = uigridlayout(app.UIFig, [1, 2]);

% ContainerForMain
imgShowPanel = uipanel(app.ContainerForMain, 'Title', 'Original');
resultShowPanel = uipanel(app.ContainerForMain, 'Title', 'Result');
% ImgShow
imgShowPanelLay = uigridlayout(imgShowPanel, [1, 1]);
imgShowPanelLay.RowSpacing = 0;
imgShowPanelLay.ColumnSpacing = 0;
app.ImgShow = uiimage(imgShowPanelLay);
% WordsShowTA
resultShowPanelLay = uigridlayout(resultShowPanel, [1, 1]);
resultShowPanelLay.RowSpacing = 0;
resultShowPanelLay.ColumnSpacing = 0;
app.WordsShowTA = uitextarea(resultShowPanelLay);
app.WordsShowTA.FontSize = 22;

% --------------------------------------
% % ContainerForSetup
% --------------------------------------
app.ContainerForSetup = uigridlayout(app.UIFig, [4, 3]);
app.ContainerForSetup.RowHeight = {22, 22, 22, '1x'};
app.ContainerForSetup.ColumnWidth = {'1x', '1x', '2.5x'};
app.ContainerForSetup.Visible = 'off';
apiKeyLabel = uilabel(app.ContainerForSetup, 'Text', 'API Key');
apiKeyLabel.HorizontalAlignment = 'right';
apiKeyLabel.Layout.Row = 1;
apiKeyLabel.Layout.Column = 1;
% APIKeyText
app.APIKeyText = uieditfield(app.ContainerForSetup);
app.APIKeyText.Layout.Row = 1;
app.APIKeyText.Layout.Column = 2;
secrectKeyLabel = uilabel(app.ContainerForSetup, 'Text', 'Secrect Key');
secrectKeyLabel.HorizontalAlignment = 'right';
secrectKeyLabel.Layout.Row = 2;
secrectKeyLabel.Layout.Column = 1;
% SecrectKeyText
app.SecrectKeyText = uieditfield(app.ContainerForSetup);
app.SecrectKeyText.Layout.Row = 2;
app.SecrectKeyText.Layout.Column = 2;
% ResetBtn
app.ResetBtn = uibutton(app.ContainerForSetup, 'Text', 'Reset');
app.ResetBtn.Layout.Row = 3;
app.ResetBtn.Layout.Column = 1;
% SaveBtn
app.SaveBtn = uibutton(app.ContainerForSetup, 'Text', 'Save');
app.SaveBtn.Layout.Row = 3;
app.SaveBtn.Layout.Column = 2;
% Set visibility for UIFig
movegui(app.UIFig, 'center');
app.UIFig.Visible = 'on';

% --------------------------------------
% % RunstartupFcn
% --------------------------------------
app.runStartupFcn(@startupFcn);

end % end buildApp
end % methods
end % end classdef

需要注意的是,工具栏按钮和窗口的图标来源于:https://www.easyicon.cc/。一些常见的图标素材都可以从中免费下载。我们已经将图标下载完毕,需要的朋友可以点击下方链接来下载:

链接:https://pan.baidu.com/s/11kIvt4SX-MhQ2ltEeC18ZA 提取码:5i3k

另外,app.runStartupFcn(@startupFcn);语句调用的是父类matlab.apps.AppBase的方法,我们将各个控件的注册任务放在startupFcn这个方法中完成。这里不妨先注释掉这个语句,直接运行ReadWords.m便可以显示出我们刚才在buildApp方法中构造的界面了,动图演示如下:

image

可以看到,我们在点击工具栏各个按钮时,没有反应,这是因为到目前为止我们还没有给各个控件注册回调方法,那接下来将会在startupFcn这个方法中完成各个控件的注册任务,代码如下:

 classdef ReadWords < matlab.apps.AppBase
%%
properties
UIFig matlab.ui.Figure

ContainerForMain matlab.ui.container.GridLayout
ThisTB matlab.ui.container.Toolbar
SnippingToolBtn matlab.ui.container.toolbar.PushTool
ImgLoadToolBtn matlab.ui.container.toolbar.PushTool
SetupToolBtn matlab.ui.container.toolbar.PushTool
CleanToolBtn matlab.ui.container.toolbar.PushTool
ImgShow matlab.ui.control.Image
WordsShowTA matlab.ui.control.TextArea

ContainerForSetup matlab.ui.container.GridLayout
APIKeyText matlab.ui.control.EditField
SecrectKeyText matlab.ui.control.EditField
ResetBtn matlab.ui.control.Button
SaveBtn matlab.ui.control.Button
end % end properties

%%
properties(Hidden, Dependent)
APIKeyVal
SecrectKeyVal
end % end properties

%%
properties(Access = protected)
HasSetup = false
end % end properties

%%
methods
% --------------------------------------
% % Constructor
% --------------------------------------
function app = ReadWords
% Create UIFigure and components
app.buildApp();
% Register the app with App Designer
registerApp(app, app.UIFig)

if nargout == 0
clear app
end
end % end Constructor

% --------------------------------------
% % Destructor
% --------------------------------------
% Code that executes before app deletion
function delete(app)
% Delete UIFigure when app is deleted
delete(app.UIFig)
end % end Constructor

% --------------------------------------
% % Get/Set methods
% --------------------------------------
% get.APIKeyVal
function apiKeyVal = get.APIKeyVal(app)
apiKeyVal = app.APIKeyText.Value;
end

% get.SecrectKeyVal
function secrectKeyVal = get.SecrectKeyVal(app)
secrectKeyVal = app.SecrectKeyText.Value;
end
end % end methods

%%
methods(Access = private)
% buildApp
function buildApp(app)
%
% --------------------------------------
% % Main Figure
% --------------------------------------
app.UIFig = uifigure();
app.UIFig.Icon = 'icons/img2text.png';
app.UIFig.Name = 'ReadWords';
app.UIFig.Visible = 'off';
app.UIFig.Position = [app.UIFig.Position(1), app.UIFig.Position(2), 745, 420];
app.UIFig.AutoResizeChildren = 'on';
app.UIFig.Units = 'Normalized';
app.setAutoResize(app.UIFig, true);

% --------------------------------------
% % Toolbar
% --------------------------------------
app.ThisTB = uitoolbar(app.UIFig);
% SetupToolBtn
app.SetupToolBtn = uipushtool(app.ThisTB);
app.SetupToolBtn.Icon = 'icons/setup.png';
app.SetupToolBtn.Tooltip = 'Setup';

% SnippingToolBtn
app.SnippingToolBtn = uipushtool(app.ThisTB);
app.SnippingToolBtn.Icon = 'icons/snip.png';
app.SnippingToolBtn.Tooltip = 'Screenshot';

% ImgLoadToolBtn
app.ImgLoadToolBtn = uipushtool(app.ThisTB);
app.ImgLoadToolBtn.Icon = 'icons/load.png';
app.ImgLoadToolBtn.Tooltip = 'Load image';

% CleanToolBtn
app.CleanToolBtn = uipushtool(app.ThisTB);
app.CleanToolBtn.Icon = 'icons/clean.png';
app.CleanToolBtn.Tooltip = 'Clean';

% --------------------------------------
% % ContainerForMain
% --------------------------------------
app.ContainerForMain = uigridlayout(app.UIFig, [1, 2]);

% ContainerForMain
imgShowPanel = uipanel(app.ContainerForMain, 'Title', 'Original');
resultShowPanel = uipanel(app.ContainerForMain, 'Title', 'Result');
% ImgShow
imgShowPanelLay = uigridlayout(imgShowPanel, [1, 1]);
imgShowPanelLay.RowSpacing = 0;
imgShowPanelLay.ColumnSpacing = 0;
app.ImgShow = uiimage(imgShowPanelLay);
% WordsShowTA
resultShowPanelLay = uigridlayout(resultShowPanel, [1, 1]);
resultShowPanelLay.RowSpacing = 0;
resultShowPanelLay.ColumnSpacing = 0;
app.WordsShowTA = uitextarea(resultShowPanelLay);
app.WordsShowTA.FontSize = 22;

% --------------------------------------
% % ContainerForSetup
% --------------------------------------
app.ContainerForSetup = uigridlayout(app.UIFig, [4, 3]);
app.ContainerForSetup.RowHeight = {22, 22, 22, '1x'};
app.ContainerForSetup.ColumnWidth = {'1x', '1x', '2.5x'};
app.ContainerForSetup.Visible = 'off';
apiKeyLabel = uilabel(app.ContainerForSetup, 'Text', 'API Key');
apiKeyLabel.HorizontalAlignment = 'right';
apiKeyLabel.Layout.Row = 1;
apiKeyLabel.Layout.Column = 1;
% APIKeyText
app.APIKeyText = uieditfield(app.ContainerForSetup);
app.APIKeyText.Layout.Row = 1;
app.APIKeyText.Layout.Column = 2;
secrectKeyLabel = uilabel(app.ContainerForSetup, 'Text', 'Secrect Key');
secrectKeyLabel.HorizontalAlignment = 'right';
secrectKeyLabel.Layout.Row = 2;
secrectKeyLabel.Layout.Column = 1;
% SecrectKeyText
app.SecrectKeyText = uieditfield(app.ContainerForSetup);
app.SecrectKeyText.Layout.Row = 2;
app.SecrectKeyText.Layout.Column = 2;
% ResetBtn
app.ResetBtn = uibutton(app.ContainerForSetup, 'Text', 'Reset');
app.ResetBtn.Layout.Row = 3;
app.ResetBtn.Layout.Column = 1;
% SaveBtn
app.SaveBtn = uibutton(app.ContainerForSetup, 'Text', 'Save');
app.SaveBtn.Layout.Row = 3;
app.SaveBtn.Layout.Column = 2;
% Set visibility for UIFig
movegui(app.UIFig, 'center');
app.UIFig.Visible = 'on';

% --------------------------------------
% % RunstartupFcn
% --------------------------------------
app.runStartupFcn(@startupFcn);
end % end buildApp

% startupFcn
function startupFcn(app, ~, ~)
% Setup APIKeyText and SecrectKeyText
if exist('apikey.mat', 'file')
temp = load('apikey.mat');
app.APIKeyText.Value = temp.key.apiKeyVal;
app.APIKeyText.Editable = 'off';
app.SecrectKeyText.Value = temp.key.secrectKeyVal;
app.SecrectKeyText.Editable = 'off';
end

% Register callback
app.SnippingToolBtn.ClickedCallback = @app.clickedSnippingToolBtn;
app.ImgLoadToolBtn.ClickedCallback = @app.clickedImgLoadToolBtn;
app.SetupToolBtn.ClickedCallback = @app.clickedSetupToolBtn;
app.CleanToolBtn.ClickedCallback = @app.clickedCleanToolBtn;

app.ResetBtn.ButtonPushedFcn = @app.callbackResetBtn;
app.SaveBtn.ButtonPushedFcn = @app.callbackSaveBtn;
end % end function
end % methods
end % end classdef

由此,我们总共为6个按钮注册了6个回调方法,需要都进行实现,不然触发按钮时,该按钮不会做出响应。简单起见,这里我们以实现设置界面中的SaveBtn的回调方法callbackSaveBtn为例子来说明。

在没有设置APIKeySecrectKey前,触发SnippingToolBtn或者ImgLoadToolBtn会有先进行设置的提示:

image

callbackSaveBtn方法实现的逻辑:首先由HasSetup属性判断是否进行了APIKeySecrectKey的设置(初始默认是false没有设置),如果没有设置,会提示没有APIKeySecrectKey,则需要输入APIKeySecrectKey的值,然后点击保存按钮,那么后台会将获取到的值存储下来(.mat文件),更新HasSetup的值为true,后续我们就不必要再次输入了,要想更换值的话,点击重置按钮重新配置即可;如果进行了设置(HasSetup属性为true),直接保存即可。

具体的代码如下:

 % --------------------------------------
% % Callback functions
% --------------------------------------
% callbackSaveBtn
function callbackSaveBtn(app, ~, ~)
if ~isempty(app.SecrectKeyText.Value) && ~isempty(app.APIKeyText.Value)
key.apiKeyVal = app.APIKeyText.Value;
key.secrectKeyVal = app.SecrectKeyText.Value;
if exist('apikey.mat', 'file')
delete('apikey.mat');
end
save('apikey.mat', 'key');
!attrib +s +h apikey.mat
uialert(app.UIFig, 'Save successfully!', 'Confirm', 'Icon', 'success');
app.APIKeyText.Editable = 'off';
app.SecrectKeyText.Editable = 'off';
else
uialert(app.UIFig, 'API Key or Secrect Key is empty!', 'Confirm', 'Icon', 'warning');
end % end if
end % callbackSaveBtn

实现了保存按钮的功能后,就可以得到如下动图所示的效果了。

image

其他的回调函数源代码:

 % clickedSnippingToolBtn
function clickedSnippingToolBtn(app, ~, ~)
if ~isempty(app.SecrectKeyText.Value) && ~isempty(app.APIKeyText.Value)
app.UIFig.Visible = 'off';
pause(0.1);
outFileName = 'temp.png';
cropImg(outFileName);
!attrib +s +h temp.png
%
app.ImgShow.ImageSource = imread(outFileName);
app.UIFig.Visible = 'on';
%
apiURL = 'https://aip.baidubce.com/rest/2.0/ocr/v1/accurate_basic';
words = getWordsByBaiduOCR(outFileName, app.APIKeyVal, app.SecrectKeyVal, '', apiURL, 'MultiLine');
app.WordsShowTA.Value = words;
else
msg = {'API Key or Secrect Key is empty!'; 'Please set it up first!'};
uialert(app.UIFig, msg, 'Confirm', 'Icon', 'warning');
end
end % end clickedSnippingToolBtn

% clickedImgLoadToolBtn
function clickedImgLoadToolBtn(app, ~, ~)
if ~isempty(app.SecrectKeyText.Value) && ~isempty(app.APIKeyText.Value)
[fName, fPath] = uigetfile({'.png'; '.jpg'; '.bmp'; '.tif'}, 'Open image');
if ~isequal(any([fName, fPath]), 0)
img = imread(strcat(fPath, fName));
outFileName = 'temp.png';
if exist(outFileName, 'file')
delete(outFileName)
end
imwrite(img, outFileName);
!attrib +s +h temp.png
%
app.ImgShow.ImageSource = imread(outFileName);
app.UIFig.Visible = 'on';
%
apiURL = 'https://aip.baidubce.com/rest/2.0/ocr/v1/accurate_basic';
words = getWordsByBaiduOCR(outFileName, app.APIKeyVal, app.SecrectKeyVal, '', apiURL, 'MultiLine');
app.WordsShowTA.Value = words;
else
return
end % end if
else % end if
msg = {'API Key or Secrect Key is empty!'; 'Please set it up first!'};
uialert(app.UIFig, msg, 'Confirm', 'Icon', 'warning');
end
end % end clickedImgLoadToolBtn

% clickedSetupToolBtn
function clickedSetupToolBtn(app, ~, ~)
if ~app.HasSetup
app.ContainerForMain.Visible = 'off';
app.ContainerForSetup.Visible = 'on';
app.HasSetup = true;
else
app.ContainerForMain.Visible = 'on';
app.ContainerForSetup.Visible = 'off';
app.HasSetup = false;
end
end % end clickedSetupToolBtn

% clickedCleanToolBtn
function clickedCleanToolBtn(app, ~, ~)
app.WordsShowTA.Value = '';
app.ImgShow.ImageSource = '';
end % end clickedCleanToolBtn

% callbackResetBtn
function callbackResetBtn(app, ~, ~)
app.APIKeyText.Value = '';
app.APIKeyText.Editable = 'on';
app.SecrectKeyText.Value = '';
app.SecrectKeyText.Editable = 'on';
end % callbackResetBtn

四、使用演示

现在让我们来测试一下搭建的图像识别工具吧,比如,某麻子同学是一名研究生,在阅读那种扫描版的pdf文献时,想把其中的一段语句复制下来用于记录笔记或者做PPT用,这时我们的工具就派上用场了:

image

刹那间,某麻子同学得到了想要的结果,露出了久违的幸福的一笑!

image

五、结语

至此,我们完成了一个比较完整的文字识别工具!希望您喜欢,并且可以从中获得有用的东西。

本文完整代码,请在GZH内回复“文字识别工具”进行下载。

【往期推荐】

  • 矩阵2-范数化的向量化方法 (qq.com)

  • texStudio主题配置 (qq.com)

  • 送福利啦 (qq.com)

  • 如何用Matlab一键下载B站高清视频(下) (qq.com)

  • Python中的装饰器 (qq.com)

  • MATLAB 风格指南 2.0 (qq.com)

  • 匿名函数(Anonymous Function) (qq.com)

  • 猜猜今天的干货有哪些? (qq.com)

  • 分享爬取Matlab中文论坛基础讨论的源代码 (qq.com)

  • 如何用Matlab一键下载B站高清视频(上) (qq.com)

  • 爬取某学者主页上的文献 (qq.com)

你可能感兴趣的:(手把手教你,一个案例学会用Matlab App Designer设计文字识别工具(附源码))