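A Hands-On Guide: Learn to Build a Text Recognition Tool with Matlab App Designer Through One Example (Source Code Included)

I. Introduction

When reading electronic documents, we often run into text that exists only as an image. Copying it into notes is a pain: you have to retype it while looking at the picture, and you end up a reluctant keyboard warrior......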

image

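With no better option, why not reinvent the wheel and build our own text recognition tool? That is how we arrived at Matlab App Designer.

Anyone who has used Matlab knows that it offers two tools for building graphical user interfaces. One is guide, commonly just called GUI, which will be removed in a future release. The other is App Designer, commonly called App, which is the officially recommended option and the mainstream framework going forward.

Today we walk through a simple example of building an image text recognition tool as an App.

There are two main ways to build one:

App Designer: flexible, convenient, and simple; the modern approach;
Programmatic construction based on uifigure: flexible and easy to refactor, suited to complex, large graphical user interfaces; the stone-age approach.

Here we build the tool programmatically.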

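II. Preparation

1. API Interface

Text recognition relies on optical character recognition (OCR). Building that low-level wheel ourselves, with a high recognition accuracy, would be exhausting.

Fortunately, mature tools already exist: Baidu AI Cloud, Alibaba Cloud, iFLYTEK, and others all provide APIs that we can simply borrow. This article uses the text recognition API from Baidu AI Cloud as the example.

After applying for the text recognition service (free of charge), you can find the API Key and Secret Key in the console. These two values are used to obtain an access_token, which is a required parameter for calling the API (see the red box in the figure below).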

image

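From the technical documentation for text recognition, we get the request interface for general text recognition (standard edition):

HTTP method: POST

Request URL: https://aip.baidubce.com/rest/2.0/ocr/v1/general_basic

URL parameter: name: access_token; value: the access_token obtained with the API Key and Secret Key (see "Obtaining the Access Token" in the Baidu documentation)

Header: name: Content-Type; value: application/x-www-form-urlencoded

Request parameter: name: image; value: the image data, Base64-encoded and then URL-encoded. The size after Base64 encoding and URL encoding must not exceed 4 MB, the shortest side must be at least 15 px, the longest side at most 4096 px, and the jpg/jpeg/png/bmp formats are supported

Response parameter: name: words_result; value: array of recognition results

The HTTP request itself is covered in detail below.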
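
To make "Base64-encoded and then URL-encoded" concrete, here is a minimal sketch. It assumes Matlab's built-in matlab.net.base64encode and urlencode functions; the three byte values are made up for illustration (they are not a real image) and were picked because their Base64 form contains + and /. The webwrite call used later in this article form-encodes its name-value pairs on its own, so this only illustrates what ends up on the wire.

```matlab
% Illustrative bytes only, chosen so the Base64 text contains '+' and '/'
rawBytes = uint8([251 255 190]);

% Step 1: Base64-encode the raw bytes
b64 = matlab.net.base64encode(rawBytes);

% Step 2: URL-encode the Base64 text ('+' and '/' are not URL-safe)
encoded = urlencode(b64);

% Step 3: build the form body and check it against the 4 MB limit
formBody = ['image=', encoded];
withinLimit = numel(formBody) <= 4 * 1024^2;

fprintf('Base64: %s\nURL-encoded: %s\nWithin limit: %d\n', b64, encoded, withinLimit);
```

The same three bytes grow from 4 Base64 characters to 12 characters once URL-encoded, which is why the size limit is stated for the encoded form.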

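2. Base64 Encoding of Images

Base64 is one of the most common encodings for transmitting 8-bit byte data over the network. Its character set has 64 characters: lowercase a-z, uppercase A-Z, digits 0-9, and the symbols + and /; the equals sign = is used as padding at the end. Any byte sequence can be converted into characters from this set, and that conversion is called Base64 encoding. Base64-encoded data is not human-readable and must be decoded before it can be read.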
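
A short round trip shows the character set and the = padding in action. This is a minimal sketch using matlab.net.base64encode and matlab.net.base64decode; the input strings are arbitrary examples.

```matlab
% Three bytes map to exactly four Base64 characters, no padding needed
encoded = matlab.net.base64encode(uint8('Man'));

% Two bytes leave the last group incomplete, so '=' padding is appended
padded = matlab.net.base64encode(uint8('Ma'));

% Decoding returns the original bytes as uint8
decoded = char(matlab.net.base64decode(encoded));

fprintf('%s | %s | %s\n', encoded, padded, decoded);
```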

Many programming languages ship ready-made Base64 functions, and Matlab is no exception; try help matlab.net.base64encode for details.

Three implementations in Matlab are given below:

The Java class org.apache.commons.codec.binary.Base64, and matlab.net.base64encode

 function base64string = img2base64(fileName)
 %IMG2BASE64 Encode an image file as a Base64 string
 % INPUTS:
 % fileName string, an image file name
 % OUTPUTS:
 % base64string string, the input image's base64 code
 % USAGE:
 % >>base64string = img2base64('1.jpg')
 % >>base64string = 'xxx'
 %
 try
 fid = fopen(fileName, 'rb');
 % Read the raw bytes as uint8 so both encoders receive byte data
 bytes = fread(fid, inf, '*uint8');
 fclose(fid);
 % -------------------------------------------
 % First method
 % -------------------------------------------
 encoder = org.apache.commons.codec.binary.Base64;
 base64string = char(encoder.encode(bytes))';
 % -------------------------------------------
 % Second method
 % -------------------------------------------
 % base64string = matlab.net.base64encode(bytes);
 catch
 % Any failure while opening or reading the file ends up here
 disp('The file does not exist!');
 base64string = '';
 end % end try
 end % end function

Using the Python base64 module

Matlab can call Python directly, so the base64 module provided by Python can be used as is. The source code is:

 function base64string = img2base64_(fileName)
 %IMG2BASE64_ Encode an image file as a Base64 string via Python's base64 module
 % INPUTS:
 % fileName string, an image file name
 % OUTPUTS:
 % base64string string, the input image's base64 code
 % USAGE:
 % >>base64string = img2base64_('1.jpg')
 % >>base64string = 'xxx'
 %
 try
 f = py.open(fileName, 'rb');
 bytes = f.read();
 f.close();
 temp = char(py.base64.b64encode(bytes));
 temp = regexp(temp, '(?<=b'').+(?='')', 'match');
 base64string = temp{1};
 catch
 disp('The file does not exist!');
 base64string = '';
 end % end try
 end % end function

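We can Base64-encode the same image shown below (500 x 500) with each method and compare the encoding speed:

image

Result: '/9j/4AAQSkZ...AAAAAAD/9k='

Java class org.apache.commons.codec.binary.Base64 ⏲ 0.000783 s

matlab.net.base64encode ⏲ 0.017589 s

Python base64 module ⏲ 0.000709 s

The Java class and the Python base64 module are about equally fast, while matlab.net.base64encode is more than 20 times slower. Even so, encoding a 500 x 500 image takes only about 0.02 s, which is still very fast.

All things considered, we recommend the org.apache.commons.codec.binary.Base64 class for Base64 encoding.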
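
A comparison like this can be reproduced with a small timing harness. The sketch below is an assumption about how one might measure it, not the method behind the numbers above: it uses timeit, random bytes as a stand-in for an image file, and wraps the Java path in try/catch because the Apache Commons Codec class may not be available in every Matlab session. The Python path is left out to keep the sketch independent of a Python installation.

```matlab
% Random bytes as a stand-in for the contents of an image file
bytes = uint8(randi([0 255], 1, 250000));

% Time the built-in encoder; timeit runs the call several times
tMatlab = timeit(@() matlab.net.base64encode(bytes));
fprintf('matlab.net.base64encode: %.6f s\n', tMatlab);

% Time the Java encoder if the class can be loaded in this session
try
    encoder = org.apache.commons.codec.binary.Base64;
    tJava = timeit(@() char(encoder.encode(bytes))');
    fprintf('Java Base64 class:       %.6f s\n', tJava);
catch
    disp('Java Base64 class not available in this session.');
end

% Sanity check: encoding followed by decoding returns the input bytes
roundTrip = matlab.net.base64decode(matlab.net.base64encode(bytes));
```

Absolute timings depend on the machine and the Matlab release, so only the relative ordering is worth comparing.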

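3. Screen Capture

To recognize text in scanned PDF documents, video tutorials, and the like, we need to capture the region containing the text and save it as an image for the later recognition step. The first step is to capture the screen, which Matlab does with the help of the Java class java.awt.Robot. The capture code is: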

 function imgData = screenSnipping
 %screenSnipping Capture the full screen to an image
 % Output:
 % imgData, uint8, image data.
 % Source code from: https://www.mathworks.com/support/search.html/answers/362358-how-do-i-take-a-screenshot-using-matlab.html?fq=asset_type_name:answer%20category:matlab/audio-and-video&page=1
 % Modified: Qingpinwangzi
 % Date: Apr 14, 2021. 
 % Take screen capture
 robo = java.awt.Robot;
 tk = java.awt.Toolkit.getDefaultToolkit();
 rectSize = java.awt.Rectangle(tk.getScreenSize());
 cap = robo.createScreenCapture(rectSize); 
 % Convert to an RGB image
 rgb = typecast(cap.getRGB(0, 0, cap.getWidth, cap.getHeight, [], 0, cap.getWidth), 'uint8');
 imgData = zeros(cap.getHeight, cap.getWidth, 3, 'uint8');
 imgData(:, :, 1) = reshape(rgb(3:4:end), cap.getWidth, [])';
 imgData(:, :, 2) = reshape(rgb(2:4:end), cap.getWidth, [])';
 imgData(:, :, 3) = reshape(rgb(1:4:end), cap.getWidth, [])';
 end
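
The index pattern 3:4:end, 2:4:end, 1:4:end in the function above follows from how each pixel is packed. getRGB returns one 32-bit integer per pixel in ARGB order, and typecast on a little-endian machine (which covers the platforms Matlab runs on) exposes those bytes as blue, green, red, alpha. A minimal worked example with one made-up pixel value:

```matlab
% One made-up pixel: alpha = 0xFF, red = 0x11, green = 0x22, blue = 0x33
packed = typecast(uint32(hex2dec('FF112233')), 'int32');  % same type getRGB yields

% Reinterpret the 4 bytes of the int32; little-endian order is [B G R A]
bytes = typecast(packed, 'uint8');

blue  = bytes(1);   % corresponds to rgb(1:4:end) in screenSnipping
green = bytes(2);   % corresponds to rgb(2:4:end)
red   = bytes(3);   % corresponds to rgb(3:4:end)
alpha = bytes(4);   % discarded by screenSnipping

fprintf('R=%d G=%d B=%d A=%d\n', red, green, blue, alpha);
```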

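4. Calling the Baidu API to Recognize Text

As mentioned in Section 1, access_token is a required parameter for calling the API. According to the technical documentation, it is obtained with an HTTP request that carries the API Key and Secret Key. The core code is: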

 url = ['https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials&client_id=', apiKey, '&client_secret=', secretKey];
 res = webread(url, options);
 access_token = res.access_token;

With the access_token in hand, we can call the text recognition API. Here is the source code for recognizing text:

 function result = getWordsByBaiduOCR(fileName, apiKey, secretKey, accessToken, apiURL, outType)
 %GETWORDSBYBAIDUOCR return recognition words
 % INPUTS:
 % fileName string, an image file name
 % apiKey string, the API Key of the application
 % secretKey string, The Secret Key of the application
 % accessToken string, default is '', get the Access Token by API
 % Key and Secret Key.
 % apiURL string, such as:
 % 'https://aip.baidubce.com/rest/2.0/ocr/v1/accurate'
 % 'https://aip.baidubce.com/rest/2.0/ocr/v1/accurate_basic'
 % 'https://aip.baidubce.com/rest/2.0/ocr/v1/general_basic'
 % outType, 'MultiLine|SingleLine'
 % OUTPUTS:
 % result []|struct
 % USAGE:
 % >>result = getWordsByBaiduOCR(fileName, apiKey, secretKey, accessToken, apiURL)
 % Date: Mar 18, 2021.
 % Author: 清贫王子
 % 
 options = weboptions('RequestMethod', 'post');
 % outType may be omitted (as in the USAGE line above) or passed as empty
 if nargin < 6 || isempty(outType)
 outType = 'MultiLine';
 end 
 if isempty(accessToken)
 url = ['https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials&client_id=', apiKey, '&client_secret=', secretKey];
 res = webread(url, options);
 access_token = res.access_token;
 else
 access_token = accessToken;
 end % end if 
 url = [apiURL, '?access_token=', access_token];
 options.HeaderFields = { 'Content-Type', 'application/x-www-form-urlencoded'};
 imgBase64String = img2base64(fileName);
 if isempty(imgBase64String)
 result = '';
 return
 end % end if
 res = webwrite(url, 'image', imgBase64String, options);
 wordsRsult = res.words_result; 
 data.ocrResultChar = ''; 
 if strcmp(outType, 'SingleLine')
 for ii = 1 : size(wordsRsult, 1)
 data.ocrResultChar = [data.ocrResultChar, wordsRsult(ii,1).words];
 end % end for 
 elseif strcmp(outType, 'MultiLine')
 for ii = 1 : size(wordsRsult, 1)
 data.ocrResultChar{ii} = wordsRsult(ii,1).words;
 end % end for
 end
 result = data.ocrResultChar;
 end % end function

Let's test this function quickly on the image below (a screenshot from https://ww2.mathworks.cn/products/matlab/app-designer.html) and recognize the text in it.

image

 >> result
 result =
 1×7 cell array
 Columns 1 through 4
 {'App设计工具帮助您…'} {'开发专业背景。您只…'} {'面(GUI)设计布局,…'} {'编程。'}
 Columns 5 through 7
 {'要共享App,您可以使…'} {' MATLAB Compile…'} {'桌面App或 Web App'}
 >> result{1}
 ans =
 'App设计工具帮助您创建专业的App,同时并不要求软件'

The result contains 7 cells, one for each of the 7 lines of text recognized in the image; that is, each cell holds one recognized line, as result{1} shows.
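
The mapping from the API response to this cell array can be tried offline. The sketch below decodes a hand-written JSON string with jsondecode instead of calling the service; the sample text and the words_result_num field are illustrative assumptions, and only words_result with its words entries is taken from the interface described earlier.

```matlab
% Hand-written sample response (illustrative, not real API output)
sampleJson = ['{"words_result_num":2,"words_result":[', ...
              '{"words":"first line"},{"words":"second line"}]}'];
res = jsondecode(sampleJson);

% 'MultiLine' style: one cell per recognized line
multiLine = {res.words_result.words};

% 'SingleLine' style: all lines concatenated into one char vector
singleLine = strjoin(multiLine, '');

disp(multiLine);
disp(singleLine);
```

Collecting the field with {res.words_result.words} does the same job as the for loops in getWordsByBaiduOCR, just in one expression.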

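III. Building the Tool

To build the app programmatically on top of uifigure, we recommend an object-oriented (OOP) approach. For simplicity, we wrap everything in a single class here. The more standard practice would be to separate the interface from the logic with a design pattern such as MVC, which honors the open-closed principle: open for extension, closed for modification.

1. Requirements

Our requirements are very simple, with two main features:

Recognize text in an existing image
Recognize text in scanned PDF documents, video tutorials, and similar sources

For the first feature, we only need to load the image, call the recognition function, and show the result in the text area. For the second, we first take a screenshot, select the region containing the text, and save it as an image; the rest is the same as for the first feature.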
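
The "select a region and save it as an image" step can be sketched without any interaction. This is not the cropImg helper that the callbacks call later (its source is not listed in this article); it is a minimal sketch under stated assumptions: a blank array stands in for the output of screenSnipping, the rectangle [x y width height] is a hypothetical selection, and the file name is made up.

```matlab
% Stand-in for the full-screen capture returned by screenSnipping()
imgData = zeros(600, 800, 3, 'uint8');

% Hypothetical selection rectangle in pixels: [x y width height]
rect = [101, 51, 200, 100];

% Convert the rectangle to row/column indices and clip to the image bounds
rowIdx = rect(2) : rect(2) + rect(4) - 1;
colIdx = rect(1) : rect(1) + rect(3) - 1;
rowIdx = rowIdx(rowIdx >= 1 & rowIdx <= size(imgData, 1));
colIdx = colIdx(colIdx >= 1 & colIdx <= size(imgData, 2));
cropped = imgData(rowIdx, colIdx, :);

% Save the region as a PNG, read it back, then remove the demo file
outFileName = fullfile(tempdir, 'temp_crop_demo.png');
imwrite(cropped, outFileName);
reloaded = imread(outFileName);
delete(outFileName);
```

In the real tool the rectangle would come from the user dragging over the screenshot, and the saved file would be passed on to the recognition function.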

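From this description, the controls we need are: a load-image button, a screenshot button, an image display, and a text area for the recognition result. We also need a clean button to clear the displayed image and result, and a setup button for configuring the API Key and Secret Key.

To make the discussion easier, here is the final design first:

image	image
Main window of the text recognition tool	Setup view

The setup view needs two labels, two text fields, and two buttons. With that, all the required controls are clear, so let's create them!

2. Implementation Details

We wrap the functionality in one class, which we name ReadWords. It must inherit from matlab.apps.AppBase, and its properties are all the controls in the interface, so the class looks like this: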

classdef ReadWords < matlab.apps.AppBase
%%
properties
UIFig matlab.ui.Figure

ContainerForMain matlab.ui.container.GridLayout
ThisTB matlab.ui.container.Toolbar
SnippingToolBtn matlab.ui.container.toolbar.PushTool
ImgLoadToolBtn matlab.ui.container.toolbar.PushTool
SetupToolBtn matlab.ui.container.toolbar.PushTool
CleanToolBtn matlab.ui.container.toolbar.PushTool
ImgShow matlab.ui.control.Image
WordsShowTA matlab.ui.control.TextArea

ContainerForSetup matlab.ui.container.GridLayout
APIKeyText matlab.ui.control.EditField
SecrectKeyText matlab.ui.control.EditField
ResetBtn matlab.ui.control.Button
SaveBtn matlab.ui.control.Button
end % end properties

%%
properties(Hidden, Dependent)
APIKeyVal
SecrectKeyVal
end % end properties

%%
properties(Access = protected)
HasSetup = false
end % end properties

end % end classdef

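Some of the important properties are explained below.

Public properties:

UIFig must be of class matlab.ui.Figure and is created with uifigure; it is the main window of the tool
ContainerForMain must be of class matlab.ui.container.GridLayout and is created with uigridlayout; it is the layout container of the main window
ThisTB must be of class matlab.ui.container.Toolbar and is created with uitoolbar; it is the toolbar container that holds the four tool buttons SnippingToolBtn, ImgLoadToolBtn, SetupToolBtn, and CleanToolBtn
ImgShow must be of class matlab.ui.control.Image and is created with uiimage; it displays the loaded or captured image
WordsShowTA must be of class matlab.ui.control.TextArea and is created with uitextarea; it displays the recognition result
ContainerForSetup is the grid container of the setup view
APIKeyText and SecrectKeyText are used to enter the API Key and Secret Key
ResetBtn and SaveBtn reset and save the API Key and Secret Key, respectively

Dependent, hidden properties:

APIKeyVal returns the API Key value entered in APIKeyText
SecrectKeyVal returns the Secret Key value entered in SecrectKeyText

Protected property:

HasSetup is meant to flag whether the API Key and Secret Key have been configured, and defaults to false; in the callbacks listed later, clickedSetupToolBtn flips it each time the setup view is shown or hidden

With all the properties in place, we move on to the constructor, the destructor, and the class methods.

After adding the constructor, the destructor, and the get methods for the dependent properties APIKeyVal and SecrectKeyVal, the code looks like this: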

 classdef ReadWords < matlab.apps.AppBase
 %%
 properties
 UIFig matlab.ui.Figure 
 ContainerForMain matlab.ui.container.GridLayout
 ThisTB matlab.ui.container.Toolbar
 SnippingToolBtn matlab.ui.container.toolbar.PushTool
 ImgLoadToolBtn matlab.ui.container.toolbar.PushTool
 SetupToolBtn matlab.ui.container.toolbar.PushTool
 CleanToolBtn matlab.ui.container.toolbar.PushTool
 ImgShow matlab.ui.control.Image
 WordsShowTA matlab.ui.control.TextArea 
 ContainerForSetup matlab.ui.container.GridLayout
 APIKeyText matlab.ui.control.EditField
 SecrectKeyText matlab.ui.control.EditField
 ResetBtn matlab.ui.control.Button
 SaveBtn matlab.ui.control.Button
 end % end properties 
 %%
 properties(Hidden, Dependent)
 APIKeyVal
 SecrectKeyVal
 end % end properties 
 %%
 properties(Access = protected)
 HasSetup = false
 end % end properties 
 %%
 methods
 % --------------------------------------
 % % Constructor
 % --------------------------------------
 function app = ReadWords
 % Create UIFigure and components
 app.buildApp();
 % Register the app with App Designer
 registerApp(app, app.UIFig) 
 if nargout == 0
 clear app
 end
 end % end Constructor 
 % --------------------------------------
 % % Destructor
 % --------------------------------------
 % Code that executes before app deletion
 function delete(app)
 % Delete UIFigure when app is deleted
 delete(app.UIFig)
 end % end Destructor 
 % --------------------------------------
 % % Get/Set methods
 % --------------------------------------
 % get.APIKeyVal
 function apiKeyVal = get.APIKeyVal(app)
 apiKeyVal = app.APIKeyText.Value;
 end 
 % get.SecrectKeyVal
 function secrectKeyVal = get.SecrectKeyVal(app)
 secrectKeyVal = app.SecrectKeyText.Value;
 end
 end % end methods
 end % end classdef

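The destructor is boilerplate, and so is registerApp(app, app.UIFig) in the constructor. The remaining method, buildApp(), creates the interface and sets up each control.

All subsequent methods are created as private methods. With buildApp() added, the whole ReadWords class looks like this: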

 classdef ReadWords < matlab.apps.AppBase
 %%
 properties
 UIFig matlab.ui.Figure 
 ContainerForMain matlab.ui.container.GridLayout
 ThisTB matlab.ui.container.Toolbar
 SnippingToolBtn matlab.ui.container.toolbar.PushTool
 ImgLoadToolBtn matlab.ui.container.toolbar.PushTool
 SetupToolBtn matlab.ui.container.toolbar.PushTool
 CleanToolBtn matlab.ui.container.toolbar.PushTool
 ImgShow matlab.ui.control.Image
 WordsShowTA matlab.ui.control.TextArea 
 ContainerForSetup matlab.ui.container.GridLayout
 APIKeyText matlab.ui.control.EditField
 SecrectKeyText matlab.ui.control.EditField
 ResetBtn matlab.ui.control.Button
 SaveBtn matlab.ui.control.Button
 end % end properties 
 %%
 properties(Hidden, Dependent)
 APIKeyVal
 SecrectKeyVal
 end % end properties 
 %%
 properties(Access = protected)
 HasSetup = false
 end % end properties 
 %%
 methods
 % --------------------------------------
 % % Constructor
 % --------------------------------------
 function app = ReadWords
 % Create UIFigure and components
 app.buildApp();
 % Register the app with App Designer
 registerApp(app, app.UIFig) 
 if nargout == 0
 clear app
 end
 end % end Constructor 
 % --------------------------------------
 % % Destructor
 % --------------------------------------
 % Code that executes before app deletion
 function delete(app)
 % Delete UIFigure when app is deleted
 delete(app.UIFig)
 end % end Destructor 
 % --------------------------------------
 % % Get/Set methods
 % --------------------------------------
 % get.APIKeyVal
 function apiKeyVal = get.APIKeyVal(app)
 apiKeyVal = app.APIKeyText.Value;
 end 
 % get.SecrectKeyVal
 function secrectKeyVal = get.SecrectKeyVal(app)
 secrectKeyVal = app.SecrectKeyText.Value;
 end
 end % end methods 
 %%
 methods(Access = private)
 % buildApp
 function buildApp(app)
 %
 % --------------------------------------
 % % Main Figure
 % --------------------------------------
 app.UIFig = uifigure();
 app.UIFig.Icon = 'icons/img2text.png';
 app.UIFig.Name = 'ReadWords';
 app.UIFig.Visible = 'off';
 app.UIFig.Position = [app.UIFig.Position(1), app.UIFig.Position(2), 745, 420];
 app.UIFig.AutoResizeChildren = 'on';
 app.UIFig.Units = 'Normalized';
 app.setAutoResize(app.UIFig, true); 
 % --------------------------------------
 % % Toolbar
 % --------------------------------------
 app.ThisTB = uitoolbar(app.UIFig);
 % SetupToolBtn
 app.SetupToolBtn = uipushtool(app.ThisTB);
 app.SetupToolBtn.Icon = 'icons/setup.png';
 app.SetupToolBtn.Tooltip = 'Setup'; 
 % SnippingToolBtn
 app.SnippingToolBtn = uipushtool(app.ThisTB);
 app.SnippingToolBtn.Icon = 'icons/snip.png';
 app.SnippingToolBtn.Tooltip = 'Screenshot'; 
 % ImgLoadToolBtn
 app.ImgLoadToolBtn = uipushtool(app.ThisTB);
 app.ImgLoadToolBtn.Icon = 'icons/load.png';
 app.ImgLoadToolBtn.Tooltip = 'Load image'; 
 % CleanToolBtn
 app.CleanToolBtn = uipushtool(app.ThisTB);
 app.CleanToolBtn.Icon = 'icons/clean.png';
 app.CleanToolBtn.Tooltip = 'Clean'; 
 % --------------------------------------
 % % ContainerForMain
 % --------------------------------------
 app.ContainerForMain = uigridlayout(app.UIFig, [1, 2]); 
 % ContainerForMain
 imgShowPanel = uipanel(app.ContainerForMain, 'Title', 'Original');
 resultShowPanel = uipanel(app.ContainerForMain, 'Title', 'Result');
 % ImgShow
 imgShowPanelLay = uigridlayout(imgShowPanel, [1, 1]);
 imgShowPanelLay.RowSpacing = 0;
 imgShowPanelLay.ColumnSpacing = 0;
 app.ImgShow = uiimage(imgShowPanelLay);
 % WordsShowTA
 resultShowPanelLay = uigridlayout(resultShowPanel, [1, 1]);
 resultShowPanelLay.RowSpacing = 0;
 resultShowPanelLay.ColumnSpacing = 0;
 app.WordsShowTA = uitextarea(resultShowPanelLay);
 app.WordsShowTA.FontSize = 22; 
 % --------------------------------------
 % % ContainerForSetup
 % --------------------------------------
 app.ContainerForSetup = uigridlayout(app.UIFig, [4, 3]);
 app.ContainerForSetup.RowHeight = {22, 22, 22, '1x'};
 app.ContainerForSetup.ColumnWidth = {'1x', '1x', '2.5x'};
 app.ContainerForSetup.Visible = 'off';
 apiKeyLabel = uilabel(app.ContainerForSetup, 'Text', 'API Key');
 apiKeyLabel.HorizontalAlignment = 'right';
 apiKeyLabel.Layout.Row = 1;
 apiKeyLabel.Layout.Column = 1;
 % APIKeyText
 app.APIKeyText = uieditfield(app.ContainerForSetup);
 app.APIKeyText.Layout.Row = 1;
 app.APIKeyText.Layout.Column = 2;
 secrectKeyLabel = uilabel(app.ContainerForSetup, 'Text', 'Secrect Key');
 secrectKeyLabel.HorizontalAlignment = 'right';
 secrectKeyLabel.Layout.Row = 2;
 secrectKeyLabel.Layout.Column = 1;
 % SecrectKeyText
 app.SecrectKeyText = uieditfield(app.ContainerForSetup);
 app.SecrectKeyText.Layout.Row = 2;
 app.SecrectKeyText.Layout.Column = 2;
 % ResetBtn
 app.ResetBtn = uibutton(app.ContainerForSetup, 'Text', 'Reset');
 app.ResetBtn.Layout.Row = 3;
 app.ResetBtn.Layout.Column = 1;
 % SaveBtn
 app.SaveBtn = uibutton(app.ContainerForSetup, 'Text', 'Save');
 app.SaveBtn.Layout.Row = 3;
 app.SaveBtn.Layout.Column = 2;
 % Set visibility for UIFig
 movegui(app.UIFig, 'center');
 app.UIFig.Visible = 'on'; 
 % --------------------------------------
 % % RunstartupFcn
 % --------------------------------------
 app.runStartupFcn(@startupFcn); 
 end % end buildApp
 end % methods
 end % end classdef

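Note that the toolbar and window icons come from https://www.easyicon.cc/, where many common icons can be downloaded for free. We have already downloaded the icons used here; use the link below to get them:

Link: https://pan.baidu.com/s/11kIvt4SX-MhQ2ltEeC18ZA  Extraction code: 5i3k

In addition, the statement app.runStartupFcn(@startupFcn); calls a method of the parent class matlab.apps.AppBase, and we register the callbacks of all controls inside startupFcn. For now, comment this statement out (startupFcn has not been written yet) and run ReadWords.m directly to display the interface just built in buildApp, as the animation below shows:

image

As you can see, clicking the toolbar buttons does nothing, because no callbacks have been registered for the controls yet. We register them in startupFcn, as follows: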

 classdef ReadWords < matlab.apps.AppBase
 %%
 properties
 UIFig matlab.ui.Figure 
 ContainerForMain matlab.ui.container.GridLayout
 ThisTB matlab.ui.container.Toolbar
 SnippingToolBtn matlab.ui.container.toolbar.PushTool
 ImgLoadToolBtn matlab.ui.container.toolbar.PushTool
 SetupToolBtn matlab.ui.container.toolbar.PushTool
 CleanToolBtn matlab.ui.container.toolbar.PushTool
 ImgShow matlab.ui.control.Image
 WordsShowTA matlab.ui.control.TextArea 
 ContainerForSetup matlab.ui.container.GridLayout
 APIKeyText matlab.ui.control.EditField
 SecrectKeyText matlab.ui.control.EditField
 ResetBtn matlab.ui.control.Button
 SaveBtn matlab.ui.control.Button
 end % end properties 
 %%
 properties(Hidden, Dependent)
 APIKeyVal
 SecrectKeyVal
 end % end properties 
 %%
 properties(Access = protected)
 HasSetup = false
 end % end properties 
 %%
 methods
 % --------------------------------------
 % % Constructor
 % --------------------------------------
 function app = ReadWords
 % Create UIFigure and components
 app.buildApp();
 % Register the app with App Designer
 registerApp(app, app.UIFig) 
 if nargout == 0
 clear app
 end
 end % end Constructor 
 % --------------------------------------
 % % Destructor
 % --------------------------------------
 % Code that executes before app deletion
 function delete(app)
 % Delete UIFigure when app is deleted
 delete(app.UIFig)
 end % end Destructor 
 % --------------------------------------
 % % Get/Set methods
 % --------------------------------------
 % get.APIKeyVal
 function apiKeyVal = get.APIKeyVal(app)
 apiKeyVal = app.APIKeyText.Value;
 end 
 % get.SecrectKeyVal
 function secrectKeyVal = get.SecrectKeyVal(app)
 secrectKeyVal = app.SecrectKeyText.Value;
 end
 end % end methods 
 %%
 methods(Access = private)
 % buildApp
 function buildApp(app)
 %
 % --------------------------------------
 % % Main Figure
 % --------------------------------------
 app.UIFig = uifigure();
 app.UIFig.Icon = 'icons/img2text.png';
 app.UIFig.Name = 'ReadWords';
 app.UIFig.Visible = 'off';
 app.UIFig.Position = [app.UIFig.Position(1), app.UIFig.Position(2), 745, 420];
 app.UIFig.AutoResizeChildren = 'on';
 app.UIFig.Units = 'Normalized';
 app.setAutoResize(app.UIFig, true); 
 % --------------------------------------
 % % Toolbar
 % --------------------------------------
 app.ThisTB = uitoolbar(app.UIFig);
 % SetupToolBtn
 app.SetupToolBtn = uipushtool(app.ThisTB);
 app.SetupToolBtn.Icon = 'icons/setup.png';
 app.SetupToolBtn.Tooltip = 'Setup'; 
 % SnippingToolBtn
 app.SnippingToolBtn = uipushtool(app.ThisTB);
 app.SnippingToolBtn.Icon = 'icons/snip.png';
 app.SnippingToolBtn.Tooltip = 'Screenshot'; 
 % ImgLoadToolBtn
 app.ImgLoadToolBtn = uipushtool(app.ThisTB);
 app.ImgLoadToolBtn.Icon = 'icons/load.png';
 app.ImgLoadToolBtn.Tooltip = 'Load image'; 
 % CleanToolBtn
 app.CleanToolBtn = uipushtool(app.ThisTB);
 app.CleanToolBtn.Icon = 'icons/clean.png';
 app.CleanToolBtn.Tooltip = 'Clean'; 
 % --------------------------------------
 % % ContainerForMain
 % --------------------------------------
 app.ContainerForMain = uigridlayout(app.UIFig, [1, 2]); 
 % ContainerForMain
 imgShowPanel = uipanel(app.ContainerForMain, 'Title', 'Original');
 resultShowPanel = uipanel(app.ContainerForMain, 'Title', 'Result');
 % ImgShow
 imgShowPanelLay = uigridlayout(imgShowPanel, [1, 1]);
 imgShowPanelLay.RowSpacing = 0;
 imgShowPanelLay.ColumnSpacing = 0;
 app.ImgShow = uiimage(imgShowPanelLay);
 % WordsShowTA
 resultShowPanelLay = uigridlayout(resultShowPanel, [1, 1]);
 resultShowPanelLay.RowSpacing = 0;
 resultShowPanelLay.ColumnSpacing = 0;
 app.WordsShowTA = uitextarea(resultShowPanelLay);
 app.WordsShowTA.FontSize = 22; 
 % --------------------------------------
 % % ContainerForSetup
 % --------------------------------------
 app.ContainerForSetup = uigridlayout(app.UIFig, [4, 3]);
 app.ContainerForSetup.RowHeight = {22, 22, 22, '1x'};
 app.ContainerForSetup.ColumnWidth = {'1x', '1x', '2.5x'};
 app.ContainerForSetup.Visible = 'off';
 apiKeyLabel = uilabel(app.ContainerForSetup, 'Text', 'API Key');
 apiKeyLabel.HorizontalAlignment = 'right';
 apiKeyLabel.Layout.Row = 1;
 apiKeyLabel.Layout.Column = 1;
 % APIKeyText
 app.APIKeyText = uieditfield(app.ContainerForSetup);
 app.APIKeyText.Layout.Row = 1;
 app.APIKeyText.Layout.Column = 2;
 secrectKeyLabel = uilabel(app.ContainerForSetup, 'Text', 'Secrect Key');
 secrectKeyLabel.HorizontalAlignment = 'right';
 secrectKeyLabel.Layout.Row = 2;
 secrectKeyLabel.Layout.Column = 1;
 % SecrectKeyText
 app.SecrectKeyText = uieditfield(app.ContainerForSetup);
 app.SecrectKeyText.Layout.Row = 2;
 app.SecrectKeyText.Layout.Column = 2;
 % ResetBtn
 app.ResetBtn = uibutton(app.ContainerForSetup, 'Text', 'Reset');
 app.ResetBtn.Layout.Row = 3;
 app.ResetBtn.Layout.Column = 1;
 % SaveBtn
 app.SaveBtn = uibutton(app.ContainerForSetup, 'Text', 'Save');
 app.SaveBtn.Layout.Row = 3;
 app.SaveBtn.Layout.Column = 2;
 % Set visibility for UIFig
 movegui(app.UIFig, 'center');
 app.UIFig.Visible = 'on'; 
 % --------------------------------------
 % % RunstartupFcn
 % --------------------------------------
 app.runStartupFcn(@startupFcn);
 end % end buildApp 
 % startupFcn
 function startupFcn(app, ~, ~)
 % Setup APIKeyText and SecrectKeyText
 if exist('apikey.mat', 'file')
 temp = load('apikey.mat');
 app.APIKeyText.Value = temp.key.apiKeyVal;
 app.APIKeyText.Editable = 'off';
 app.SecrectKeyText.Value = temp.key.secrectKeyVal;
 app.SecrectKeyText.Editable = 'off';
 end 
 % Register callback
 app.SnippingToolBtn.ClickedCallback = @app.clickedSnippingToolBtn;
 app.ImgLoadToolBtn.ClickedCallback = @app.clickedImgLoadToolBtn;
 app.SetupToolBtn.ClickedCallback = @app.clickedSetupToolBtn;
 app.CleanToolBtn.ClickedCallback = @app.clickedCleanToolBtn; 
 app.ResetBtn.ButtonPushedFcn = @app.callbackResetBtn;
 app.SaveBtn.ButtonPushedFcn = @app.callbackSaveBtn;
 end % end function
 end % methods
 end % end classdef

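We have now registered six callbacks for six buttons, and all of them must be implemented; otherwise a button does not respond when it is triggered. For simplicity, we use callbackSaveBtn, the callback of SaveBtn in the setup view, as the example.

Before the API Key or Secret Key is set, triggering SnippingToolBtn or ImgLoadToolBtn shows a prompt asking you to set them up first:

image

The logic of callbackSaveBtn: it first checks whether both the API Key and the Secret Key fields are filled in. If either is empty, it warns that the API Key or Secret Key is empty, and you need to enter both values and click Save again. Once both are present, the values are stored in a .mat file (apikey.mat) and both fields are locked, so that startupFcn can load them on the next launch and you do not have to enter them again. To change the values, click Reset and configure them again. Note that in the code below this callback does not consult the HasSetup property; HasSetup is only toggled by clickedSetupToolBtn to switch between the main view and the setup view.

The code is as follows: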

 % --------------------------------------
 % % Callback functions
 % --------------------------------------
 % callbackSaveBtn
 function callbackSaveBtn(app, ~, ~)
 if ~isempty(app.SecrectKeyText.Value) && ~isempty(app.APIKeyText.Value)
 key.apiKeyVal = app.APIKeyText.Value;
 key.secrectKeyVal = app.SecrectKeyText.Value;
 if exist('apikey.mat', 'file')
 delete('apikey.mat');
 end
 save('apikey.mat', 'key');
 !attrib +s +h apikey.mat
 uialert(app.UIFig, 'Save successfully!', 'Confirm', 'Icon', 'success');
 app.APIKeyText.Editable = 'off';
 app.SecrectKeyText.Editable = 'off';
 else
 uialert(app.UIFig, 'API Key or Secrect Key is empty!', 'Confirm', 'Icon', 'warning');
 end % end if
 end % callbackSaveBtn
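
The save-then-reload cycle that links callbackSaveBtn to startupFcn can be tried on its own. The sketch below is a standalone illustration under stated assumptions: the key values are dummies, the file is written to tempdir under a made-up name instead of apikey.mat in the working folder, and the Windows-only attrib step is left out.

```matlab
% Dummy credentials (placeholders, not real keys)
key.apiKeyVal = 'demo-api-key';
key.secrectKeyVal = 'demo-secret-key';

% Write to a throwaway file, replacing any earlier copy as the callback does
matFile = fullfile(tempdir, 'apikey_demo.mat');
if exist(matFile, 'file')
    delete(matFile);
end
save(matFile, 'key');

% What startupFcn does on the next launch: load the struct and read the fields
temp = load(matFile);
keysPresent = ~isempty(temp.key.apiKeyVal) && ~isempty(temp.key.secrectKeyVal);

% Clean up the demo file
delete(matFile);
fprintf('Keys restored: %d\n', keysPresent);
```

Keep in mind that a .mat file stores the keys in readable form; hiding the file, as the callback does on Windows, only keeps it out of sight.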

With the Save button implemented, we get the behavior shown in the animation below.

image

Source code of the remaining callbacks:

 % clickedSnippingToolBtn
 function clickedSnippingToolBtn(app, ~, ~)
 if ~isempty(app.SecrectKeyText.Value) && ~isempty(app.APIKeyText.Value)
 app.UIFig.Visible = 'off';
 pause(0.1);
 outFileName = 'temp.png';
 cropImg(outFileName);
 !attrib +s +h temp.png
 %
 app.ImgShow.ImageSource = imread(outFileName);
 app.UIFig.Visible = 'on';
 %
 apiURL = 'https://aip.baidubce.com/rest/2.0/ocr/v1/accurate_basic';
 words = getWordsByBaiduOCR(outFileName, app.APIKeyVal, app.SecrectKeyVal, '', apiURL, 'MultiLine');
 app.WordsShowTA.Value = words;
 else
 msg = {'API Key or Secrect Key is empty!'; 'Please set it up first!'};
 uialert(app.UIFig, msg, 'Confirm', 'Icon', 'warning');
 end
 end % end clickedSnippingToolBtn 
 % clickedImgLoadToolBtn
 function clickedImgLoadToolBtn(app, ~, ~)
 if ~isempty(app.SecrectKeyText.Value) && ~isempty(app.APIKeyText.Value)
 [fName, fPath] = uigetfile({'*.png'; '*.jpg'; '*.bmp'; '*.tif'}, 'Open image');
 if ~isequal(any([fName, fPath]), 0)
 img = imread(strcat(fPath, fName));
 outFileName = 'temp.png';
 if exist(outFileName, 'file')
 delete(outFileName)
 end
 imwrite(img, outFileName);
 !attrib +s +h temp.png
 %
 app.ImgShow.ImageSource = imread(outFileName);
 app.UIFig.Visible = 'on';
 %
 apiURL = 'https://aip.baidubce.com/rest/2.0/ocr/v1/accurate_basic';
 words = getWordsByBaiduOCR(outFileName, app.APIKeyVal, app.SecrectKeyVal, '', apiURL, 'MultiLine');
 app.WordsShowTA.Value = words;
 else
 return
 end % end if
 else % end if
 msg = {'API Key or Secrect Key is empty!'; 'Please set it up first!'};
 uialert(app.UIFig, msg, 'Confirm', 'Icon', 'warning');
 end
 end % end clickedImgLoadToolBtn 
 % clickedSetupToolBtn
 function clickedSetupToolBtn(app, ~, ~)
 if ~app.HasSetup
 app.ContainerForMain.Visible = 'off';
 app.ContainerForSetup.Visible = 'on';
 app.HasSetup = true;
 else
 app.ContainerForMain.Visible = 'on';
 app.ContainerForSetup.Visible = 'off';
 app.HasSetup = false;
 end
 end % end clickedSetupToolBtn 
 % clickedCleanToolBtn
 function clickedCleanToolBtn(app, ~, ~)
 app.WordsShowTA.Value = '';
 app.ImgShow.ImageSource = '';
 end % end clickedCleanToolBtn 
 % callbackResetBtn
 function callbackResetBtn(app, ~, ~)
 app.APIKeyText.Value = '';
 app.APIKeyText.Editable = 'on';
 app.SecrectKeyText.Value = '';
 app.SecrectKeyText.Editable = 'on';
 end % callbackResetBtn

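IV. Demo

Now let's test the text recognition tool we have built. Say a graduate student, whom we will call Mazi, is reading a scanned PDF paper and wants to copy a passage for notes or slides. This is where our tool comes in:

image

In an instant, Mazi has the result and breaks into a long-lost happy smile!

image

V. Conclusion

That completes a fairly full-featured text recognition tool! We hope you like it and find something useful in it.

To download the complete code for this article, reply “文字识别工具” in the GZH (WeChat official account).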
