Notes on Rare Chinese Character Recognition by Radical Extraction Network

Disclaimer: only parts of the paper are translated here; apologies if this makes for a rough read.

Abstract:

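The method first extracts and recognizes basic graphical components.
The paper proposes a new Radical Extraction Network (REN), which uses a CNN to extract and recognize radicals.
It first learns to recognize different radicals from common Chinese characters, then transfers the learned deep appearance models to rare Chinese characters.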

1 Introduction

Optical Character Recognition (OCR)
Chinese OCR is difficult because there are a great many Chinese characters and many of them look very similar to one another.

Chinese characters are formed by a combination of radicals.

REN takes the feature maps as input.

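Traditional methods often need aligned radical-level training images in order to recognize different radicals. In contrast, we learn to localize radicals in a weakly supervised fashion: only character-level images are used during training.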

Weakly supervised object detection (WSD)

REN has three data streams: (1) a radical-level classification stream that classifies different radicals, (2) a radical-level detection stream that selects positive candidate bounding boxes that tightly contain a particular radical, and (3) a character-level classification stream that classifies different Chinese characters based on the radical-level recognition results.

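The whole pipeline is trained end to end. Training needs only character-level images: REN is trained to extract and detect different radicals automatically from character-level annotations.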

REN recognizes radicals with fairly high accuracy and improves the recognition accuracy of Chinese characters.

2 Method

Architecture of Radical Extraction Network

WSDDN is a state-of-the-art weakly supervised object detection method. REN has one more stream than WSDDN, which performs classification at the character level.

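ROI pooling layer:
Input: the convolutional feature map and the region set.
Output: the pooled representation matrix, with one row per bounding box; the length of each row is the dimension of the pooled representation of each bounding box.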

a Radical-level classification data stream

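The pooled representation matrix is processed by several fully connected layers, and each region is mapped separately to a vector with one entry per radical. These layers output a score matrix with one row per region, to which a row-wise softmax operator is then applied. The result of this row-wise softmax is the final output of this stream.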
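The row-wise softmax in this stream can be illustrated with a small standard-library sketch. The function name `row_softmax` and the toy score values are assumptions made for illustration; they do not come from the paper.

```python
import math

def row_softmax(scores):
    """Apply softmax independently to each row (one row per region)."""
    out = []
    for row in scores:
        m = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp(v - m) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

# Hypothetical scores: 2 regions (rows) x 3 radicals (columns).
cls_scores = [[2.0, 1.0, 0.1],
              [0.5, 0.5, 0.5]]
cls_probs = row_softmax(cls_scores)
```

After this step every region carries a probability distribution over radicals, so each row of `cls_probs` sums to 1.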

b Radical-level detection data stream

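The aim of this data stream is to select the best bounding box for every radical.
This stream also starts from the pooled representation matrix. Several fully connected layers map each region to a vector, producing a score matrix to which a column-wise softmax operator is then applied. These layers do not share weights with the layers of the first data stream (presumably the classification stream). The result of this column-wise softmax is the final output of this stream.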
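As a rough illustration of the column-wise softmax, the sketch below normalizes each radical's scores across regions, so that regions compete for every radical. The helper name `column_softmax` and the toy scores are assumptions, not the paper's code.

```python
import math

def column_softmax(scores):
    """Apply softmax down each column: regions compete for each radical."""
    n_rows, n_cols = len(scores), len(scores[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        col = [scores[r][j] for r in range(n_rows)]
        m = max(col)  # subtract the column maximum for numerical stability
        exps = [math.exp(v - m) for v in col]
        total = sum(exps)
        for r in range(n_rows):
            out[r][j] = exps[r] / total
    return out

# Hypothetical scores: 3 regions (rows) x 2 radicals (columns).
det_scores = [[3.0, 0.0],
              [1.0, 0.0],
              [0.0, 0.0]]
det_probs = column_softmax(det_scores)

# Index of the highest-scoring region for the first radical.
best_region_for_radical_0 = max(range(len(det_probs)), key=lambda r: det_probs[r][0])
```

Each column of `det_probs` sums to 1, which is what lets the stream rank bounding boxes for a given radical.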

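The radical score is obtained by combining the outputs of the classification stream and the detection stream with the element-wise product operator $\odot$. Since every element of the radical score takes a value in (0, 1), its $j$-th element is regarded as the confidence that the character contains the $j$-th radical.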
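A minimal sketch of this combination step follows. It assumes, as in WSDDN, that the element-wise products are summed over regions to give one score per radical; the function name `radical_scores` and the toy inputs are hypothetical.

```python
def radical_scores(cls_probs, det_probs):
    """Element-wise product of the two stream outputs, summed over regions.

    Both inputs are (regions x radicals) matrices. Summing over regions is
    an assumption borrowed from WSDDN, not a detail confirmed by these notes.
    """
    n_regions = len(cls_probs)
    n_radicals = len(cls_probs[0])
    return [
        sum(cls_probs[r][j] * det_probs[r][j] for r in range(n_regions))
        for j in range(n_radicals)
    ]

# Toy inputs: 2 regions x 2 radicals.
cls_probs = [[0.9, 0.1],   # each row sums to 1 (row-wise softmax)
             [0.2, 0.8]]
det_probs = [[0.7, 0.4],   # each column sums to 1 (column-wise softmax)
             [0.3, 0.6]]
scores = radical_scores(cls_probs, det_probs)
```

Because each column of `det_probs` sums to 1 and every entry of `cls_probs` is below 1, each resulting score stays strictly between 0 and 1, which is what allows it to be read as a confidence.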

c Character-level classification data stream

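The aim of this stream is to obtain the final character-level classification score. A Chinese character is classified based on two sources of information: (1) the character image itself, and (2) the radicals recognized in the image. The image itself provides the necessary global context, while the recognized radicals capture the internal structure of the character. This stream fuses the two.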

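This stream starts from the convolutional feature map and maps it to a global context vector through several fully connected layers. A linear map is then applied on top, followed by a softmax operator. The output is the final character-level classification score, and the weights of the linear map are learned during training.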
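The exact fusion formula did not survive in these notes, so the following is only one plausible reading: two learned linear maps, one applied to the global context vector and one to the radical score vector, summed and passed through a softmax. All names (`W_global`, `W_radical`, `character_scores`) and all numbers are hypothetical.

```python
import math

def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def character_scores(W_global, W_radical, global_ctx, radical_score):
    """Assumed fusion: softmax(W_global @ global_ctx + W_radical @ radical_score)."""
    a = matvec(W_global, global_ctx)
    b = matvec(W_radical, radical_score)
    return softmax([x + y for x, y in zip(a, b)])

# Toy setup: 2 characters, a 3-dimensional global context, 2 radicals.
W_global = [[0.1, 0.0, -0.1],
            [0.0, 0.2, 0.1]]
W_radical = [[2.0, 0.0],
             [0.0, 2.0]]
probs = character_scores(W_global, W_radical, [1.0, 1.0, 1.0], [0.9, 0.1])
```

In this toy setup the first character depends mostly on the first radical, so a high confidence for that radical pushes the prediction toward it.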

Training REN

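Training data: $N$ character images $x_i$, $i = 1, \dots, N$.
Character-level labels: one character class label per image.
We use Edge Boxes to extract about $B$ bounding boxes from each image $x_i$; the resulting set is denoted $\mathcal{R}_i$. Furthermore, we can construct a character-radical correspondence matrix that indicates whether a character contains a particular radical. Note that this matrix does not depend on the size of the training set, so it is easy to obtain. Based on it, we can construct a radical-level label $y_i^{rad}$ for each $x_i$ that indicates whether a particular radical appears in $x_i$.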
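Deriving radical-level labels from character-level labels amounts to a row lookup in the correspondence matrix. The sketch below uses a tiny hand-made matrix; the three characters, the radical list, and the function name `radical_labels` are illustrative assumptions rather than the paper's data.

```python
# Hypothetical radical vocabulary (column order of the correspondence matrix).
radicals = ["木", "女", "子"]

# Hypothetical character-radical correspondence matrix, one row per character:
# entry j is 1 if the character contains radicals[j], else 0.
correspondence = {
    "林": [1, 0, 0],  # 木 + 木
    "好": [0, 1, 1],  # 女 + 子
    "李": [1, 0, 1],  # 木 + 子
}

def radical_labels(char_labels, correspondence):
    """Look up the multi-hot radical label for each character-level label."""
    return [list(correspondence[c]) for c in char_labels]

# Character-level labels of three training images.
y_rad = radical_labels(["好", "林", "李"], correspondence)
```

The matrix has one row per character class, not per training image, which is why its size does not grow with the training set.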

$J_{rad}(\theta)=-\frac{1}{N} \sum_{i=1}^N \sum_{j=1}^{C_{rad}} \mathbf{1} \left\{[y_i^{rad}]_j=1\right\}\log\left[\phi^{rad}(x_i, \mathcal{R}_i; \theta)\right]_j -\frac{1}{N} \sum_{i=1}^N \sum_{j=1}^{C_{rad}} \mathbf{1} \left\{[y_i^{rad}]_j=0\right\}\log\left(1-\left[\phi^{rad}(x_i, \mathcal{R}_i; \theta)\right]_j\right)$
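The loss $J_{rad}$ is a binary cross-entropy summed over radicals and averaged over the $N$ training images. A small sketch, assuming the predicted radical scores and the multi-hot radical labels are given as nested lists; the function name and the `eps` clamp are additions for illustration and numerical safety.

```python
import math

def radical_loss(pred, labels, eps=1e-12):
    """Binary cross-entropy over radicals, averaged over images.

    pred[i][j]   : predicted confidence that image i contains radical j
    labels[i][j] : 1 if image i contains radical j, else 0
    """
    n = len(pred)
    total = 0.0
    for p_i, y_i in zip(pred, labels):
        for p, y in zip(p_i, y_i):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += math.log(p) if y == 1 else math.log(1.0 - p)
    return -total / n

labels = [[1, 0]]
loss_uncertain = radical_loss([[0.5, 0.5]], labels)
loss_confident = radical_loss([[0.9, 0.1]], labels)
```

A prediction that agrees with the labels more confidently yields a lower loss, so `loss_confident` is smaller than `loss_uncertain`.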

TBC
