Paper reading: REALISE model

REALISE model:

1. Uses multiple encoders to capture the semantic, phonetic, and graphic information of Chinese characters, so that similar characters can be told apart and spelling errors corrected.
2. A selective modality fusion module then combines these signals into context-aware multimodal representations.
3. Finally, the output layer predicts the probabilities of error corrections.


Semantic encoder:

BERT, which provides rich contextual word representations through unsupervised pretraining on large corpora.

from transformers import BertTokenizer
tokenizer = BertTokenizer.from_pretrained('bert-base-chinese')


Phonetic encoder

Pinyin: initial (21) + final (39) + tone (5)
Hierarchical phonetic encoder: a character-level encoder followed by a sentence-level encoder
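
To make the initial/final/tone decomposition concrete, here is a minimal sketch that splits a tone-numbered syllable. The function name split_pinyin, the tone-number input format, and coding the neutral tone as 5 are assumptions made for illustration; the notes do not say how the model itself tokenizes pinyin.

```python
# The 21 standard initials; two-letter ones come first so "zh" matches before "z".
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s"]


def split_pinyin(syllable):
    """Split a tone-numbered syllable such as 'zhong1' into (initial, final, tone)."""
    if syllable[-1].isdigit():
        body, tone = syllable[:-1], int(syllable[-1])
    else:
        body, tone = syllable, 5  # assumption: no digit means neutral tone, coded as 5
    for initial in INITIALS:
        if body.startswith(initial):
            return initial, body[len(initial):], tone
    return "", body, tone  # zero-initial syllable such as 'an4'


print(split_pinyin("zhong1"))
print(split_pinyin("an4"))
```

Syllables written with y or w (for example "yi1") fall into the zero-initial branch here, because y and w are not among the 21 initials.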

Character-level encoder

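The GRU (Gated Recurrent Unit) is a type of recurrent neural network (RNN). Like the LSTM (Long Short-Term Memory), it was proposed to address problems such as retaining long-term memory and gradient issues during backpropagation (for example, vanishing gradients).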
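
The gating idea can be sketched with a scalar GRU in plain Python. This is a toy illustration, not the model's encoder: the parameter names (w_r, u_r, ...) are hypothetical, the state is a single number rather than a vector, and taking the last hidden state as the summary of the sequence is an assumption.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gru_step(x, h, p):
    """One GRU update for a scalar input x and a scalar hidden state h."""
    r = sigmoid(p["w_r"] * x + p["u_r"] * h + p["b_r"])  # reset gate
    z = sigmoid(p["w_z"] * x + p["u_z"] * h + p["b_z"])  # update gate
    n = math.tanh(p["w_n"] * x + r * (p["u_n"] * h) + p["b_n"])  # candidate state
    return (1.0 - z) * n + z * h  # blend the candidate with the previous state


def encode(sequence, p):
    """Run the GRU over a sequence and return the last hidden state."""
    h = 0.0
    for x in sequence:
        h = gru_step(x, h, p)
    return h


# Hypothetical parameters, all set to 1.0 for the demo
params = {name: 1.0 for name in
          ("w_r", "u_r", "b_r", "w_z", "u_z", "b_z", "w_n", "u_n", "b_n")}
print(encode([0.1, 0.5, -0.3], params))
```

The update gate z decides how much of the old state is kept, which is what lets information persist over many steps.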


Sentence-level encoder: obtains a contextualized phonetic representation for each Chinese character

4-layer Transformer with the same hidden size as the semantic encoder
The character-level phonetic vectors are computed independently and carry no order information, so a positional embedding is added to each vector. The vectors are then packed together and passed through the Transformer layers to compute the contextualized representation in the acoustic modality.
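
The effect of adding position information can be sketched as follows. The notes do not say which kind of positional embedding is used, so this example substitutes the fixed sinusoidal encoding from the original Transformer as a stand-in; the function names are hypothetical.

```python
import math


def positional_encoding(position, dim):
    """Sinusoidal encoding: even indices use sin, odd indices use cos."""
    enc = []
    for i in range(dim):
        angle = position / (10000 ** ((i - i % 2) / dim))
        enc.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return enc


def add_positions(vectors):
    """Add the encoding of each position to the vector at that position."""
    return [
        [v + e for v, e in zip(vec, positional_encoding(pos, len(vec)))]
        for pos, vec in enumerate(vectors)
    ]


# Two identical phonetic vectors become distinguishable once positions are added
same = [[0.2, 0.2, 0.2, 0.2], [0.2, 0.2, 0.2, 0.2]]
for row in add_positions(same):
    print(row)
```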

Graphic Encoder

Three fonts correspond to the three channels of the character image, whose size is set to 32×32 pixels.

Selective Modality Fusion Module

Ht, Ha, Hv = textual, acoustic, and visual representations
Fuses information from the different modalities
Selective gate unit: controls how much information from each modality flows into the mixed multimodal representation.
Gate values: computed by a fully-connected layer followed by a sigmoid function.
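
A minimal sketch of such a gate for a single character, in plain Python. The gate input (here the concatenation of the three modality vectors), the parameter layout, and the function names are assumptions for illustration; only "fully-connected layer followed by a sigmoid" comes from the notes.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def linear(weights, bias, x):
    """Fully-connected layer: weights is a list of rows, one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def selective_fusion(h_t, h_a, h_v, params):
    """Mix textual, acoustic and visual vectors with element-wise sigmoid gates."""
    concat = h_t + h_a + h_v  # assumed gate input: the three vectors side by side
    fused = [0.0] * len(h_t)
    for name, h in (("t", h_t), ("a", h_a), ("v", h_v)):
        weights, bias = params[name]
        gate = [sigmoid(g) for g in linear(weights, bias, concat)]
        fused = [f + g * x for f, g, x in zip(fused, gate, h)]
    return fused


# Hypothetical parameters: zero weights and biases give every gate the value 0.5
dim = 2
zero_layer = ([[0.0] * (3 * dim) for _ in range(dim)], [0.0] * dim)
params = {"t": zero_layer, "a": zero_layer, "v": zero_layer}
print(selective_fusion([1.0, 2.0], [3.0, 4.0], [5.0, 6.0], params))
```

Because each gate lies between 0 and 1, a trained model can suppress one modality for a given character while letting another pass through almost unchanged.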

Acoustic and Visual Pretraining

Aims to learn the acoustic-textual and visual-textual relationships
Phonetic encoder: input method pretraining objective
Graphic encoder: OCR pretraining objective

Data and Metrics

Data: SIGHAN, converted to Simplified Chinese with the OpenCC tool

Two levels: the model is evaluated at the detection level and at the correction level
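
One common way to score the two levels is per sentence. The sketch below assumes that convention: a sentence counts as detected when exactly the erroneous positions are flagged, and as corrected when the output also equals the target. The exact definitions vary between papers and evaluation scripts, and the function names and example sentences are made up for illustration.

```python
def prf(tp, n_pred, n_gold):
    """Precision, recall and F1 from raw counts, guarding against division by zero."""
    p = tp / n_pred if n_pred else 0.0
    r = tp / n_gold if n_gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


def sentence_level_scores(sources, targets, predictions):
    """Assumes each source, target and prediction have the same length."""
    n_pred = n_gold = det_tp = cor_tp = 0
    for src, tgt, pred in zip(sources, targets, predictions):
        gold = {i for i, (s, t) in enumerate(zip(src, tgt)) if s != t}
        found = {i for i, (s, c) in enumerate(zip(src, pred)) if s != c}
        n_gold += bool(gold)    # sentences that really contain errors
        n_pred += bool(found)   # sentences the model changed
        if gold and found == gold:
            det_tp += 1         # all error positions found, nothing else touched
            if pred == tgt:
                cor_tp += 1     # and every replacement is the right character
    return {"detection": prf(det_tp, n_pred, n_gold),
            "correction": prf(cor_tp, n_pred, n_gold)}


sources = ["我喜欢吃平果", "今天天气很好"]
targets = ["我喜欢吃苹果", "今天天气很好"]
predictions = ["我喜欢吃苹果", "今天天汽很好"]
print(sentence_level_scores(sources, targets, predictions))
```

In this toy run the model fixes the first sentence but wrongly edits the second, so recall is perfect while precision drops to one half at both levels.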
