Document Image Binarization with Fully Convolutional Neural Networks (FCNs for document image binarization)



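1. Abstract
The paper proposes a fully convolutional network (FCN) for document image binarization. The FCN is trained to optimize a continuous version of the Pseudo F-measure metric, and an ensemble of FCNs outperforms the competition winners on 4 of 7 DIBCO competitions.
The authors also report that the model performs well on Palm Leaf Manuscripts.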

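2. Introduction
Documents should be binarized before they are passed to recognition.
The proposed FCN binarizes a wide variety of document images without any parameter tuning.
The paper reviews traditional methods and notes that FCNs learn from training data to exploit the spatial arrangements of pixels without relying on a handcrafted bias on local shapes.
Contributions of the paper:
1. Proposes FCNs, and a specific architecture, for binarization.
2. We show that directly optimizing the proposed continuous Pseudo F-measure exceeds the previous state-of-the-art on DIBCO competition data.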
3. Learning curves lead to the conclusion that data diversity matters more than data quantity.
4. Shows that the FCN performs better when it is given additional input features.

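3. Related Work
Summarizes earlier methods.
4. Methods
A. Fully Convolutional Networks
Describes the architecture, the input and output sizes of each layer, and the ReLU activation.
B. Multi-Scale
Merging inputs at different scales improves performance.
Describes the downsampling and upsampling structure.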
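To make the multi-scale idea concrete, here is a minimal NumPy sketch of downsampling an image, upsampling each scale back to full resolution, and stacking the results. It does not reproduce the paper's architecture: the function names, the 2x2 average pooling, the nearest-neighbour upsampling, and the default of three scales are all assumptions made for illustration.

```python
import numpy as np

def downsample2x(img):
    """Halve the resolution by averaging non-overlapping 2x2 blocks."""
    h, w = img.shape
    h2, w2 = (h // 2) * 2, (w // 2) * 2  # drop an odd last row/column
    return img[:h2, :w2].reshape(h2 // 2, 2, w2 // 2, 2).mean(axis=(1, 3))

def upsample_to(img, shape):
    """Nearest-neighbour upsampling to a target (height, width)."""
    rows = np.arange(shape[0]) * img.shape[0] // shape[0]
    cols = np.arange(shape[1]) * img.shape[1] // shape[1]
    return img[rows][:, cols]

def multi_scale_stack(img, num_scales=3):
    """Return an array of shape (num_scales, H, W).

    Channel 0 is the full-resolution image; channel s is the image
    downsampled s times and then upsampled back to (H, W).
    """
    img = np.asarray(img, dtype=np.float64)
    current = img
    maps = []
    for s in range(num_scales):
        if s > 0:
            current = downsample2x(current)
        maps.append(upsample_to(current, img.shape))
    return np.stack(maps, axis=0)
```

In a real network each scale would pass through its own convolutional layers before the merge; the stack here only shows how coarse context and full-resolution detail end up aligned pixel by pixel.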
C. Pseudo F-measure Loss
I did not follow this part.
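As a rough aid to reading this section, the sketch below shows one way a "continuous" F-measure can be written: the network's foreground probabilities take the place of hard 0/1 predictions, so the score becomes differentiable. This is an assumption-laden illustration rather than the paper's exact loss. The per-pixel weight maps `w_recall` and `w_precision` are hypothetical inputs standing in for the Pseudo F-measure weights, foreground (ink) is assumed to be 1, and the equal weighting in `combined_loss` is a guess.

```python
import numpy as np

def soft_f_measure(prob, target, w_recall=None, w_precision=None, eps=1e-8):
    """F-measure computed on probabilities instead of hard predictions.

    prob:   foreground probabilities in [0, 1]
    target: binary ground truth, 1 = foreground (assumed convention)
    w_recall, w_precision: optional per-pixel weights (hypothetical);
        with both left as None this is the plain soft F-measure.
    """
    prob = np.asarray(prob, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    if w_recall is None:
        w_recall = np.ones_like(target)
    if w_precision is None:
        w_precision = np.ones_like(target)
    # Weighted share of ground-truth foreground that is recovered.
    recall = np.sum(w_recall * prob * target) / (np.sum(w_recall * target) + eps)
    # Weighted share of predicted foreground that is correct.
    precision = np.sum(w_precision * prob * target) / (np.sum(w_precision * prob) + eps)
    return 2.0 * precision * recall / (precision + recall + eps)

def combined_loss(prob, target, w_recall, w_precision):
    """Sum of (1 - weighted F) and (1 - plain F); equal weighting is assumed."""
    pfm = soft_f_measure(prob, target, w_recall, w_precision)
    fm = soft_f_measure(prob, target)
    return (1.0 - pfm) + (1.0 - fm)
```

A perfect prediction gives a score near 1 (loss near 0), and an all-background prediction gives a score of 0.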
D. Datasets and Metrics
Two datasets are used: DIBCOs [1]–[7] and Palm Leaf Manuscripts (PLM).
E. Implementation Details
Covers the training details; I did not follow this part.

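5. Experiments
A. Loss Functions
Four loss functions are compared: P-FM, FM, P-FM + FM, and Cross Entropy (CE).
P-FM on its own appears to have a problem (predicting border pixels as background), so P-FM + FM is used together. CE loss is a standard classification based loss.
B. DIBCO Performance
The authors' FCNs are compared against the competition winners.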
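The abstract mentions that the reported results come from an ensemble of FCNs. These notes do not record how the individual outputs are combined, so the sketch below simply averages the per-pixel foreground probabilities and thresholds the mean. The function name, the averaging rule, and the 0.5 threshold are all assumptions.

```python
import numpy as np

def ensemble_binarize(prob_maps, threshold=0.5):
    """Average per-model probability maps and threshold the mean.

    prob_maps: sequence of arrays with identical shape, one per model,
               holding foreground probabilities in [0, 1].
    Returns a uint8 array with 1 for foreground and 0 for background.
    """
    probs = np.asarray(prob_maps, dtype=np.float64)
    mean_prob = probs.mean(axis=0)  # average over the model axis
    return (mean_prob >= threshold).astype(np.uint8)
```

Averaging probabilities rather than voting on hard outputs keeps each model's confidence in play, so one very sure model can outweigh two mildly unsure ones.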
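C. Architecture Search
The authors varied several architectural hyperparameters (depth, width, kernel size, scale) and saw little improvement; they conclude that changing the dataset helps more than changing the architecture.
D. How Much Data is Enough?
More data can improve results, but too much can hurt; the authors plot learning curves against the amount of training data.
A diverse training set is crucial.
E. Input Features
Highlights the importance of Relative Darkness (RD) features.
We used RD features with a window size of 5x5 and a similarity threshold of ±10 in all experiments in this paper.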
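To show what a 5x5 window and a ±10 similarity threshold could mean in practice, here is a sketch of one common reading of Relative Darkness features: for each pixel, count how many neighbours in the window are darker than, similar to, or lighter than the centre pixel. The exact definition used in the paper is not recorded in these notes, so this formulation, the exclusion of the centre pixel, and the edge-replicating padding are assumptions.

```python
import numpy as np

def relative_darkness(img, window=5, threshold=10):
    """Return an array of shape (3, H, W): darker, similar, lighter counts.

    A neighbour is 'similar' when it is within +/- threshold of the centre
    intensity, 'darker' when it is lower by more than threshold, and
    'lighter' when it is higher by more than threshold.
    """
    img = np.asarray(img, dtype=np.int32)
    h, w = img.shape
    r = window // 2
    padded = np.pad(img, r, mode="edge")  # assumed border handling
    darker = np.zeros((h, w), dtype=np.int32)
    similar = np.zeros((h, w), dtype=np.int32)
    lighter = np.zeros((h, w), dtype=np.int32)
    for dy in range(window):
        for dx in range(window):
            if dy == r and dx == r:
                continue  # skip the centre pixel itself
            diff = padded[dy:dy + h, dx:dx + w] - img
            darker += diff < -threshold
            lighter += diff > threshold
            similar += np.abs(diff) <= threshold
    return np.stack([darker, similar, lighter], axis=0)
```

The three count maps can be stacked with the grayscale image as extra input channels, which is how additional input features are usually fed to an FCN.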


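6. Conclusion
First, we combine the P-FM and FM losses to train FCNs for document image binarization.
Second, the diversity of the training set matters far more than other factors.
Finally, we analyzed using additional features as input to the FCN and found that Relative Darkness features [26] and the output of Howe binarization [9] perform best.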
