Curriculum Learning for Handwritten Text Line Recognition

J. Louradour, Christopher Kermorvant
{"title":"Curriculum Learning for Handwritten Text Line Recognition","authors":"J. Louradour, Christopher Kermorvant","doi":"10.1109/DAS.2014.38","DOIUrl":null,"url":null,"abstract":"Recurrent Neural Networks (RNN) have recently achieved the best performance in off-line Handwriting Text Recognition. At the same time, learning RNN by gradient descent leads to slow convergence, and training times are particularly long when the training database consists of full lines of text. In this paper, we propose an easy way to accelerate stochastic gradient descent in this set-up, and in the general context of learning to recognize sequences. The principle is called Curriculum Learning, or shaping. The idea is to first learn to recognize short sequences before training on all available training sequences. Experiments on three different handwritten text databases (Rimes, IAM, OpenHaRT) show that a simple implementation of this strategy can significantly speed up the training of RNN for Text Recognition, and even significantly improve performance in some cases.","PeriodicalId":220495,"journal":{"name":"2014 11th IAPR International Workshop on Document Analysis Systems","volume":"33 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"21","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 11th IAPR International Workshop on Document Analysis Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DAS.2014.38","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 21

Abstract

Recurrent Neural Networks (RNN) have recently achieved the best performance in off-line Handwritten Text Recognition. At the same time, learning RNNs by gradient descent converges slowly, and training times are particularly long when the training database consists of full lines of text. In this paper, we propose an easy way to accelerate stochastic gradient descent in this setup, and in the general context of learning to recognize sequences. The principle is called Curriculum Learning, or shaping: first learn to recognize short sequences before training on all available training sequences. Experiments on three different handwritten text databases (Rimes, IAM, OpenHaRT) show that a simple implementation of this strategy can significantly speed up the training of RNNs for text recognition, and even significantly improve recognition performance in some cases.
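The core idea, presenting short sequences first and then widening to the full training set, can be illustrated with a simple length-based sampler. The sketch below is a hypothetical Python illustration, not the paper's implementation: the function name `curriculum_batches`, the use of transcript length as the difficulty measure, and the linear warm-up schedule are all assumptions made for the example.

```python
import random

def curriculum_batches(samples, epochs, batch_size=32, warmup_epochs=5):
    """Yield (epoch, batch) pairs, starting with short transcripts and
    gradually opening up to the full training set (illustrative schedule)."""
    # Sort once by target length: shorter text lines are treated as easier.
    ordered = sorted(samples, key=lambda s: len(s[1]))
    for epoch in range(epochs):
        # Fraction of the length-sorted data visible this epoch,
        # growing linearly until the whole set is used.
        frac = min(1.0, (epoch + 1) / warmup_epochs)
        pool = ordered[: max(batch_size, int(frac * len(ordered)))]
        random.shuffle(pool)  # SGD still visits the current pool in random order
        for i in range(0, len(pool) - batch_size + 1, batch_size):
            yield epoch, pool[i : i + batch_size]

# Usage: samples is a list of (line_image, transcript) pairs.
# for epoch, batch in curriculum_batches(samples, epochs=20):
#     train_step(model, batch)  # hypothetical RNN/CTC training step
```

After `warmup_epochs`, the sampler is equivalent to ordinary shuffled stochastic gradient descent over the whole database, so the curriculum only changes the early phase of training.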