Handwritten Hindi Word Generation to enable Few Instance Learning of Hindi Documents

2020 International Conference on Signal Processing and Communications (SPCOM). Pub Date: 2020-07-01. DOI: 10.1109/SPCOM50965.2020.9179634

G. Senthil, K. Nandhakumar, G. R. S. Subrahmanyam

Citations: 0

Abstract

Handwritten Text Recognition (HTR) of Hindi documents is a challenging research problem whose solution could enable the digitization of millions of official documents. Because character segmentation is difficult, segmentation-free word recognition is the preferred approach. A pressing issue is the lack of a large, diverse Hindi handwritten word dataset for pre-training deep learning architectures. In this paper, we propose a novel way of generating diverse handwritten Hindi word images using only handwritten Hindi characters, and we further analyze its effectiveness in enabling Few Instance Learning of handwritten Hindi documents.
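
The abstract does not describe the generation procedure, so the following is only an illustrative sketch of the general idea of composing word images from character images, not the authors' method. It assumes each character has several handwritten samples stored as small grayscale grids, picks one sample per character at random (which is where the diversity comes from), and joins them left to right. The names `pad_to_height` and `compose_word`, the `gap` parameter, and the toy sample data are all hypothetical. Real Devanagari composition would also need to handle matras, conjuncts, and the connecting headline, which this sketch ignores.

```python
import random

# An image is a list of rows; each row is a list of grayscale values
# (0 = ink, 255 = blank paper). This representation is an assumption
# made for the sketch, chosen to avoid external dependencies.

def pad_to_height(img, height, fill=255):
    """Return a copy of img with blank rows appended at the bottom
    so that it has exactly `height` rows (glyphs stay top-aligned)."""
    width = len(img[0])
    padded = [row[:] for row in img]
    padded += [[fill] * width for _ in range(height - len(img))]
    return padded

def compose_word(chars, samples, rng, gap=1, fill=255):
    """Build one word image from per-character handwritten samples.

    chars   : sequence of characters making up the word
    samples : dict mapping each character to a list of sample images
    rng     : random.Random instance used to pick one sample per character
    gap     : number of blank columns inserted between adjacent glyphs
    """
    glyphs = [rng.choice(samples[c]) for c in chars]
    height = max(len(g) for g in glyphs)
    glyphs = [pad_to_height(g, height, fill) for g in glyphs]

    word = []
    for r in range(height):
        row = []
        for i, g in enumerate(glyphs):
            if i > 0:
                row.extend([fill] * gap)  # spacing between characters
            row.extend(g[r])
        word.append(row)
    return word

# Toy data: one tiny made-up "sample" per character.
toy_samples = {
    "क": [[[0, 255], [255, 0]]],   # 2 rows x 2 columns
    "म": [[[0], [0], [0]]],        # 3 rows x 1 column
}
toy_word = compose_word("कम", toy_samples, random.Random(0), gap=1)
```

Top alignment is used here because Devanagari characters hang from a headline. With many samples per character, repeated calls with different random seeds would yield different renderings of the same word.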
