Word-Graph Based Handwriting Key-Word Spotting: Impact of Word-Graph Size on Performance

2014 11th IAPR International Workshop on Document Analysis Systems. Pub Date: 2014-04-07. DOI: 10.1109/DAS.2014.65

A. Rossi, E. Vidal


Citations: 9

Abstract

Key-Word Spotting (KWS) in handwritten documents is approached here by means of Word Graphs (WGs) obtained using segmentation-free handwritten text recognition technology based on N-gram Language Models and Hidden Markov Models. Linguistic context significantly boosts KWS performance with respect to methods which ignore word contexts and/or rely on image matching with pre-segmented isolated words. In addition, WG-based KWS can be significantly faster than other KWS approaches which work directly on the original images, where computational demands are, in general, exceedingly high. A large WG contains most of the relevant information of the original text (line) image needed for KWS but, if it is too large, the computational advantages over traditional, image-matching-based KWS diminish. Conversely, if it is too small, relevant information may be lost, leading to degraded KWS precision/recall performance. We study the trade-off between WG size and KWS information retrieval performance. Results show that small, computationally cheap WGs can be used without losing the excellent KWS performance achieved with huge WGs.
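The size/performance trade-off described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the toy graph, its probabilities, the function names (`edge_posteriors`, `keyword_score`, `prune`), and the rule "score a keyword by the largest posterior of any edge carrying it" are all assumptions made for illustration. Edge posteriors are computed with a standard forward-backward pass over a small acyclic word graph, and pruning low-posterior edges shows how a smaller graph can keep the score of a likely word while dropping an unlikely one entirely.

```python
from collections import defaultdict

# Hypothetical toy word graph for one text line. Node ids are assumed to be
# topologically ordered (start < end for every edge). Each edge is
# (start_node, end_node, word, probability); all values are made up.
EDGES = [
    (0, 1, "the", 0.9),
    (0, 1, "she", 0.1),
    (1, 2, "key", 0.6),
    (1, 2, "hey", 0.4),
    (2, 3, "word", 0.7),
    (2, 3, "ward", 0.3),
]


def edge_posteriors(edges, start, end):
    """Return (word, posterior) per edge using forward-backward on a DAG."""
    forward = defaultdict(float)
    backward = defaultdict(float)
    forward[start] = 1.0
    backward[end] = 1.0
    # Forward pass: edges sorted by start node, so forward[u] is final
    # before any edge leaving u is processed.
    for u, v, _, p in sorted(edges, key=lambda e: e[0]):
        forward[v] += forward[u] * p
    # Backward pass: edges sorted by decreasing end node.
    for u, v, _, p in sorted(edges, key=lambda e: e[1], reverse=True):
        backward[u] += p * backward[v]
    total = forward[end]  # total probability mass of all complete paths
    return [(w, forward[u] * p * backward[v] / total) for u, v, w, p in edges]


def keyword_score(edges, keyword, start=0, end=3):
    """Simplified line-level score: best posterior of an edge with the keyword."""
    scores = [post for w, post in edge_posteriors(edges, start, end) if w == keyword]
    return max(scores, default=0.0)


def prune(edges, min_posterior, start=0, end=3):
    """Keep only edges whose posterior reaches the threshold (smaller graph)."""
    posts = edge_posteriors(edges, start, end)
    return [e for e, (_, post) in zip(edges, posts) if post >= min_posterior]


if __name__ == "__main__":
    small = prune(EDGES, min_posterior=0.2)
    print(len(EDGES), len(small))
    print(keyword_score(EDGES, "key"), keyword_score(small, "key"))
    print(keyword_score(EDGES, "she"), keyword_score(small, "she"))
```

In this toy case, pruning removes the edge for "she", so that word can no longer be spotted at all, while the score for "key" is essentially unchanged. Whether such losses matter in practice is the empirical question the paper studies.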

