{"title":"Performance evaluation of a new hybrid modeling technique for handwriting recognition using identical on-line and off-line data","authors":"A. Brakensiek, A. Kosmala, D. Willett, Wenwei Wang, G. Rigoll","doi":"10.1109/ICDAR.1999.791820","DOIUrl":"https://doi.org/10.1109/ICDAR.1999.791820","url":null,"abstract":"The paper deals with the performance evaluation of a novel hybrid approach to large vocabulary cursive handwriting recognition and contains various innovations. 1) It presents the investigation of a new hybrid approach to handwriting recognition, consisting of hidden Markov models (HMMs) and neural networks trained with a special information theory based training criterion. This approach has only been recently introduced successfully to online handwriting recognition and is now investigated for the first time for offline recognition. 2) The hybrid approach is extensively compared to traditional HMM modeling techniques and the superior performance of the new hybrid approach is demonstrated. 3) The data for the comparison has been obtained from a database containing online handwritten data which has been converted to offline data. Therefore, a multiple evaluation has been carried out, incorporating the comparison of different modeling techniques and the additional comparison of each technique for online and offline recognition, using a unique database. The results confirm that online recognition leads to better recognition results due to the dynamic information of the data, but also show that it is possible to obtain recognition rates for offline recognition that are close to the results obtained for online recognition. Furthermore, it can be shown that for both online and offline recognition, the new hybrid approach clearly outperforms the competing traditional HMM techniques. It is also shown that the new hybrid approach yields superior results for the offline recognition of machine printed multifont characters.","PeriodicalId":130039,"journal":{"name":"Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR '99 (Cat. No.PR00318)","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123132576","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extracting indexing keywords from image structures in engineering drawings","authors":"T. Syeda-Mahmood","doi":"10.1109/ICDAR.1999.791827","DOIUrl":"https://doi.org/10.1109/ICDAR.1999.791827","url":null,"abstract":"A critical operation in the creation of databases of electronic versions of scanned engineering drawings is the automatic extraction of indexing text information from image structures, called title blocks. This paper addresses the problem of locating title block regions and the subsequent extraction of indexing keywords from such regions. A general technique of 2D pattern localization in unsegmented images, called location hashing, is used to locate title blocks. An engineering drawing indexing system then combines title block localization with text recognition to enable indexing text keyword extraction.","PeriodicalId":130039,"journal":{"name":"Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR '99 (Cat. No.PR00318)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116781715","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On-line correction of Web pages","authors":"H. Richy, G. Lorette","doi":"10.1109/ICDAR.1999.791854","DOIUrl":"https://doi.org/10.1109/ICDAR.1999.791854","url":null,"abstract":"This paper describes a pen-interface for correcting digital documents. By using conventional gestures, the human corrector annotates directly the digital document on a display tablet with an electronic stylus. As the document is immediately updated by an interactive editor this provides direct visual feedback that the correction has been carried out.","PeriodicalId":130039,"journal":{"name":"Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR '99 (Cat. No.PR00318)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116948358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Symbolic representation and distributed matching strategies for schematics","authors":"M. Takatsuka, T. Caelli, G. West, S. Venkatesh","doi":"10.1109/ICDAR.1999.791882","DOIUrl":"https://doi.org/10.1109/ICDAR.1999.791882","url":null,"abstract":"This paper describes object-centered symbolic representation and distributed matching strategies of 3D objects in a schematic form which occur in engineering drawings and maps. The object-centered representation has a hierarchical structure and is constructed from symbolic representations of schematics. With this representation, two independent schematics representing the same object can be matched. We also consider matching strategies using distributed algorithms. The object recognition is carried out with two matching methods: (1) matching between an object model and observed data at the lowest level of the hierarchy, and (2) constraints propagation. The first is carried out with symbolic Hopfield-type neural networks and the second is achieved via hierarchical winner-takes-all algorithms.","PeriodicalId":130039,"journal":{"name":"Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR '99 (Cat. No.PR00318)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121358936","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Advanced state clustering for very large vocabulary HMM-based on-line handwriting recognition","authors":"A. Kosmala, D. Willett, G. Rigoll","doi":"10.1109/ICDAR.1999.791819","DOIUrl":"https://doi.org/10.1109/ICDAR.1999.791819","url":null,"abstract":"The paper presents some novel methods for the introduction of context dependent hidden Markov models (HMM) to online handwriting recognition. The use of these so-called n-graphs can lead to substantially improved modeling accuracy, but requires some intelligent parameter reduction methods (state clustering). This is especially the case for the investigated very large vocabulary system, incorporating an active vocabulary of 200000 words. Switching from context independent models to context dependent models, considering the underlying vocabulary, leads in the worst case to 25000 HMMs and very poor trainability for most of the introduced models. Therefore, the conducted investigations are focused on an appropriate state clustering method which is supported by decision trees and some new self organizing approaches to generate the required trees. The presented comparison also takes the different context dependencies (left, right or both sides) into consideration.","PeriodicalId":130039,"journal":{"name":"Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR '99 (Cat. No.PR00318)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125133592","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Retrieval from spoken documents using content and speaker information","authors":"M. Viswanathan, H. Beigi, S. Dharanipragada, A. Tritschler","doi":"10.1109/ICDAR.1999.791851","DOIUrl":"https://doi.org/10.1109/ICDAR.1999.791851","url":null,"abstract":"There has been a recent upsurge in the deployment of emerging technologies such as speech and speaker recognition which are reaching maturity. We discuss the details of the components required to build a system for audio indexing and retrieval for spoken documents using content and speaker based information facilitated by speech and speaker recognition. The real power of spoken document analysis is in using both content and speaker information together in retrieval by combining the results. The experiments described here are in the broadcast news domain, but the underlying techniques can easily be extended to other speech-centric applications and transactions.","PeriodicalId":130039,"journal":{"name":"Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR '99 (Cat. No.PR00318)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125343461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Document transport, transfer and exchange: security and commercial aspects","authors":"Octavian Ureche, R. Plamondon","doi":"10.1109/ICDAR.1999.791855","DOIUrl":"https://doi.org/10.1109/ICDAR.1999.791855","url":null,"abstract":"This paper presents a brief overview of several technical issues concerning security and commercial aspects of the electronic handling and transfer of documents. After introducing a few basic notions and general definitions, we present a new digital envelope transfer protocol-based system, named TRANZIX, used in the context of generic \"transport and transfer of value\". We present the general concept of the innovative application that allows commercial exchange of electronic documents through digital transactions using customized electronic money. The system provides very strong protection based on public-key cryptography complemented with biometric protection based on handwritten signatures. Finally, a few targeted applications are overviewed, including the customized exchanges of documents through \"mixed transfers of value\".","PeriodicalId":130039,"journal":{"name":"Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR '99 (Cat. No.PR00318)","volume":"85 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121927262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-experts for touching digit string recognition","authors":"Xian Wang, Venu Govindaraju, S. Srihari","doi":"10.1109/ICDAR.1999.791909","DOIUrl":"https://doi.org/10.1109/ICDAR.1999.791909","url":null,"abstract":"84.6% of touching digit strings have only two digits touching, 12.3% have three digits touching and 3.1% have more than three digits touching. We present a multi-expert approach to recognize touching digit pairs (TDP) and touching digit triples (TDT). We combine holistic and traditional segmentation methods. 25,686 TDP training samples and 2,778 TDP testing samples collected from USPS mail are used in our experiment. The holistic method outperforms the traditional segmentation-based methods. The multi-expert combination has the best performance: a correct recognition rate of 91.1% on TDP.","PeriodicalId":130039,"journal":{"name":"Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR '99 (Cat. No.PR00318)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122040575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A system for automatic text detection in video","authors":"U. Gargi, David J. Crandall, Sameer Kiran Antani, T. Gandhi, Ryan Keener, R. Kasturi","doi":"10.1109/ICDAR.1999.791717","DOIUrl":"https://doi.org/10.1109/ICDAR.1999.791717","url":null,"abstract":"Video indexing is an important problem that has occupied recent research efforts. The text appearing in video can provide semantic information about the scene content. Detecting and recognizing text events can provide indices into the video for content based querying. We describe a system for detecting, tracking, and extracting artificial and scene text in MPEG-1 video. Preliminary results are presented.","PeriodicalId":130039,"journal":{"name":"Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR '99 (Cat. No.PR00318)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129586753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Heeding more than the top template","authors":"Prateek Sarkar, G. Nagy","doi":"10.1109/ICDAR.1999.791804","DOIUrl":"https://doi.org/10.1109/ICDAR.1999.791804","url":null,"abstract":"We present a method of classifying a pattern using information furnished by a ranked list of templates, rather than just the best matching template. We propose a parsimonious model to compute the class-conditional likelihood of a list of templates ranked on the basis of their match scores. We discuss the estimation of parameters used in the model. The results of maximum likelihood classification on isolated digit patterns consistently show a 10-20% relative gain in recognition accuracy when we use more than one top-template.","PeriodicalId":130039,"journal":{"name":"Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR '99 (Cat. No.PR00318)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130332991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}