M. Opitz, Markus Diem, Stefan Fiel, Florian Kleber, Robert Sablatnig
End-to-End Text Recognition Using Local Ternary Patterns, MSER and Deep Convolutional Nets
2014 11th IAPR International Workshop on Document Analysis Systems, 2014-04-07
DOI: 10.1109/DAS.2014.29 (https://doi.org/10.1109/DAS.2014.29)
Citations: 35
Abstract
Text recognition in natural scene images underpins several computer vision applications, such as licence plate recognition, automated translation of street signs, assistance for visually impaired people, and image retrieval. This work presents an end-to-end text recognition system. Detection uses an AdaBoost ensemble over a modified Local Ternary Pattern (LTP) feature set, followed by a post-processing stage built upon Maximally Stable Extremal Regions (MSER). Recognition uses a deep Convolutional Neural Network (CNN) trained with backpropagation. The presented system outperforms state-of-the-art methods on the ICDAR 2003 dataset in text detection (F-score: 74.2%), dictionary-driven cropped-word recognition (F-score: 87.1%), and dictionary-driven end-to-end recognition (F-score: 72.6%).
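For readers unfamiliar with the LTP features mentioned in the abstract, the following is a minimal sketch of the standard 8-neighbour Local Ternary Pattern, not the modified variant used in the paper, whose details the abstract does not give. Each neighbour is coded +1, 0, or -1 depending on whether it lies above, within, or below a tolerance band around the centre pixel, and the ternary code is conventionally split into an "upper" and a "lower" binary pattern. The function name `local_ternary_pattern`, the `threshold` default, and the clockwise neighbour ordering are assumptions made for illustration.

```python
import numpy as np


def local_ternary_pattern(image, threshold=5):
    """Standard 8-neighbour LTP codes for the interior pixels of a grayscale image.

    Returns (upper, lower): two uint8 arrays of shape (H-2, W-2).
    Bit k of `upper` is set where neighbour k >= centre + threshold;
    bit k of `lower` is set where neighbour k <= centre - threshold.
    """
    img = np.asarray(image, dtype=np.int32)  # signed, so differences can be negative
    if img.ndim != 2 or img.shape[0] < 3 or img.shape[1] < 3:
        raise ValueError("expected a 2-D image of at least 3x3 pixels")
    h, w = img.shape
    center = img[1:-1, 1:-1]

    # Neighbour offsets (dy, dx), clockwise from the top-left pixel.
    # This ordering is an illustrative assumption.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]

    upper = np.zeros(center.shape, dtype=np.uint8)
    lower = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        # Shifted view of the image aligned with the interior pixels.
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        diff = neighbour - center
        weight = np.uint8(1 << bit)
        upper |= (diff >= threshold).astype(np.uint8) * weight
        lower |= (diff <= -threshold).astype(np.uint8) * weight
    return upper, lower


if __name__ == "__main__":
    patch = np.array([[10, 20, 10],
                      [0, 10, 30],
                      [10, 4, 10]])
    up, lo = local_ternary_pattern(patch, threshold=5)
    print(int(up[0, 0]), int(lo[0, 0]))
```

In a detector of the kind the abstract describes, histograms of such codes over image windows would typically serve as the feature vector passed to the boosted classifier; how the paper builds its feature set from LTP is not stated here.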