2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)最新文献_第3页

Weakly Supervised Text Attention Network for Generating Text Proposals in Scene Images 用于场景图像文本建议生成的弱监督文本注意网络

2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR) Pub Date : 2017-11-01 DOI: 10.1109/ICDAR.2017.61

Li Rong, En MengYi, Liang Jianqiang, Zhang Haibin

{"title":"Weakly Supervised Text Attention Network for Generating Text Proposals in Scene Images","authors":"Li Rong, En MengYi, Liang Jianqiang, Zhang Haibin","doi":"10.1109/ICDAR.2017.61","DOIUrl":"https://doi.org/10.1109/ICDAR.2017.61","url":null,"abstract":"Detection and recognition of textual information in scene images is useful but challenging tasks. Numerous methods have been proposed to solve the problem. Recently the best results are attained by deep neural network based methods. Training such networks needs large amounts of bounding box-level or pixel-level annotated data. Generating large amounts of such data always requires huge amounts of labor which can be expensive and time consuming. In this paper we explore the utilization of weakly supervised deep neural network for generating text proposals in natural scene images. The network allows multi-scale inputs and is trained to perform whole image binary classification to tell whether an image contains text or not. After training the network acquired learning of powerful discriminated features that are capable of distinguishing text from other objects. To get the text location, text confidence score map is generated based on feature maps from the top two convolutional layers by extracting class activation map. Value of each pixel in the score map denotes the confidence score of whether the pixel belongs to text or not. By setting a threshold the score map is converted to a binary mask map. Foregrounds of the mask map are probable text areas. Then Maximally Stable Extremal Regions (MSERs) are extracted from these probable text areas and are aggregated as groups. By processing these groups, text proposals are obtained. Experimental results show that without using any bounding boxes or pixel-level annotation, the algorithm achieves recall rate comparable to some fully supervised methods in ICDAR 2013 focused text dataset and In ICDAR 2015 incidental text dataset.","PeriodicalId":433676,"journal":{"name":"2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125557140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 4

Deep Strip-Based Network with Cascade Learning for Scene Text Localization 基于深度条形网络的级联学习场景文本定位

2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR) Pub Date : 2017-11-01 DOI: 10.1109/ICDAR.2017.140

Dao Wu, Rui Wang, Pengwen Dai, Yueying Zhang, Xiaochun Cao

引用次数: 12

A Robust Symmetry-Based Method for Scene/Video Text Detection through Neural Network 基于对称性的鲁棒场景/视频文本神经网络检测方法

2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR) Pub Date : 2017-11-01 DOI: 10.1109/ICDAR.2017.206

Yirui Wu, Wenhai Wang, P. Shivakumara, Tong Lu

{"title":"A Robust Symmetry-Based Method for Scene/Video Text Detection through Neural Network","authors":"Yirui Wu, Wenhai Wang, P. Shivakumara, Tong Lu","doi":"10.1109/ICDAR.2017.206","DOIUrl":"https://doi.org/10.1109/ICDAR.2017.206","url":null,"abstract":"Text detection in video/scene images has gained a significant attention in the field of image processing and document analysis due to the inherent challenges caused by variations in contrast, orientation, background, text type, font type, non-uniform illumination and so on. In this paper, we propose a novel text detection method to explore symmetry property and appearance features of text for improved accuracy and robustness. First, the proposed method explores Extremal Regions (ER) for detecting text candidates in images. Then we propose a novel feature named as Multi-domain Strokes Symmetry Histogram (MSSH) for each text candidate, which describes the inherent symmetry property of stroke pixel pairs in gray, gradient and frequency domains. Furthermore, deep convolutional features are extracted to describe the appearance for each text candidate. We further fuse them by Auto-Encoder network to define a more discriminative text descriptor for classification. Finally, the proposed method constructs text lines based on the classification results. We demonstrate the effectiveness and robustness detection results of our proposed method by testing on four different benchmark databases.","PeriodicalId":433676,"journal":{"name":"2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114955146","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 9

Scene Text Relocation with Guidance 场景文本重新定位与指导

2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR) Pub Date : 2017-11-01 DOI: 10.1109/ICDAR.2017.212

Anna Zhu, S. Uchida

引用次数: 7

Gated Convolutional Recurrent Neural Networks for Multilingual Handwriting Recognition 门控卷积递归神经网络用于多语言手写识别

2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR) Pub Date : 2017-11-01 DOI: 10.1109/ICDAR.2017.111

Théodore Bluche, Ronaldo O. Messina

引用次数: 100

Histogram of Exclamation Marks and Its Application for Comics Analysis 感叹号直方图及其在漫画分析中的应用

2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR) Pub Date : 2017-11-01 DOI: 10.1109/ICDAR.2017.294

Sotaro Hiroe, S. Hotta

引用次数: 1

A Machine Learning System for Assisting Neophyte Researchers in Digital Libraries 协助数位图书馆新手研究人员的机器学习系统

2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR) Pub Date : 2017-11-01 DOI: 10.1109/ICDAR.2017.60

Bissan Audeh, M. Beigbeder, C. Largeron

引用次数: 3

Comic Characters Detection Using Deep Learning 基于深度学习的漫画角色检测

2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR) Pub Date : 2017-11-01 DOI: 10.1109/ICDAR.2017.290

Nhu-Van Nguyen, Christophe Rigaud, J. Burie

引用次数: 30

Semi-Supervised Transfer Learning for Convolutional Neural Network Based Chinese Character Recognition 基于卷积神经网络的半监督迁移学习汉字识别

2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR) Pub Date : 2017-11-01 DOI: 10.1109/ICDAR.2017.79

Yejun Tang, Bing Wu, Liangrui Peng, Changsong Liu

{"title":"Semi-Supervised Transfer Learning for Convolutional Neural Network Based Chinese Character Recognition","authors":"Yejun Tang, Bing Wu, Liangrui Peng, Changsong Liu","doi":"10.1109/ICDAR.2017.79","DOIUrl":"https://doi.org/10.1109/ICDAR.2017.79","url":null,"abstract":"Although transfer learning has aroused researchers' great interest, how to utilize the unlabeled data is still an open and important problem in this area. We propose a novel semi-supervised transfer learning (STL) method by incorporating Multi-Kernel Maximum Mean Discrepancy (MK-MMD) loss into the traditional fine-tuned Convolutional Neural Network (CNN) transfer learning framework for Chinese character recognition. The proposed method includes three steps. First, a CNN model is trained by massive labeled samples in the source domain. Then the CNN model is fine-tuned by a few labeled samples in the target domain. Finally, the CNN model is trained with both a large number of unlabeled samples and the limited labeled samples in the target domain to minimize the MK-MMD loss. Experiments investigate detailed configurations and parameters of the proposed STL method with several frequently used CNN structures including AlexNet, GoogLeNet, and ResNet. Experimental results on practical Chinese character transfer learning tasks, such as Dunhuang historical Chinese character recognition, indicate that the proposed method can significantly improve recognition accuracy in the target domain.","PeriodicalId":433676,"journal":{"name":"2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128428796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 12

Radical-Based Chinese Character Recognition via Multi-Labeled Learning of Deep Residual Networks 基于深度残差网络多标签学习的汉字识别

2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR) Pub Date : 2017-11-01 DOI: 10.1109/ICDAR.2017.100

Tie-Qiang Wang, Fei Yin, Cheng-Lin Liu

引用次数: 30