Title: An EfficientNet-like Feature Extractor and Focal CTC Loss for Image-base Sequence Recognition
Authors: D. V. Sang, N. Thuan
Venue: 2020 7th NAFOSTED Conference on Information and Computer Science (NICS)
Publication date: 2020-11-26
DOI: https://doi.org/10.1109/NICS51282.2020.9335861
Citation count: 2
Abstract
Image-based sequence recognition is a computer vision problem with many practical applications. This paper proposes a novel convolutional-recurrent neural network (CRNN) for image-based sequence recognition. In particular, we introduce a new convolutional backbone for feature extraction based on the EfficientNet architecture and train the network with focal CTC loss. Our method outperforms several existing state-of-the-art methods on the ICDAR 2019 Robust Reading Challenge on Scanned Receipts OCR and Information Extraction (SROIE) and on the IAM handwriting dataset. The experimental results show that our method achieves an F1 score on par with the top 2 in the ICDAR 2019 SROIE challenge.
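The abstract names focal CTC loss but does not state its formula. The sketch below assumes the commonly used form, in which the CTC negative log-likelihood L_CTC is rescaled by a focal factor: L = alpha * (1 - p)^gamma * L_CTC, with p = exp(-L_CTC). The function names, the toy probabilities, and the default values of `alpha` and `gamma` are illustrative assumptions, not settings taken from the paper.

```python
import math


def ctc_nll(probs, target, blank=0):
    """CTC negative log-likelihood computed with the forward algorithm.

    probs:  list of T per-frame probability distributions (lists over classes).
    target: list of label indices, without blanks.
    Works in plain probability space, so it suits short toy inputs only
    (long sequences would underflow; real code works in log space).
    """
    # Extended label sequence: blank, l1, blank, l2, ..., blank
    ext = [blank]
    for label in target:
        ext += [label, blank]
    size = len(ext)

    # alpha[s]: total probability of all partial alignments ending in ext[s]
    alpha = [0.0] * size
    alpha[0] = probs[0][ext[0]]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new = [0.0] * size
        for s in range(size):
            total = alpha[s]
            if s >= 1:
                total += alpha[s - 1]
            # The blank between two labels may be skipped only if they differ
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * probs[t][ext[s]]
        alpha = new

    # Valid alignments end in the last label or in the trailing blank
    p = alpha[-1] + (alpha[-2] if size > 1 else 0.0)
    return -math.log(p)


def focal_ctc_loss(nll, alpha=0.25, gamma=2.0):
    """Assumed focal form: down-weight sequences the model already predicts well."""
    p = math.exp(-nll)  # sequence probability implied by the CTC loss
    return alpha * (1.0 - p) ** gamma * nll


# Toy input: 2 frames, classes {0: blank, 1: "a"}, target "a"
toy_probs = [[0.5, 0.5], [0.5, 0.5]]
toy_nll = ctc_nll(toy_probs, [1])
toy_focal = focal_ctc_loss(toy_nll)
```

With `gamma = 0` and `alpha = 1` the focal variant reduces to plain CTC loss. Larger `gamma` shifts the training signal toward sequences whose probability under the model is still low.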