Vietnamese Scene Text Detection and Recognition using Deep Learning: An Empirical Study

Nhat Truong Pham, Van Dung Pham, Qui Nguyen-Van, Bao Hung Nguyen, Duc Ngoc Minh Dang, Sy Dzung Nguyen
Published in: 2022 6th International Conference on Green Technology and Sustainable Development (GTSD), 2022-07-29. DOI: 10.1109/GTSD54989.2022.9989248 (https://doi.org/10.1109/GTSD54989.2022.9989248)
Citation count: 0

Abstract

Scene text detection and recognition are challenging computer vision tasks that aim to locate and read text sequences in natural scenes. Researchers have recently proposed many state-of-the-art methods to improve the accuracy and efficiency of text detection and recognition. However, there has been little research on text detection and recognition in natural scenes in Vietnam. This paper presents a deep learning-based empirical investigation of Vietnamese scene text detection and recognition. First, four detection models, namely the differentiable binarization network (DBN), pyramid mask text detector (PMTD), pixel aggregation network (PAN), and Fourier contour embedding network (FCEN), are employed to detect text regions in images. Then, four text recognition models, namely the convolutional recurrent neural network (CRNN), self-attention text recognition network (SATRN), no-recurrence sequence-to-sequence text recognizer (NRTR), and RobustScanner (RS), are investigated to recognize the text. Data augmentation methods are also applied to enrich the data and improve the accuracy of scene text detection and recognition. The models are evaluated on the VinText dataset. Empirical results show that PMTD and SATRN achieve the highest scores for text detection and recognition, respectively. For knowledge sharing, our implementation is publicly available at https://github.com/ThorPham/VN_scene_text_detection_recognition.
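
The two-stage structure described in the abstract (detect text regions first, then recognize the text in each region) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `TextInstance`, `crop`, `read_scene_text`, `dummy_detect`, and `dummy_recognize` are hypothetical, boxes are assumed to be axis-aligned rectangles (real detectors such as those listed above often output polygons), and an image is modeled as a plain list of pixel rows to avoid any library dependency.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Assumption: axis-aligned box (x_min, y_min, x_max, y_max), max edges exclusive.
Box = Tuple[int, int, int, int]
# Assumption: an image is a list of pixel rows (no imaging library required).
Image = List[List[int]]


@dataclass
class TextInstance:
    """One detected region together with its recognized string."""
    box: Box
    text: str


def crop(image: Image, box: Box) -> Image:
    """Cut the rectangle described by `box` out of `image`."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def read_scene_text(
    image: Image,
    detect: Callable[[Image], List[Box]],
    recognize: Callable[[Image], str],
) -> List[TextInstance]:
    """Run detection, then recognition on each detected region."""
    return [TextInstance(box, recognize(crop(image, box))) for box in detect(image)]


# Hypothetical stand-ins for a trained detector and recognizer.
def dummy_detect(image: Image) -> List[Box]:
    # Always reports two fixed regions, regardless of image content.
    return [(0, 0, 2, 1), (1, 1, 4, 3)]


def dummy_recognize(patch: Image) -> str:
    # Returns the patch size as "HEIGHTxWIDTH" instead of real text.
    return f"{len(patch)}x{len(patch[0])}"


if __name__ == "__main__":
    blank = [[0] * 4 for _ in range(3)]  # 3 rows, 4 columns
    for item in read_scene_text(blank, dummy_detect, dummy_recognize):
        print(item.box, item.text)
```

Passing the detector and recognizer as callables keeps the two stages independent, so any of the four detectors could in principle be paired with any of the four recognizers without changing the pipeline function.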