A Comprehensive Study of Scene Text Recognition in Scene Text Image Super-Resolution with Parametric Frameworks

2024 IEEE International Conference on Consumer Electronics (ICCE). Pub Date: 2024-01-06. DOI: 10.1109/ICCE59016.2024.10444229

Supatta Viriyavisuthisakul, P. Sanguansat, Toshihiko Yamasaki


Citations: 0

Abstract



Scene Text Recognition (STR) is a technique for detecting and recognizing text in images. Predicting text in real-world scene images is challenging because of various uncontrollable environmental factors. State-of-the-art text detection and recognition models leverage deep learning and Transformer architectures and achieve impressive accuracy on benchmark datasets. However, accurately processing text in real-world images remains difficult, often because of unseen data or limited datasets. Both the limitations of STR and the quality of scene text images are crucial factors. Recently, a parametric weight and multiple parametric regularizations were proposed to improve the quality of real-world scene text images. Unlike previous surveys in this area, this study has three main objectives. First, to confirm the performance of the parametric methods, text recognition accuracy with and without them is compared using different STR methods. Second, to make the experiments comprehensive, the outcomes of each STR method are compared to show their prediction performance. Third, several existing challenges and research directions are discussed.
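The with/without comparison described in the abstract can be pictured as a small accuracy table: one row per STR method, one column per super-resolution setting. The following is only a minimal sketch of that bookkeeping, not the authors' evaluation code. The recognizer names, setting names, and strings are hypothetical toy values, and case-insensitive exact-match word accuracy is an assumed metric, since the abstract does not specify one.

```python
from typing import Dict, List


def word_accuracy(predictions: List[str], labels: List[str]) -> float:
    """Fraction of predictions that exactly match their label.

    Matching is case-insensitive here; that is an assumption of this
    sketch, not something stated in the abstract.
    """
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    correct = sum(p.lower() == t.lower() for p, t in zip(predictions, labels))
    return correct / len(labels)


def compare_settings(
    outputs: Dict[str, Dict[str, List[str]]], labels: List[str]
) -> Dict[str, Dict[str, float]]:
    """Build a {recognizer: {setting: accuracy}} table."""
    return {
        recognizer: {
            setting: word_accuracy(preds, labels)
            for setting, preds in by_setting.items()
        }
        for recognizer, by_setting in outputs.items()
    }


# Hypothetical toy data: made-up strings, not results from the paper.
labels = ["exit", "open", "cafe", "stop"]
toy_outputs = {
    "recognizer_a": {
        "without_parametric": ["exit", "0pen", "cafe", "step"],
        "with_parametric": ["exit", "open", "cafe", "step"],
    },
    "recognizer_b": {
        "without_parametric": ["exit", "open", "cate", "stop"],
        "with_parametric": ["exit", "open", "cafe", "stop"],
    },
}

table = compare_settings(toy_outputs, labels)
for recognizer, row in table.items():
    print(recognizer, row)
```

Reading the table row by row shows the effect of a setting on one recognizer, and reading it column by column compares recognizers under the same setting, which corresponds to the first two objectives listed in the abstract.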
