Supatta Viriyavisuthisakul, P. Sanguansat, Toshihiko Yamasaki
2024 IEEE International Conference on Consumer Electronics (ICCE), pp. 1-6. Published 2024-01-06. DOI: 10.1109/ICCE59016.2024.10444229 (https://doi.org/10.1109/ICCE59016.2024.10444229)
A Comprehensive Study of Scene Text Recognition in Scene Text Image Super-Resolution with Parametric Frameworks
Scene Text Recognition (STR) is the task of detecting and recognizing text in images. Predicting text in real-world scene images is challenging because of uncontrollable environmental factors. State-of-the-art text detection and recognition models build on deep learning and Transformer architectures and achieve impressive accuracy on benchmark datasets. However, they still struggle to process text in real-world images accurately, often because of unseen data or limited datasets. Both the limitations of STR models and the quality of the scene text images matter. Recently, a parametric weight and multiple parametric regularizations were proposed to improve the quality of real-world scene text images. Unlike previous surveys in this area, this study has three main objectives. First, to confirm the performance of the parametric methods, text recognition accuracy with and without them is compared across different STR methods. Second, to make the experiments comprehensive, the outcomes of the STR methods are compared to show their prediction performance. Third, several open challenges and research directions are discussed.