Mingliang Gao, Jianhao Sun, Qilei Li, Muhammad Attique Khan, Jianrun Shang, Xianxun Zhu, Gwanggil Jeon

Image and Vision Computing, Volume 158, Article 105519. Published 2025-03-29. DOI: 10.1016/j.imavis.2025.105519
Towards trustworthy image super-resolution via symmetrical and recursive artificial neural network
AI-assisted living environments widely apply image super-resolution techniques to improve the clarity of visual inputs for devices such as smart cameras and medical monitors. This increased resolution enables more accurate object recognition, facial identification, and health monitoring, contributing to a safer and more efficient assisted living experience. Although rapid progress has been achieved, most current methods suffer from high computational costs due to their complex network structures. To address this problem, we propose a symmetrical and recursive Transformer network (SRTNet) for efficient image super-resolution that integrates a symmetrical CNN (S-CNN) unit and an improved recursive Transformer (IRT) unit. Specifically, the S-CNN unit is equipped with a purpose-designed local feature enhancement (LFE) module and a feature distillation attention-in-attention (FDAA) block to realize efficient feature extraction and utilization. The IRT unit is introduced to capture long-range dependencies and contextual information, guaranteeing that the reconstructed image preserves high-frequency texture details. Extensive experiments demonstrate that the proposed SRTNet achieves competitive performance in reconstruction quality and model complexity compared with state-of-the-art methods. In the ×2, ×3, and ×4 super-resolution tasks, SRTNet achieves the best performance on the BSD100, Set14, Set5, Manga109, and Urban100 datasets while maintaining low computational complexity.
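The efficiency argument of the abstract rests on recursion: a recursive unit reuses the same weights across repeated applications, so effective depth grows while the parameter count stays constant. The following minimal NumPy sketch illustrates only that weight-sharing idea; the block structure, names, and shapes are illustrative assumptions and do not reproduce the paper's actual S-CNN, LFE, FDAA, or IRT layers.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_shared_block(channels):
    # One set of weights, created once and closed over: every recursive
    # application below reuses this same W (hypothetical toy layer).
    W = rng.standard_normal((channels, channels)) * 0.01
    def block(x):
        # Residual update: x + ReLU(x W), keeping the feature shape fixed.
        return x + np.maximum(x @ W, 0.0)
    return block

def recursive_unit(x, block, recursions=4):
    # Apply the SAME block `recursions` times; depth increases but the
    # parameter count does not, which is the efficiency point.
    for _ in range(recursions):
        x = block(x)
    return x

feat = rng.standard_normal((16, 32))   # toy (tokens, channels) feature map
block = make_shared_block(32)
out = recursive_unit(feat, block, recursions=4)
print(out.shape)  # (16, 32): shape is preserved across recursions
```

Doubling `recursions` here deepens the computation without adding a single parameter, which is how recursive designs such as the IRT unit can keep model complexity low.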
Journal overview:
Image and Vision Computing has as a primary aim the provision of an effective medium of interchange for the results of high quality theoretical and applied research fundamental to all aspects of image interpretation and computer vision. The journal publishes work that proposes new image interpretation and computer vision methodology or addresses the application of such methods to real world scenes. It seeks to strengthen a deeper understanding in the discipline by encouraging the quantitative comparison and performance evaluation of the proposed methodology. The coverage includes: image interpretation, scene modelling, object recognition and tracking, shape analysis, monitoring and surveillance, active vision and robotic systems, SLAM, biologically-inspired computer vision, motion analysis, stereo vision, document image understanding, character and handwritten text recognition, face and gesture recognition, biometrics, vision-based human-computer interaction, human activity and behavior understanding, data fusion from multiple sensor inputs, image databases.