Dan Xiang, Zebin Zhou, Wenlei Yang, Huihua Wang, Pan Gao, Mingming Xiao, Jinwen Zhang, Xing Zhu
{"title":"用于水下图像增强的多尺度卷积和三分支级联变换器融合框架","authors":"Dan Xiang , Zebin Zhou , Wenlei Yang , Huihua Wang , Pan Gao , Mingming Xiao , Jinwen Zhang , Xing Zhu","doi":"10.1016/j.optlaseng.2024.108640","DOIUrl":null,"url":null,"abstract":"<div><div>Acquiring high-quality underwater images is critical for various marine applications. However, light absorption and scattering problems in underwater environments severely degrade image quality. To address these issues, this study proposes a Fusion Framework with Multi-Scale Convolution and Triple-Branch Cascaded Transformer for Underwater Image Enhancement(FMTformer). This innovative framework incorporates multi-scale convolution and three-branch cascade transformer to enhance underwater images effectively. The FMTformer framework adds in the Multi-Conv Multi-Scale Fusion (MCMF) mechanism, which utilizes a spectrum of convolutional kernels to adeptly extract multi-scale features from both the base and detail layers of the decomposed image. This method ensures the capture of both high- and low-frequency information. Furthermore, this research introduces the Tri-Branch Self-Attention Transformer (TBSAT), designed to get cross-dimensional interactions via its Tri-Branch structure, significantly refines image processing quality. The framework also embedded the Value Reconstruct Cascade Transformer (VRCT), which refines feature map representation through mixed convolution, yielding enriched attention maps. Empirical evidence indicates that FMTformer achieves parity with the state-of-the-art in both subjective and objective evaluation metrics, outperforming extant methodologies.</div></div>","PeriodicalId":49719,"journal":{"name":"Optics and Lasers in Engineering","volume":null,"pages":null},"PeriodicalIF":3.5000,"publicationDate":"2024-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A fusion framework with multi-scale convolution and triple-branch cascaded transformer for underwater image enhancement\",\"authors\":\"Dan Xiang , Zebin Zhou , Wenlei Yang , Huihua Wang , Pan Gao , Mingming Xiao , Jinwen Zhang , Xing Zhu\",\"doi\":\"10.1016/j.optlaseng.2024.108640\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Acquiring high-quality underwater images is critical for various marine applications. However, light absorption and scattering problems in underwater environments severely degrade image quality. To address these issues, this study proposes a Fusion Framework with Multi-Scale Convolution and Triple-Branch Cascaded Transformer for Underwater Image Enhancement(FMTformer). This innovative framework incorporates multi-scale convolution and three-branch cascade transformer to enhance underwater images effectively. The FMTformer framework adds in the Multi-Conv Multi-Scale Fusion (MCMF) mechanism, which utilizes a spectrum of convolutional kernels to adeptly extract multi-scale features from both the base and detail layers of the decomposed image. This method ensures the capture of both high- and low-frequency information. Furthermore, this research introduces the Tri-Branch Self-Attention Transformer (TBSAT), designed to get cross-dimensional interactions via its Tri-Branch structure, significantly refines image processing quality. The framework also embedded the Value Reconstruct Cascade Transformer (VRCT), which refines feature map representation through mixed convolution, yielding enriched attention maps. 
Empirical evidence indicates that FMTformer achieves parity with the state-of-the-art in both subjective and objective evaluation metrics, outperforming extant methodologies.</div></div>\",\"PeriodicalId\":49719,\"journal\":{\"name\":\"Optics and Lasers in Engineering\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":3.5000,\"publicationDate\":\"2024-10-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Optics and Lasers in Engineering\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0143816624006183\",\"RegionNum\":2,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"OPTICS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Optics and Lasers in Engineering","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0143816624006183","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"OPTICS","Score":null,"Total":0}
A fusion framework with multi-scale convolution and triple-branch cascaded transformer for underwater image enhancement
Acquiring high-quality underwater images is critical for many marine applications, but light absorption and scattering in underwater environments severely degrade image quality. To address these issues, this study proposes a Fusion framework with Multi-scale convolution and a Triple-branch cascaded Transformer for underwater image enhancement (FMTformer). The framework combines multi-scale convolution with a triple-branch cascaded transformer to enhance underwater images effectively. Its Multi-Conv Multi-Scale Fusion (MCMF) mechanism applies convolutional kernels of several sizes to extract multi-scale features from both the base and detail layers of the decomposed image, capturing both high- and low-frequency information. In addition, a Tri-Branch Self-Attention Transformer (TBSAT) models cross-dimensional interactions through its three-branch structure, noticeably improving enhancement quality. The framework also embeds a Value Reconstruct Cascade Transformer (VRCT), which refines the feature-map representation through mixed convolution, yielding richer attention maps. Experiments indicate that FMTformer matches or exceeds state-of-the-art methods on both subjective and objective evaluation metrics.
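To make the multi-scale fusion idea concrete, the sketch below shows one way a module in the spirit of MCMF could be organized: parallel convolutions with different kernel sizes are applied to the base and detail layers of a decomposed image, and their outputs are concatenated and projected back. This is a minimal PyTorch sketch under assumptions of my own; the kernel sizes, channel counts, concatenate-then-project fusion, and all names (e.g. MultiScaleFusionSketch) are illustrative and not the authors' implementation.

# Minimal sketch of a multi-scale convolutional fusion block in the spirit of
# MCMF; kernel sizes, channel counts and the concatenate-then-project fusion
# are illustrative assumptions, not the paper's design.
import torch
import torch.nn as nn


class MultiScaleFusionSketch(nn.Module):
    def __init__(self, channels: int = 32, kernel_sizes=(3, 5, 7)):
        super().__init__()
        # One parallel branch per kernel size; padding keeps the spatial size fixed.
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, k, padding=k // 2) for k in kernel_sizes
        )
        # Fuse the concatenated multi-scale features from both layers back to `channels` maps.
        self.fuse = nn.Conv2d(channels * 2 * len(kernel_sizes), channels, kernel_size=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, base: torch.Tensor, detail: torch.Tensor) -> torch.Tensor:
        # The base layer carries low-frequency content and the detail layer
        # high-frequency content; run every kernel size on each, then fuse.
        feats = []
        for layer in (base, detail):
            feats.extend(self.act(branch(layer)) for branch in self.branches)
        return self.act(self.fuse(torch.cat(feats, dim=1)))


if __name__ == "__main__":
    base = torch.randn(1, 32, 64, 64)    # decomposed base layer (illustrative)
    detail = torch.randn(1, 32, 64, 64)  # decomposed detail layer (illustrative)
    out = MultiScaleFusionSketch()(base, detail)
    print(out.shape)  # torch.Size([1, 32, 64, 64])

The key design point this sketch tries to capture is that each kernel size sees both the low-frequency and high-frequency layers, so the fused output can mix information across scales and frequencies in a single step.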
Journal introduction:
Optics and Lasers in Engineering aims at providing an international forum for the interchange of information on the development of optical techniques and laser technology in engineering. Emphasis is placed on contributions targeted at the practical use of methods and devices, the development and enhancement of solutions and new theoretical concepts for experimental methods.
Optics and Lasers in Engineering reflects the main areas in which optical methods are being used and developed for an engineering environment. Manuscripts should offer clear evidence of novelty and significance. Papers focusing on parameter optimization or computational issues are not suitable. Similarly, papers focused on an application rather than the optical method fall outside the journal's scope. The scope of the journal is defined to include the following:
-Optical Metrology
-Optical Methods for 3D visualization and virtual engineering
-Optical Techniques for Microsystems
-Imaging, Microscopy and Adaptive Optics
-Computational Imaging
-Laser methods in manufacturing
-Integrated optical and photonic sensors
-Optics and Photonics in Life Science
-Hyperspectral and spectroscopic methods
-Infrared and Terahertz techniques