{
  "title": "Incorporating triple attention and multi-scale pyramid network for underwater image enhancement",
  "authors": "Kaichuan Sun, Yubo Tian",
  "doi": "10.34028/iajit/20/3/11",
  "journal": {"name": "Int. Arab J. Inf. Technol.", "volume": "179 1", "pages": "387-397"},
  "publicationDate": "2023-01-01",
  "publicationTypes": "Journal Article",
  "platform": "Semanticscholar",
  "ListUrlMain": "https://doi.org/10.34028/iajit/20/3/11"
}
Incorporating triple attention and multi-scale pyramid network for underwater image enhancement
Clear images are a prerequisite for high-level underwater vision tasks, but images captured underwater are often degraded by the absorption and scattering of light. Traditional methods have shown some success in addressing this issue, but they often generate unwanted artifacts because they depend on prior knowledge. In contrast, learning-based approaches can produce more refined results. Most popular methods, however, use an encoder-decoder configuration that simply learns the nonlinear transformation between input and output images, so their ability to capture details is limited. In addition, significant pixel-level and multi-scale features are often overlooked. Accordingly, we propose a novel and efficient network that incorporates triple attention and a multi-scale pyramid into an encoder-decoder architecture. Specifically, a triple attention module that captures channel, pixel, and spatial features is used as the transformation stage of the encoder-decoder module to focus on the fog region; a multi-scale pyramid module then refines the preliminary defogging results to improve the visibility of the restored image. Extensive experiments on the EUVP and UFO-120 datasets corroborate that the proposed method outperforms state-of-the-art methods in the quantitative metrics Peak Signal-to-Noise Ratio (PSNR), Structural Similarity (SSIM), and Patch-based Contrast Quality Index (PCQI), as well as in visual quality.
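For readers unfamiliar with the first of the metrics named above, the following is a minimal sketch of how PSNR is commonly computed from the mean squared error between a reference image and a restored image. It is not taken from the paper: the function name `psnr`, the flat pixel-sequence input, and the default peak value of 255 (8-bit images) are assumptions made for illustration only.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], restored: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak Signal-to-Noise Ratio in dB between two flattened images.

    Illustrative helper (hypothetical name and signature); max_value is
    the largest possible pixel value, assumed to be 255 for 8-bit data.
    """
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    # PSNR = 10 * log10(MAX^2 / MSE)
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR indicates that the restored image is numerically closer to the reference. SSIM and PCQI are structural and contrast-based measures respectively and are not covered by this sketch.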