Incorporating triple attention and multi-scale pyramid network for underwater image enhancement

Int. Arab J. Inf. Technol. Pub Date: 2023-01-01. DOI: 10.34028/iajit/20/3/11

Kaichuan Sun, Yubo Tian

Pages: 387-397. Publication type: Journal Article. Link: https://doi.org/10.34028/iajit/20/3/11


Abstract



Clear images are a prerequisite for high-level underwater vision tasks, but images captured underwater are often degraded by the absorption and scattering of light. Traditional methods have shown some success on this problem, but they often generate unwanted artifacts because they depend on prior knowledge. In contrast, learning-based approaches can produce more refined results. Most popular methods use an encoder-decoder configuration that simply learns the nonlinear transformation between input and output images, so their ability to capture details is limited. In addition, significant pixel-level features and multi-scale features are often overlooked. Accordingly, we propose a novel and efficient network that incorporates triple attention and a multi-scale pyramid into an encoder-decoder architecture. Specifically, a triple attention module that captures channel, pixel, and spatial features serves as the transformation between the encoder and the decoder to focus on the fog region; then, a multi-scale pyramid module designed to refine the preliminary defogging results is used to improve the visibility of the restored image. Extensive experiments on the EUVP and UFO-120 datasets corroborate that the proposed method outperforms state-of-the-art methods in terms of the quantitative metrics Peak Signal-to-Noise Ratio (PSNR), Structural Similarity (SSIM), and Patch-based Contrast Quality Index (PCQI), as well as in visual quality.
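To make one of the reported metrics concrete, the following is a minimal sketch of PSNR using its standard definition, 10 · log10(MAX² / MSE). It is not the authors' evaluation code: the function name `psnr`, the flat sequences of pixel values, and the default peak value of 255 (8-bit images) are assumptions made for illustration only.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], enhanced: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak Signal-to-Noise Ratio in dB between two equally sized pixel sequences.

    Assumption: images are passed as flat sequences of pixel intensities,
    and max_value is the largest possible intensity (255 for 8-bit images).
    """
    if len(reference) != len(enhanced) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel values.
    mse = sum((r - e) ** 2 for r, e in zip(reference, enhanced)) / len(reference)
    if mse == 0:
        # Identical images: the ratio is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR means the enhanced image is closer, pixel by pixel, to the reference image. SSIM and PCQI instead compare local structure and contrast, so they are not covered by this sketch.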
