Chang Wu, Gang He, Wanlin Zhao, Xinquan Lai, Yunsong Li
Knowledge-Based Systems, vol. 305, Article 112681. Published 2024-11-02. DOI: 10.1016/j.knosys.2024.112681
https://www.sciencedirect.com/science/article/pii/S0950705124013157
PMCN: Parallax-motion collaboration network for stereo video dehazing
Despite progress in learning-based stereo dehazing, few studies have focused on stereo video dehazing (SVD). Existing methods may fall short on the SVD task because they do not fully exploit multi-domain information. To address this gap, we propose a parallax-motion collaboration network (PMCN) that integrates parallax and motion information for efficient stereo video fog removal. The critical component of PMCN is a carefully designed parallax-motion collaboration block (PMCB). First, to capture binocular parallax correspondences more efficiently, we introduce a window-based parallax attention mechanism (W-PAM) in the parallax interaction module (PIM) of PMCB. Splitting the whole frame horizontally into multiple windows and extracting parallax relationships within each window reduces memory usage and runtime. We further apply horizontal feature modulation to handle disparity variations across windows. Second, a motion alignment module (MAM) based on deformable convolution explores the temporal correlation in feature space for each view independently. Finally, we propose a fog-adaptive refinement module (FARM) to refine the features after interaction and alignment. FARM incorporates fog prior information and guides the network to dynamically generate dehazing kernels that adapt to different fog scenarios. Quantitative and qualitative results demonstrate that the proposed PMCN outperforms state-of-the-art methods on both synthetic and real-world datasets. In addition, PMCN improves the accuracy of high-level vision tasks in fog scenes, e.g., object detection and stereo matching.
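To make the memory argument behind window-based parallax attention concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that the windows partition the width axis, that left and right features share the shape (H, W, C), and that attention is a plain scaled dot product along each image row; the function name `window_parallax_attention` and the `window` parameter are hypothetical. The horizontal feature modulation that handles cross-window disparities is not modelled here.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def window_parallax_attention(left, right, window):
    """Illustrative window-restricted attention along image rows.

    left, right: feature maps of shape (H, W, C) for the two views.
    window: window width (assumption: W is divisible by it).
    Returns right-view features aggregated for each left-view pixel,
    plus the attention maps of shape (H, W // window, window, window).
    """
    h, w, c = left.shape
    if right.shape != left.shape:
        raise ValueError("left and right must have the same shape")
    if w % window != 0:
        raise ValueError("width must be divisible by the window size")

    n = w // window
    # Split every row into n non-overlapping windows along the width.
    lq = left.reshape(h, n, window, c)
    rk = right.reshape(h, n, window, c)

    # Similarity between each left pixel and the right pixels in the
    # same row and window: (H, n, window, window) instead of (H, W, W).
    scores = np.einsum("hnic,hnjc->hnij", lq, rk) / np.sqrt(c)
    attn = softmax(scores, axis=-1)

    # Aggregate right-view features for every left-view pixel.
    warped = np.einsum("hnij,hnjc->hnic", attn, rk)
    return warped.reshape(h, w, c), attn


rng = np.random.default_rng(0)
left = rng.standard_normal((4, 16, 8))
right = rng.standard_normal((4, 16, 8))
out, attn = window_parallax_attention(left, right, window=4)
print(out.shape, attn.shape)
```

Under these assumptions the attention maps hold H × W × window entries rather than the H × W × W of full-row parallax attention, which is where the saving in memory and runtime would come from. The cost is that a correspondence whose disparity crosses a window boundary cannot be matched by this step alone.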
About the journal:
Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on systems built on knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computation techniques, provide balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.