STVDNet: spatio-temporal interactive video de-raining network
Ze Ouyang, Huihuang Zhao, Yudong Zhang, Long Chen
The Visual Computer (journal article), published 2024-07-20
DOI: 10.1007/s00371-024-03565-2 (https://doi.org/10.1007/s00371-024-03565-2)
Citations: 0
Abstract
Video de-raining is an important problem in computer vision because rain streaks degrade the visual quality of images and hinder subsequent vision-related tasks. Existing video de-raining methods still face challenges such as black shadows and loss of detail. In this paper, we introduce a novel de-raining framework called STVDNet, which addresses the black shadows and detail loss that remain after de-raining. STVDNet uses a Spatial Detail Feature Extraction Module based on an auto-encoder to capture the spatial characteristics of the video. We then introduce an interaction between the extracted spatial features and spatio-temporal features, implemented with an LSTM, to generate initial de-raining results. Finally, we apply 3D and 2D convolutions to refine these coarse videos. During training, we use three loss functions, one of which, an SSIM loss, is applied to the generated videos to improve the recovery of structural detail and color. Extensive experiments on three public datasets demonstrate that the proposed method outperforms state-of-the-art approaches. Our code and pre-trained models are available at https://github.com/O-Y-ZONE/STVDNet.git.
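The abstract names an SSIM loss but does not give its formulation or identify the other two loss terms. The following is only a minimal sketch of how such a term could be computed. It assumes the standard SSIM definition with the usual constants K1 = 0.01 and K2 = 0.03, and for brevity it uses global statistics per frame rather than a sliding window. The names `global_ssim` and `ssim_loss` are hypothetical and are not taken from the STVDNet repository.

```python
def global_ssim(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """SSIM from global statistics of two flattened frames (lists of floats).

    Assumption: a single window covering the whole frame. Practical SSIM
    implementations average the same formula over local windows.
    """
    if len(x) != len(y) or not x:
        raise ValueError("frames must be non-empty and the same length")
    n = len(x)
    c1 = (k1 * data_range) ** 2  # stabilises the luminance term
    c2 = (k2 * data_range) ** 2  # stabilises the contrast/structure term
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def ssim_loss(pred_frames, target_frames, data_range=1.0):
    """Return 1 - mean SSIM over a clip given as lists of flattened frames."""
    if len(pred_frames) != len(target_frames) or not pred_frames:
        raise ValueError("clips must be non-empty and the same length")
    scores = [
        global_ssim(p, t, data_range)
        for p, t in zip(pred_frames, target_frames)
    ]
    return 1.0 - sum(scores) / len(scores)
```

In this sketch the loss is zero when the de-rained clip equals the ground truth and grows as structure, contrast, or brightness diverge. An actual training pipeline would need a differentiable tensor implementation with local windows, and it would combine this term with the two other losses that the abstract leaves unspecified.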