Intra- and Inter-frame Iterative Temporal Convolutional Networks for Video Stabilization
Haopeng Xie, Liang Xiao, Huicong Wu
ACM Multimedia Asia, published 2021-12-01
DOI: 10.1145/3469877.3490608 (https://doi.org/10.1145/3469877.3490608)
Citations: 0
Abstract
Video jitter is an uncomfortable byproduct of irregular lens motion over time. Extracting motion state information from a sequence of consecutive video frames is a central problem in video stabilization. In this paper, we propose a novel sequence model, Intra- and Inter-frame Iterative Temporal Convolutional Networks (I3TC-Net), which alternately transfers the spatio-temporal correlation of motion within and between frames. We hypothesize that the motion state information can be represented by transmission states. Specifically, we employ a combination of Convolutional Long Short-Term Memory (ConvLSTM) and an embedded encoder-decoder to generate latent stable frames. These frames are used to iteratively update the transmission states and to learn a global homography transformation for each unstable frame, which produces the corresponding stabilized result along the time axis. Furthermore, we create a video dataset to address the lack of stable data and improve training. Experimental results show that our method outperforms state-of-the-art methods on publicly available videos, including a 5.4-point improvement in stability score. The project page is available at https://github.com/root2022IIITC/IIITC.
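As an illustration of the final step only, the sketch below shows how a global 3x3 homography, once estimated for a frame, maps that frame's pixel coordinates to stabilized positions. It is not the authors' implementation: the function names `apply_homography` and `warp_frame_corners` and the example matrices are hypothetical, and the network that would predict the homography (ConvLSTM plus encoder-decoder) is not modeled here.

```python
from typing import List, Sequence, Tuple

Matrix3 = Sequence[Sequence[float]]  # 3x3 homography, row-major
Point = Tuple[float, float]          # (x, y) pixel coordinate


def apply_homography(h: Matrix3, points: Sequence[Point]) -> List[Point]:
    """Map 2D points through a 3x3 homography (projective transform)."""
    warped = []
    for x, y in points:
        # Lift (x, y) to homogeneous coordinates (x, y, 1) and multiply by H.
        xh = h[0][0] * x + h[0][1] * y + h[0][2]
        yh = h[1][0] * x + h[1][1] * y + h[1][2]
        w = h[2][0] * x + h[2][1] * y + h[2][2]
        if abs(w) < 1e-12:
            raise ValueError("point maps to infinity under this homography")
        # Divide by the homogeneous scale to return to pixel coordinates.
        warped.append((xh / w, yh / w))
    return warped


def warp_frame_corners(
    homographies: Sequence[Matrix3], width: int, height: int
) -> List[List[Point]]:
    """Apply one homography per frame to that frame's four corner points.

    Assumption: `homographies` holds one global transform per unstable
    frame, in temporal order (hypothetical input, e.g. a model's output).
    """
    corners = [
        (0.0, 0.0),
        (width - 1.0, 0.0),
        (width - 1.0, height - 1.0),
        (0.0, height - 1.0),
    ]
    return [apply_homography(h, corners) for h in homographies]


# Hypothetical example transforms: no motion, and a pure translation
# that shifts the frame 3 pixels left and 2 pixels down.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
SHIFT = [[1.0, 0.0, -3.0], [0.0, 1.0, 2.0], [0.0, 0.0, 1.0]]
```

Tracking where the frame corners land is a common way to check how much cropping a per-frame warp would require; a full pipeline would resample every pixel with the same transform.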