Exploring Temporal Consistency in Image-Based Rendering for Immersive Video Transmission
Smitha Lingadahalli Ravi, F. Henry, L. Morin, Matthieu Gendrin
2022 10th European Workshop on Visual Information Processing (EUVIP), published 2022-09-11
DOI: 10.1109/EUVIP53989.2022.9922680
Abstract
Image-based rendering methods synthesize novel views from input images captured at multiple viewpoints, enabling free-viewpoint immersive video. Despite significant progress with recent learning-based approaches, limitations remain. In particular, these approaches operate at the still-image level and do not maintain consistency across consecutive time instants, leading to temporal noise. To address this, we propose an intra-only framework that identifies regions of the input images responsible for temporally inconsistent synthesized views. Our method synthesizes better and more stable novel views, even in the most general use case of immersive video transmission. We conclude that the network appears to identify and correct spatial features at the still-image level that would otherwise produce artifacts in the temporal dimension.
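The abstract does not detail how temporally inconsistent regions are detected, so the following is only an illustrative sketch, not the paper's method: for a static scene rendered from a fixed virtual viewpoint, any large frame-to-frame intensity change in the synthesized output indicates temporal noise rather than genuine motion. The function name and threshold below are hypothetical.

```python
import numpy as np

def temporal_inconsistency_map(frame_t, frame_t1, threshold=0.1):
    """Flag pixels whose intensity changes by more than `threshold`
    between two consecutive synthesized frames (values in [0, 1]).

    With a static scene and fixed virtual camera, flagged pixels are
    candidates for temporal noise in the synthesized view.
    """
    diff = np.abs(frame_t.astype(np.float32) - frame_t1.astype(np.float32))
    if diff.ndim == 3:           # color input: average over channels
        diff = diff.mean(axis=-1)
    return diff > threshold

# Toy example: two renderings of the same static view, one with
# sparse synthetic flicker added.
rng = np.random.default_rng(0)
base = rng.random((4, 4)).astype(np.float32)
flicker = 0.2 * (rng.random((4, 4)) > 0.9)   # a few flickering pixels
mask = temporal_inconsistency_map(base, base + flicker)
print(mask.sum(), "temporally inconsistent pixels")
```

A full temporal-consistency metric for moving cameras would first compensate for viewpoint change (e.g. by warping one frame to the other) before differencing; this sketch omits that step for clarity.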