Yujun Huang, Bin Chen, Niu Lian, Baoyi An, Shu-Tao Xia
3D-GP-LMVIC: Learning-based Multi-View Image Coding with 3D Gaussian Geometric Priors
arXiv - MATH - Information Theory, 2024-09-06 (arXiv:2409.04013)
Abstract
Multi-view image compression is vital for 3D-related applications. To
effectively model correlations between views, existing methods typically
predict disparity between two views on a 2D plane, which works well for small
disparities, such as in stereo images, but struggles with larger disparities
caused by significant view changes. To address this, we propose a novel
approach: learning-based multi-view image coding with 3D Gaussian geometric
priors (3D-GP-LMVIC). Our method leverages 3D Gaussian Splatting to derive
geometric priors of the 3D scene, enabling more accurate disparity estimation
across views within the compression model. Additionally, we introduce a depth
map compression model to reduce redundancy in geometric information between
views. A multi-view sequence ordering method is also proposed to enhance
correlations between adjacent views. Experimental results demonstrate that
3D-GP-LMVIC surpasses both traditional and learning-based methods in
compression performance while maintaining fast encoding and decoding speeds.
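
To make the idea of deriving disparity from 3D geometry concrete, the following is a minimal sketch of standard pinhole-camera reprojection, not the authors' implementation. It assumes a per-pixel depth value is already available (in the paper, geometric priors come from 3D Gaussian Splatting; here the depth is simply given), that both views share the same intrinsics, and that the relative pose maps source-camera coordinates to target-camera coordinates. All function and parameter names are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Vector = List[float]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix by a vector using plain Python lists."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


def reproject_pixel(
    u: float, v: float, depth: float,
    fx: float, fy: float, cx: float, cy: float,
    rotation: Matrix, translation: Vector,
) -> Tuple[float, float]:
    """Map source pixel (u, v) with known depth into the target view.

    Assumption: X_target = rotation @ X_source + translation, and both
    views use the same pinhole intrinsics (fx, fy, cx, cy).
    """
    # Back-project the pixel to a 3D point in the source camera frame.
    p_src = [(u - cx) / fx * depth, (v - cy) / fy * depth, depth]
    # Move the point into the target camera frame.
    rotated = matvec(rotation, p_src)
    p_tgt = [r + t for r, t in zip(rotated, translation)]
    if p_tgt[2] <= 0:
        raise ValueError("point lies behind the target camera")
    # Project onto the target image plane.
    return fx * p_tgt[0] / p_tgt[2] + cx, fy * p_tgt[1] / p_tgt[2] + cy


def disparity_from_depth(
    u: float, v: float, depth: float,
    fx: float, fy: float, cx: float, cy: float,
    rotation: Matrix, translation: Vector,
) -> Tuple[float, float]:
    """Return the 2D displacement of the pixel between the two views."""
    u_t, v_t = reproject_pixel(u, v, depth, fx, fy, cx, cy, rotation, translation)
    return u_t - u, v_t - v


IDENTITY: Matrix = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Illustrative values: a pure sideways camera shift of 0.1 units.
dx, dy = disparity_from_depth(
    320.0, 240.0, depth=2.0,
    fx=500.0, fy=500.0, cx=320.0, cy=240.0,
    rotation=IDENTITY, translation=[-0.1, 0.0, 0.0],
)
```

With no rotation and a purely horizontal translation, this reduces to the familiar stereo relation in which the displacement magnitude equals focal length times baseline divided by depth. Unlike a 2D disparity search, the same computation remains valid under large rotations and translations, provided the depth and poses are accurate.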