Yujun Huang, Bin Chen, Niu Lian, Baoyi An, Shu-Tao Xia
arXiv - MATH - Information Theory, 2024-09-06. doi: arxiv-2409.04013 (https://doi.org/arxiv-2409.04013)
3D-GP-LMVIC: Learning-based Multi-View Image Coding with 3D Gaussian Geometric Priors
Multi-view image compression is vital for 3D-related applications. To
effectively model correlations between views, existing methods typically
predict disparity between two views on a 2D plane, which works well for small
disparities, such as in stereo images, but struggles with larger disparities
caused by significant view changes. To address this, we propose a novel
approach: learning-based multi-view image coding with 3D Gaussian geometric
priors (3D-GP-LMVIC). Our method leverages 3D Gaussian Splatting to derive
geometric priors of the 3D scene, enabling more accurate disparity estimation
across views within the compression model. Additionally, we introduce a depth
map compression model to reduce redundancy in geometric information between
views. A multi-view sequence ordering method is also proposed to enhance
correlations between adjacent views. Experimental results demonstrate that
3D-GP-LMVIC surpasses both traditional and learning-based methods in
performance, while maintaining fast encoding and decoding speed.
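
To make the geometric idea concrete, the following is a minimal sketch of how a depth map can be turned into a dense disparity field between two views. It is not the authors' model: it only illustrates standard pinhole reprojection, assuming a per-pixel depth map for the source view (for example, one rendered from a 3D Gaussian Splatting scene), known intrinsics, and a known relative camera pose. The function names `reproject_pixels` and `disparity_field` and all parameter names are hypothetical.

```python
import numpy as np


def reproject_pixels(depth, K_src, K_dst, T_src_to_dst):
    """Map every source pixel to its (u, v) location in the destination view.

    depth:         (H, W) depth along the source camera's z-axis (assumed given).
    K_src, K_dst:  (3, 3) pinhole intrinsics of the two cameras.
    T_src_to_dst:  (4, 4) rigid transform from source to destination camera frame.
    Returns an (H, W, 2) array of destination pixel coordinates.
    """
    h, w = depth.shape
    v, u = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Homogeneous pixel coordinates (u, v, 1) for every source pixel.
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)
    # Back-project to rays with unit z, then scale by depth to get 3D points.
    rays = pix @ np.linalg.inv(K_src).T
    pts_src = rays * depth[..., None]
    # Move the points into the destination camera frame.
    R, t = T_src_to_dst[:3, :3], T_src_to_dst[:3, 3]
    pts_dst = pts_src @ R.T + t
    # Project with the destination intrinsics and dehomogenize.
    proj = pts_dst @ K_dst.T
    return proj[..., :2] / proj[..., 2:3]


def disparity_field(depth, K_src, K_dst, T_src_to_dst):
    """Per-pixel 2D displacement from the source view to the destination view."""
    h, w = depth.shape
    v, u = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src = np.stack([u, v], axis=-1).astype(np.float64)
    return reproject_pixels(depth, K_src, K_dst, T_src_to_dst) - src


if __name__ == "__main__":
    # Toy setup: constant depth and a pure sideways camera translation.
    K = np.array([[100.0, 0.0, 32.0],
                  [0.0, 100.0, 24.0],
                  [0.0, 0.0, 1.0]])
    T = np.eye(4)
    T[0, 3] = 0.1
    d = disparity_field(np.full((4, 6), 2.0), K, K, T)
    print(d[0, 0])
```

Because the displacement is computed from 3D geometry rather than searched for on the 2D image plane, its magnitude is not limited by a search window, which is the intuition behind using geometric priors for large view changes. The sketch ignores occlusion and points that fall behind the destination camera.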