SwinGS: Sliding Window Gaussian Splatting for Volumetric Video Streaming with Arbitrary Length
Bangya Liu, Suman Banerjee
arXiv - CS - Multimedia, 2024-09-12. DOI: arxiv-2409.07759 (https://doi.org/arxiv-2409.07759)

Recent advances in 3D Gaussian Splatting (3DGS) have garnered significant
attention in computer vision and computer graphics due to the method's high
rendering speed and remarkable quality. While extant research has endeavored to extend
the application of 3DGS from static to dynamic scenes, such efforts have been
consistently impeded by excessive model sizes, constraints on video duration,
and content deviation. These limitations significantly compromise the
streamability of dynamic 3D Gaussian models, thereby restricting their utility
in downstream applications, including volumetric video, autonomous vehicles, and
immersive technologies such as virtual, augmented, and mixed reality.

This paper introduces SwinGS, a novel framework for training, delivering, and
rendering volumetric video in a real-time streaming fashion. To address the
aforementioned challenges and enhance streamability, SwinGS integrates
spacetime Gaussians with Markov Chain Monte Carlo (MCMC) to adapt the model to
fit various 3D scenes across frames, while employing a sliding window that
captures Gaussian snapshots for each frame in an accumulative way. We implement
a prototype of SwinGS and demonstrate its streamability across various datasets
and scenes. Additionally, we develop an interactive WebGL viewer enabling
real-time volumetric video playback on most devices with modern browsers,
including smartphones and tablets. Experimental results show that SwinGS
reduces transmission costs by 83.6% compared to previous work with negligible
compromise in PSNR. Moreover, SwinGS easily scales to long video sequences
without compromising quality.
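
The sliding-window idea can be illustrated with a minimal sketch. The abstract does not specify the data layout or the streaming protocol, so everything below is an assumption made for illustration: each Gaussian is given a hypothetical start frame and lifespan, the sender ships only the Gaussians that begin at the current frame, and the client accumulates them while evicting those whose lifespan has ended. The names `Gaussian`, `SlidingWindowClient`, `per_frame_payloads`, and `simulate` are hypothetical and are not taken from SwinGS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gaussian:
    gid: int       # hypothetical identifier, for illustration only
    start: int     # first frame in which this Gaussian is alive (assumed)
    lifespan: int  # number of consecutive frames it stays alive (assumed)


def per_frame_payloads(gaussians, num_frames):
    """Group Gaussians by start frame: the assumed per-frame transmission unit."""
    payloads = [[] for _ in range(num_frames)]
    for g in gaussians:
        if 0 <= g.start < num_frames:
            payloads[g.start].append(g)
    return payloads


class SlidingWindowClient:
    """Accumulates incoming Gaussians and evicts those that have expired."""

    def __init__(self):
        self.active = {}

    def receive(self, frame, new_gaussians):
        # Add the Gaussians that start at this frame.
        for g in new_gaussians:
            self.active[g.gid] = g
        # Drop Gaussians whose lifespan ended before this frame.
        expired = [gid for gid, g in self.active.items()
                   if frame >= g.start + g.lifespan]
        for gid in expired:
            del self.active[gid]
        # The ids that would be rendered for this frame.
        return sorted(self.active)


def simulate(gaussians, num_frames):
    """Stream frames in order and record the active set after each one."""
    client = SlidingWindowClient()
    history = []
    for frame, payload in enumerate(per_frame_payloads(gaussians, num_frames)):
        history.append(client.receive(frame, payload))
    return history


if __name__ == "__main__":
    # Two new Gaussians per frame, each alive for a window of 3 frames.
    gaussians = [Gaussian(gid=i, start=i // 2, lifespan=3) for i in range(10)]
    for frame, active in enumerate(simulate(gaussians, num_frames=5)):
        print(frame, active)
```

In this toy setup the client never holds more than window size times per-frame payload Gaussians, regardless of how many frames are streamed, which is the property that would let such a scheme handle video of arbitrary length.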