Wan Li;Xiao Pan;Jiaxin Lin;Ping Lu;Daquan Feng;Wenzhe Shi
{"title":"FRPGS:快速,鲁棒,逼真的单目动态场景重建与可变形的三维高斯","authors":"Wan Li;Xiao Pan;Jiaxin Lin;Ping Lu;Daquan Feng;Wenzhe Shi","doi":"10.1109/TCSVT.2025.3557012","DOIUrl":null,"url":null,"abstract":"Dynamic reconstruction technology presents significant promise for applications in visual and interactive fields. Current techniques utilizing 3D Gaussian Splatting show favorable results and fast reconstruction speed. However, as scene expanding, using individual Gaussian structure 1) leads to instability in large-scale dynamic reconstruction, marked by abrupt deformation, and 2) the heuristic densification of individuals suffers significant redundancy. Tackling these issues, we propose a jointed Gaussian representation method named FRPGS, which learns the global information and the deformation using center Gaussians and generates the neural Gaussians around them for local detail. Specifically, FRPGS employs center Gaussians initialized from point clouds, which are learned with a deformation field for representing global relationships and dynamic motion over time. Then, for each center Gaussian, attribute networks generate neural Gaussians that move under the linked center Gaussian driving, thereby ensuring structural integrity during movement within this joint-based representation. Finally, to reduce Gaussian redundancy, a densification strategy is developed based on the average cumulative gradient of the associated neural Gaussians, imposing strict limits on the growing of center Gaussians without compromising accuracy. Additionally, we established a large-scale dynamic indoor dataset at the MuLong Laboratory of ZTE Corporation. Evaluations demonstrate that FRPGS significantly outperforms state-of-the-art methods in both training efficiency and reconstruction quality, achieving over a 50% (up to 74%) improvement in efficiency on an RTX 4090. FRPGS also supports the 4K resolution reconstruction of 60 frames simultaneously.","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"35 9","pages":"9119-9131"},"PeriodicalIF":11.1000,"publicationDate":"2025-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"FRPGS: Fast, Robust, and Photorealistic Monocular Dynamic Scene Reconstruction With Deformable 3D Gaussians\",\"authors\":\"Wan Li;Xiao Pan;Jiaxin Lin;Ping Lu;Daquan Feng;Wenzhe Shi\",\"doi\":\"10.1109/TCSVT.2025.3557012\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Dynamic reconstruction technology presents significant promise for applications in visual and interactive fields. Current techniques utilizing 3D Gaussian Splatting show favorable results and fast reconstruction speed. However, as scene expanding, using individual Gaussian structure 1) leads to instability in large-scale dynamic reconstruction, marked by abrupt deformation, and 2) the heuristic densification of individuals suffers significant redundancy. Tackling these issues, we propose a jointed Gaussian representation method named FRPGS, which learns the global information and the deformation using center Gaussians and generates the neural Gaussians around them for local detail. Specifically, FRPGS employs center Gaussians initialized from point clouds, which are learned with a deformation field for representing global relationships and dynamic motion over time. 
Then, for each center Gaussian, attribute networks generate neural Gaussians that move under the linked center Gaussian driving, thereby ensuring structural integrity during movement within this joint-based representation. Finally, to reduce Gaussian redundancy, a densification strategy is developed based on the average cumulative gradient of the associated neural Gaussians, imposing strict limits on the growing of center Gaussians without compromising accuracy. Additionally, we established a large-scale dynamic indoor dataset at the MuLong Laboratory of ZTE Corporation. Evaluations demonstrate that FRPGS significantly outperforms state-of-the-art methods in both training efficiency and reconstruction quality, achieving over a 50% (up to 74%) improvement in efficiency on an RTX 4090. FRPGS also supports the 4K resolution reconstruction of 60 frames simultaneously.\",\"PeriodicalId\":13082,\"journal\":{\"name\":\"IEEE Transactions on Circuits and Systems for Video Technology\",\"volume\":\"35 9\",\"pages\":\"9119-9131\"},\"PeriodicalIF\":11.1000,\"publicationDate\":\"2025-04-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Circuits and Systems for Video Technology\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10947553/\",\"RegionNum\":1,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Circuits and Systems for Video Technology","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10947553/","RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
FRPGS: Fast, Robust, and Photorealistic Monocular Dynamic Scene Reconstruction With Deformable 3D Gaussians
Dynamic reconstruction technology holds significant promise for applications in visual and interactive fields. Current techniques based on 3D Gaussian Splatting achieve favorable results and fast reconstruction speeds. However, as scenes expand, representing them with individual, independent Gaussians 1) leads to instability in large-scale dynamic reconstruction, marked by abrupt deformation, and 2) makes heuristic per-Gaussian densification highly redundant. To tackle these issues, we propose a joint-based Gaussian representation method named FRPGS, which learns global structure and deformation using center Gaussians and generates neural Gaussians around them for local detail. Specifically, FRPGS employs center Gaussians initialized from point clouds and learned together with a deformation field that represents global relationships and dynamic motion over time. Then, for each center Gaussian, attribute networks generate neural Gaussians that move under the driving of their linked center Gaussian, ensuring structural integrity during movement within this joint-based representation. Finally, to reduce Gaussian redundancy, a densification strategy is developed based on the average cumulative gradient of the associated neural Gaussians, imposing strict limits on the growth of center Gaussians without compromising accuracy. Additionally, we established a large-scale dynamic indoor dataset at the MuLong Laboratory of ZTE Corporation. Evaluations demonstrate that FRPGS significantly outperforms state-of-the-art methods in both training efficiency and reconstruction quality, achieving over a 50% (and up to 74%) improvement in efficiency on an RTX 4090. FRPGS also supports simultaneous 4K-resolution reconstruction of 60 frames.
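The abstract describes the mechanism only at a high level; below is a minimal Python sketch (not the authors' code) of the two ideas it names: neural Gaussians that ride rigidly on a deformed center Gaussian, and densification of centers gated by the average cumulative gradient of their associated neural Gaussians. All function names, array shapes, the toy sinusoidal deformation field, the number of neural Gaussians per center, and the threshold value are illustrative assumptions.

```python
# Hedged sketch of the joint-based representation described in the abstract.
# Everything here (names, shapes, the toy deformation, the threshold) is an
# illustrative assumption, not the paper's implementation.
import numpy as np

K = 8               # neural Gaussians generated per center (assumed)
GRAD_THRESH = 2e-4  # densification threshold (assumed value)

rng = np.random.default_rng(0)
centers = rng.normal(size=(100, 3))            # center positions (from a point cloud)
offsets = 0.05 * rng.normal(size=(100, K, 3))  # per-center offsets (learned in practice)

def deform(centers: np.ndarray, t: float) -> np.ndarray:
    """Stand-in for the learned deformation field: maps canonical center
    positions to their positions at time t. A real model would be an MLP
    conditioned on (position, t)."""
    return centers + 0.1 * np.sin(2 * np.pi * t) * np.array([1.0, 0.0, 0.0])

def neural_gaussians(centers: np.ndarray, offsets: np.ndarray, t: float) -> np.ndarray:
    """Neural Gaussians ride on their linked center: the deformation is
    applied once per center, and the local offsets follow it rigidly,
    which is what preserves local structure during motion."""
    moved = deform(centers, t)          # (N, 3)
    return moved[:, None, :] + offsets  # (N, K, 3)

def centers_to_densify(grad_accum: np.ndarray, grad_counts: np.ndarray) -> np.ndarray:
    """Gate densification of a center by the *average* cumulative gradient
    of its K neural Gaussians rather than per-Gaussian heuristics; this is
    the redundancy-limiting criterion the abstract describes."""
    avg = grad_accum.sum(axis=1) / np.maximum(grad_counts.sum(axis=1), 1)
    return np.nonzero(avg > GRAD_THRESH)[0]

# Toy usage: positions at t = 0.25, plus a densification decision computed
# from fabricated gradient statistics (in training these come from backprop).
pts = neural_gaussians(centers, offsets, t=0.25)  # (100, K, 3)
grad_accum = np.abs(rng.normal(scale=1e-4, size=(100, K)))
grad_counts = np.full((100, K), 50)
grow = centers_to_densify(grad_accum, grad_counts)
print(pts.shape, len(grow))
```

The point the sketch makes concrete: the deformation field is queried once per center rather than once per Gaussian, so the local offsets follow their center rigidly, and growth decisions are aggregated per center, which is consistent with how the abstract describes curbing redundancy relative to per-Gaussian heuristics.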
Journal Introduction:
The IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) is dedicated to covering all aspects of video technologies from a circuits and systems perspective. We encourage submissions of general, theoretical, and application-oriented papers related to image and video acquisition, representation, presentation, and display. Additionally, we welcome contributions in areas such as processing, filtering, and transforms; analysis and synthesis; learning and understanding; compression, transmission, communication, and networking; as well as storage, retrieval, indexing, and search. Furthermore, papers focusing on hardware and software design and implementation are highly valued. Join us in advancing the field of video technology through innovative research and insights.