Yichen Zhang, Zihan Wang, Jiali Han, Peilin Li, Jiaxun Zhang, Jianqiang Wang, Lei He, Keqiang Li
arXiv - CS - Computer Vision and Pattern Recognition, arXiv:2409.11307, published 2024-09-17.
GS-Net: Generalizable Plug-and-Play 3D Gaussian Splatting Module
3D Gaussian Splatting (3DGS) integrates the strengths of primitive-based
representations and volumetric rendering techniques, enabling real-time,
high-quality rendering. However, 3DGS models typically overfit to single-scene
training and are highly sensitive to the initialization of Gaussian ellipsoids,
heuristically derived from Structure from Motion (SfM) point clouds, which
limits both generalization and practicality. To address these limitations, we
propose GS-Net, a generalizable, plug-and-play 3DGS module that densifies
Gaussian ellipsoids from sparse SfM point clouds, enhancing geometric structure
representation. To the best of our knowledge, GS-Net is the first plug-and-play
3DGS module with cross-scene generalization capabilities. Additionally, we
introduce the CARLA-NVS dataset, which incorporates additional camera
viewpoints to thoroughly evaluate reconstruction and rendering quality.
Extensive experiments demonstrate that applying GS-Net to 3DGS yields a PSNR
improvement of 2.08 dB for conventional viewpoints and 1.86 dB for novel
viewpoints, confirming the method's effectiveness and robustness.
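The reported gains are stated in PSNR (peak signal-to-noise ratio). For reference, the sketch below shows the standard PSNR computation between a rendered image and a ground-truth image; it is a generic implementation of the metric, not code from the paper.

```python
import numpy as np

def psnr(reference: np.ndarray, rendered: np.ndarray, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two images with values in [0, max_val].

    PSNR = 10 * log10(max_val^2 / MSE); higher is better, and a gain of
    ~2 dB (as reported for GS-Net) corresponds to a noticeable drop in MSE.
    """
    mse = np.mean((reference.astype(np.float64) - rendered.astype(np.float64)) ** 2)
    if mse == 0.0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return 10.0 * np.log10(max_val**2 / mse)

# Example: a uniform error of 0.1 on images in [0, 1] gives MSE = 0.01,
# hence PSNR = 10 * log10(1 / 0.01) = 20 dB.
print(psnr(np.zeros((4, 4)), np.full((4, 4), 0.1)))
```

Note that an improvement of 2.08 dB means the mean squared error of the rendered views dropped by a factor of about 10^(2.08/10) ≈ 1.6 relative to the 3DGS baseline.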