Xinyu Hu , Haijian Shao , Xing Deng , Yingtao Jiang , Fei Wang
Title: Integrating frequency limitation and feature refinement for robust 3D Gaussian segmentation
Journal: Computers & Electrical Engineering, volume 123, Article 110239
Publication date: 2025-03-15
DOI: 10.1016/j.compeleceng.2025.110239
URL: https://www.sciencedirect.com/science/article/pii/S004579062500182X
Abstract
3D scene segmentation is a core challenge in computer vision: the goal is to extract targets from complex 3D environments precisely and efficiently. Despite recent progress, existing methods struggle with high-frequency information in complex, multi-scale, multi-view scenarios. They have difficulty tracking high-frequency content, consume substantial resources, segment inaccurately, and require complicated user interaction, all of which limit practical use. To address these problems, this paper proposes IFFSG, a robust 3D Gaussian segmentation method that integrates frequency limitation and feature refinement. A frequency-adaptive sampling constraint strategy is used to construct a 3D frequency modulation filter that dynamically adjusts the frequency of the 3D Gaussian elements according to the multi-view input constraints, so that the highest frequency of the reconstructed 3D Gaussian scene is strictly confined within the sampling frequency range of the input views. Matching the scene's actual frequency to its sampling frequency avoids artifacts caused by scale changes and significantly improves the geometric structure and texture detail of the reconstruction, which in turn provides accurate and stable base data for aligning 2D and 3D features in subsequent segmentation. The paper also introduces a robust multi-view feature alignment strategy that uses the segmentation capability of the Segment Anything Model (SAM) to guide the training of 3D features. This strategy pulls similar features close together in 3D space, greatly enhancing their compactness and consistency. During segmentation, the model relies on these optimized features to identify object boundaries and structures more sensitively and accurately, providing fine-grained, reliable data for subsequent scene understanding and analysis.
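The abstract does not give the form of the 3D frequency modulation filter, so the following is only a minimal sketch of the general idea of confining each Gaussian's frequency content to what the input views can sample. It is similar in spirit to the low-pass 3D smoothing filters used in anti-aliased Gaussian splatting and is not the authors' implementation. All names (`max_sampling_rate`, `band_limit`), the pinhole camera model, and the hyperparameter `s` are assumptions made for illustration.

```python
import numpy as np

def max_sampling_rate(points, cam_rotations, cam_centers, focals, width, height, near=0.2):
    """Highest sampling rate (pixels per world unit) at each point over the views that see it.

    Assumes pinhole cameras with world-to-camera rotation R, center c, and focal length f in pixels.
    """
    rates = np.zeros(len(points))
    for R, c, f in zip(cam_rotations, cam_centers, focals):
        p_cam = (points - c) @ R.T                 # world -> camera coordinates
        z = p_cam[:, 2]
        safe_z = np.where(z > near, z, 1.0)        # placeholder depth avoids division by zero
        u = f * p_cam[:, 0] / safe_z + width / 2
        v = f * p_cam[:, 1] / safe_z + height / 2
        visible = (z > near) & (u >= 0) & (u < width) & (v >= 0) & (v < height)
        # One pixel spans z / f world units at depth z, so the sampling rate is f / z.
        rates = np.maximum(rates, np.where(visible, f / safe_z, 0.0))
    return rates

def band_limit(covariances, opacities, rates, s=0.2):
    """Low-pass each Gaussian with an isotropic Gaussian of variance s / rate**2 (assumed form)."""
    seen = rates > 0
    var = np.where(seen, s / np.where(seen, rates, 1.0) ** 2, 0.0)
    # Convolving two Gaussians adds their covariances.
    filtered = covariances + var[:, None, None] * np.eye(3)
    # Rescale opacity so the filtered Gaussian keeps the same total mass.
    scale = np.sqrt(np.linalg.det(covariances) / np.linalg.det(filtered))
    return filtered, opacities * scale
```

Under these assumptions, Gaussians that are seen only from far away (low sampling rate) receive a wider filter, while Gaussians not seen by any view are left unchanged. Taking the maximum rate over views means detail is kept whenever at least one input view can resolve it.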
To validate the proposed method, experiments are conducted on several challenging datasets. The results show that the method achieves near-real-time segmentation with minimal user input and significantly outperforms existing techniques. On the NVOS dataset, its accuracy reaches state-of-the-art (SOTA) level with an inference time of 0.2 s. For image quality, it obtains an SSIM of 0.801, a PSNR of 20.04, and an LPIPS of 0.173.
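For readers unfamiliar with the image quality metrics, PSNR is the simplest of the three to compute. The sketch below shows the standard definition, assuming images with intensities in [0, 1] and a result in decibels; the abstract states neither the units nor the evaluation protocol, and the function name `psnr` is illustrative.

```python
import numpy as np

def psnr(rendered, reference, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two images of the same shape."""
    rendered = np.asarray(rendered, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    mse = np.mean((rendered - reference) ** 2)   # mean squared error over all pixels and channels
    if mse == 0:
        return float("inf")                      # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))
```

Under these assumptions, a PSNR of about 20 dB corresponds to a root-mean-square error of about 0.1 of the intensity range. SSIM and LPIPS need structural and learned perceptual comparisons respectively and are usually taken from an existing library.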
About the journal:
The impact of computers has nowhere been more revolutionary than in electrical engineering. The design, analysis, and operation of electrical and electronic systems are now dominated by computers, a transformation that has been motivated by the natural ease of interface between computers and electrical systems, and the promise of spectacular improvements in speed and efficiency.
Published since 1973, Computers & Electrical Engineering provides rapid publication of topical research into the integration of computer technology and computational techniques with electrical and electronic systems. The journal publishes papers featuring novel implementations of computers and computational techniques in areas like signal and image processing, high-performance computing, parallel processing, and communications. Special attention will be paid to papers describing innovative architectures, algorithms, and software tools.