Zhentong Xu, Long Zeng, Junli Zhao, Baodong Wang, Zhenkuan Pan, Yong-Jin Liu
Sketch123: Multi-spectral channel cross attention for sketch-based 3D generation via diffusion models
Computer-Aided Design, Volume 185, Article 103896. Published 2025-05-24. Journal Article. JCR: Q2 (Computer Science, Software Engineering). Impact factor: 3.0.
DOI: 10.1016/j.cad.2025.103896
https://www.sciencedirect.com/science/article/pii/S0010448525000582
Citations: 0
Abstract
With the development of generative techniques, sketch-driven 3D reconstruction has gained substantial attention as an efficient 3D modeling technique. However, challenges remain in extracting detailed features from sketches, representing local geometric structures, and ensuring the fidelity and stability of the generated results. To address these issues, we propose a multi-spectral channel cross-attention model for sketch reconstruction, which leverages the complementary strengths of the frequency and spatial domains to capture multi-level sketch features. Our method employs a two-stage diffusion generation mechanism. In addition, a Sparse Feature Enhancement (SFE) module replaces traditional down-sampling, reducing feature loss while improving detail preservation and noise suppression through a Laplace voxel smoothing operator. The Wasserstein distance, integrated as part of the loss function, stabilizes the generative process using optimal transport theory to support high-quality 3D model reconstruction. Extensive experiments verify that our model surpasses state-of-the-art methods in generation accuracy, local control, and generalization ability, providing an efficient, precise solution for transforming sketches into 3D models.
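The abstract does not give the form of the Wasserstein term used in the loss. As a rough illustration of the optimal-transport idea only, the sketch below computes the Wasserstein-1 distance between two equal-size one-dimensional samples, where the optimal plan simply matches sorted values in order. The function name `wasserstein_1d` and the sample values are hypothetical and are not taken from the paper; the actual model presumably compares higher-dimensional 3D representations.

```python
from typing import Sequence


def wasserstein_1d(a: Sequence[float], b: Sequence[float]) -> float:
    """Wasserstein-1 distance between two equal-size 1D empirical samples.

    Illustrative assumption: this is a toy 1D case, not the paper's loss.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    # In 1D, the optimal transport plan pairs the i-th smallest value of
    # one sample with the i-th smallest value of the other.
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


# Hypothetical values standing in for generated and ground-truth samples.
generated = [0.0, 1.0, 3.0]
target = [5.0, 6.0, 8.0]
print(wasserstein_1d(generated, target))  # 5.0
```

Unlike a pointwise loss, this quantity depends only on how far mass has to move between the two distributions, which is one reason such terms are often described as stabilizing generative training.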
About the journal:
Computer-Aided Design is a leading international journal that provides academia and industry with key papers on research and developments in the application of computers to design.
Computer-Aided Design invites papers reporting new research, as well as novel or particularly significant applications, within a wide range of topics, spanning all stages of the design process from concept creation to manufacture and beyond.