Yuxiao Zhang, Jin Wang, Yang Zhou, Senyun Jia, Zhi Zheng, Dongliang Zhang, Guodong Lu
Expert Systems with Applications, Volume 298, Article 129748. Published 2025-09-18. DOI: 10.1016/j.eswa.2025.129748
Impact factor: 7.5. JCR: Q1 (Computer Science, Artificial Intelligence).
https://www.sciencedirect.com/science/article/pii/S0957417425033639
3D modeling from a single sketch with multifaceted semantic understanding
This paper studies the problem of 3D shape generation from a single sketch. Prior works guide generation with visual features extracted directly from the sketch. However, sketches are abstract and offer only sparse visual cues, and the guiding features inherit both limitations, which leads to semantic ambiguity and geometric incompleteness in the generated shapes and compromises accuracy. To address this, we propose MSU-3D, a diffusion-based framework for sketch-to-3D generation that leverages Multifaceted Semantic Understanding to explicitly analyze the construction information of a sketch from multiple facets before providing fine-grained guidance for 3D shape generation. Specifically, we decompose sketches along three interpretative facets (semantics, depth, and normal) and reason over the three resulting representations to capture 3D features from distinct perspectives: local components, basic 3D geometry, and 3D surface details. Going one step further, we propose a multifaceted perception module that aggregates the multifaceted feature representations and uses local component features as a two-pronged guiding representation to jointly guide the perception of basic shapes and surface details. To ensure fine-grained control, a hierarchical perception strategy adaptively injects perception features of varying granularity at different stages of 3D generation. Extensive experiments and comparisons with state-of-the-art methods on various complex posture datasets validate the effectiveness of our framework in mitigating semantic ambiguity and geometric incompleteness in 3D generation.
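The abstract gives no implementation details, so the following is only a toy sketch of the two ideas it names: a component feature guiding both a geometry branch and a surface-detail branch, and guidance whose granularity changes across generation stages. Everything here is an assumption made for illustration: the function names, the use of dot-product attention, the linear coarse-to-fine schedule (the paper describes its strategy as adaptive), and the convention that step 0 is the first generation step. It is not the MSU-3D implementation.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def attend(query: Vector, keys: List[Vector]) -> Vector:
    """Average of `keys`, weighted by softmax similarity to `query`."""
    weights = softmax([dot(query, k) for k in keys])
    dim = len(keys[0])
    return [sum(w * k[i] for w, k in zip(weights, keys)) for i in range(dim)]


def multifaceted_perception(
    component: Vector, depth_feats: List[Vector], normal_feats: List[Vector]
) -> Tuple[Vector, Vector]:
    """Hypothetical stand-in for two-pronged guidance: one component
    feature queries both the depth and the normal features."""
    shape = attend(component, depth_feats)    # basic-geometry branch
    detail = attend(component, normal_feats)  # surface-detail branch
    return shape, detail


def guidance_for_step(shape: Vector, detail: Vector, step: int, num_steps: int) -> Vector:
    """Blend coarse (shape) and fine (detail) guidance.
    Assumed linear schedule: all shape at step 0, all detail at the last step."""
    alpha = 1.0 if num_steps <= 1 else step / (num_steps - 1)
    return [(1 - alpha) * s + alpha * d for s, d in zip(shape, detail)]


if __name__ == "__main__":
    component = [1.0, 0.0]
    depth_feats = [[1.0, 0.0], [0.0, 1.0]]
    normal_feats = [[0.5, 0.5], [0.0, 2.0]]
    shape, detail = multifaceted_perception(component, depth_feats, normal_feats)
    for step in range(4):
        print(step, guidance_for_step(shape, detail, step, 4))
```

In this sketch the schedule is fixed in advance; a learned or input-dependent weighting would be closer to the adaptive behaviour the abstract describes, but the abstract does not say how that weighting is computed.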
About the journal:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.