{"title":"用于自然语言引导的CAD程序合成的多模态数据集","authors":"Chaofan Lv, Jinsong Bao","doi":"10.1016/j.cad.2025.103926","DOIUrl":null,"url":null,"abstract":"<div><div>While large language models (LLMs) have demonstrated remarkable success in general-purpose code generation, their application in computer-aided design (CAD) program synthesis remains constrained by the scarcity of high-quality natural language-annotated datasets. To address this challenge, we propose CADInstruct, a novel approach aimed at constructing a multimodal CAD instruction dataset to enhance the CAD program synthesis capabilities of LLMs. First, we introduce a parametric modification module for modeling sequences, which extracts geometric constraints and critical dimensions from sketches, transforming CAD construction sequences into design-intent-oriented instructions. Second, we incorporate a shape semantic recognition module that leverages model names and visually enriched rendered views to generate precise shape descriptions using multimodal large models, enabling accurate semantic representation of complex geometries. Lastly, a modeling instruction semantic alignment module utilizes the extracted shape descriptions and modeling instructions to generate hierarchical natural language descriptions, encompassing geometric forms and detailed modeling steps, ensuring consistency between textual descriptions and CAD instructions. We fine-tuned the Qwen2.5-Coder-7B model using the CADInstruct dataset to evaluate the effectiveness of this framework. Experimental results demonstrated its capability to significantly enhance CAD program synthesis. The code and dataset will be made publicly available at <span><span>https://github.com/dxlcf/CADInstruct</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50632,"journal":{"name":"Computer-Aided Design","volume":"188 ","pages":"Article 103926"},"PeriodicalIF":3.0000,"publicationDate":"2025-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"CADInstruct: A multimodal dataset for natural language-guided CAD program synthesis\",\"authors\":\"Chaofan Lv, Jinsong Bao\",\"doi\":\"10.1016/j.cad.2025.103926\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>While large language models (LLMs) have demonstrated remarkable success in general-purpose code generation, their application in computer-aided design (CAD) program synthesis remains constrained by the scarcity of high-quality natural language-annotated datasets. To address this challenge, we propose CADInstruct, a novel approach aimed at constructing a multimodal CAD instruction dataset to enhance the CAD program synthesis capabilities of LLMs. First, we introduce a parametric modification module for modeling sequences, which extracts geometric constraints and critical dimensions from sketches, transforming CAD construction sequences into design-intent-oriented instructions. Second, we incorporate a shape semantic recognition module that leverages model names and visually enriched rendered views to generate precise shape descriptions using multimodal large models, enabling accurate semantic representation of complex geometries. Lastly, a modeling instruction semantic alignment module utilizes the extracted shape descriptions and modeling instructions to generate hierarchical natural language descriptions, encompassing geometric forms and detailed modeling steps, ensuring consistency between textual descriptions and CAD instructions. We fine-tuned the Qwen2.5-Coder-7B model using the CADInstruct dataset to evaluate the effectiveness of this framework. Experimental results demonstrated its capability to significantly enhance CAD program synthesis. The code and dataset will be made publicly available at <span><span>https://github.com/dxlcf/CADInstruct</span><svg><path></path></svg></span>.</div></div>\",\"PeriodicalId\":50632,\"journal\":{\"name\":\"Computer-Aided Design\",\"volume\":\"188 \",\"pages\":\"Article 103926\"},\"PeriodicalIF\":3.0000,\"publicationDate\":\"2025-07-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computer-Aided Design\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0010448525000879\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, SOFTWARE ENGINEERING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer-Aided Design","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0010448525000879","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
CADInstruct: A multimodal dataset for natural language-guided CAD program synthesis
While large language models (LLMs) have demonstrated remarkable success in general-purpose code generation, their application in computer-aided design (CAD) program synthesis remains constrained by the scarcity of high-quality natural language-annotated datasets. To address this challenge, we propose CADInstruct, a novel approach aimed at constructing a multimodal CAD instruction dataset to enhance the CAD program synthesis capabilities of LLMs. First, we introduce a parametric modification module for modeling sequences, which extracts geometric constraints and critical dimensions from sketches, transforming CAD construction sequences into design-intent-oriented instructions. Second, we incorporate a shape semantic recognition module that leverages model names and visually enriched rendered views to generate precise shape descriptions using multimodal large models, enabling accurate semantic representation of complex geometries. Lastly, a modeling instruction semantic alignment module utilizes the extracted shape descriptions and modeling instructions to generate hierarchical natural language descriptions, encompassing geometric forms and detailed modeling steps, ensuring consistency between textual descriptions and CAD instructions. We fine-tuned the Qwen2.5-Coder-7B model using the CADInstruct dataset to evaluate the effectiveness of this framework. Experimental results demonstrated its capability to significantly enhance CAD program synthesis. The code and dataset will be made publicly available at https://github.com/dxlcf/CADInstruct.
期刊介绍:
Computer-Aided Design is a leading international journal that provides academia and industry with key papers on research and developments in the application of computers to design.
Computer-Aided Design invites papers reporting new research, as well as novel or particularly significant applications, within a wide range of topics, spanning all stages of design process from concept creation to manufacture and beyond.