OrthoCAD-322K: A cross-modal approach for retrieving 3D CAD models from orthographic views using a graph-based framework on a developed large-scale dataset

IF 2.8 · Zone 4 (Computer Science) · JCR Q2 (COMPUTER SCIENCE, SOFTWARE ENGINEERING)

Computers & Graphics-Uk Pub Date : 2025-08-10 DOI:10.1016/j.cag.2025.104357

Swapnil Nagnath Mahajan, Karthik Krishna M., Ramanathan Muthuganapathy

Volume 132, Article 104357 · Journal Article · Full text: https://www.sciencedirect.com/science/article/pii/S0097849325001980

Citations: 0

Abstract


OrthoCAD-322K: A cross-modal approach for retrieving 3D CAD models from orthographic views using a graph-based framework on a developed large-scale dataset

Despite the widespread adoption of 3D CAD systems, 2D orthographic drawings remain integral to engineering workflows. However, millions of legacy drawings lack corresponding 3D models, hindering their integration into modern simulation, manufacturing, and digital twin systems. Existing methods for 2D to 3D CAD retrieval often fall short of meeting the structural precision required for engineering-grade drawings. We propose a cross-modal retrieval framework that aligns vector-based 2D DXF (Drawing Exchange Format) views with 3D CAD models using contrastive learning. Our architecture integrates a Graphormer-based encoder for 2D input and a PointNet-based encoder for 3D CAD models. We introduce a novel proximity-based spatial encoding to enhance structural precision and robustness across varying view configurations. Using the filtered subset (~283K) of the newly developed large-scale dataset OrthoCAD-322K, extensive ablation and comparison studies demonstrate the robustness and generalization of the model in different input conditions and architectures. Source code is available at https://github.com/Swapnil-Mahajan-MS/OrthoCAD-322K.
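The abstract says the 2D and 3D encoders are aligned with contrastive learning but does not state the loss. As a rough illustration only, the sketch below assumes a CLIP-style symmetric InfoNCE objective over matched (drawing, model) pairs, followed by cosine-similarity retrieval. The function names, the temperature of 0.07, and the toy vectors (standing in for Graphormer and PointNet encoder outputs) are all hypothetical and are not taken from the paper or its repository.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def similarity_matrix(drawing_embs, model_embs):
    """Cosine similarity between every 2D-drawing and 3D-model embedding."""
    d = [l2_normalize(v) for v in drawing_embs]
    m = [l2_normalize(v) for v in model_embs]
    return [[sum(a * b for a, b in zip(dv, mv)) for mv in m] for dv in d]

def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        peak = max(row)  # subtract the max for numerical stability
        log_sum = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_sum - row[i]
    return total / len(logits)

def symmetric_info_nce(drawing_embs, model_embs, temperature=0.07):
    """Assumed loss: average of drawing-to-model and model-to-drawing InfoNCE."""
    sim = similarity_matrix(drawing_embs, model_embs)
    logits = [[s / temperature for s in row] for row in sim]
    logits_t = [list(col) for col in zip(*logits)]  # transpose for the reverse direction
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))

def retrieve_top_k(query_emb, model_embs, k=1):
    """Return indices of the k models most similar to one drawing embedding."""
    scores = similarity_matrix([query_emb], model_embs)[0]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]

# Toy embeddings standing in for encoder outputs (made-up values).
drawings = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2], [0.1, 0.0, 1.0]]
models = [[0.9, 0.0, 0.1], [0.1, 0.8, 0.0], [0.0, 0.2, 0.9]]

aligned_loss = symmetric_info_nce(drawings, models)         # pairs correctly matched
shuffled_loss = symmetric_info_nce(drawings, models[::-1])  # two pairs mismatched
best_match = retrieve_top_k(drawings[1], models, k=1)
```

Under this assumed objective, correctly paired embeddings give a lower loss than mismatched ones. At query time, retrieval reduces to ranking the stored 3D embeddings by cosine similarity to the drawing embedding.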


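Source journal: Computers & Graphics-Uk (Engineering & Technology: Computer Science, Software Engineering)

CiteScore: 5.30

Self-citation rate: 12.00%

Articles published: 173

Review time: 38 days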

Journal description: Computers & Graphics is dedicated to disseminating information on research and applications of computer graphics (CG) techniques. The journal encourages articles on: 1. Research and applications of interactive computer graphics. We are particularly interested in novel interaction techniques and applications of CG to problem domains. 2. State-of-the-art papers on late-breaking, cutting-edge research on CG. 3. Information on innovative uses of graphics principles and technologies. 4. Tutorial papers on both teaching CG principles and innovative uses of CG in education.