Qingfeng Xu, Chao Zhang, Dongxu Ma, Jiacheng Li, Jiewu Leng, Guanghui Zhou
Journal: Robotics and Computer-Integrated Manufacturing, Volume 98, Article 103141
Published: 2025-09-17
DOI: 10.1016/j.rcim.2025.103141
URL: https://www.sciencedirect.com/science/article/pii/S0736584525001954
Impact factor: 11.4; JCR: Q1 (Computer Science, Interdisciplinary Applications)
Automated multimodal process knowledge graph construction for intelligent process planning with Cross-Modal Transformers
Intelligent process planning is pivotal in modern manufacturing systems, enabling efficient, precise, and flexible production by optimizing resource allocation, enhancing machining accuracy, and shortening production cycles. Knowledge graphs integrate multi-source heterogeneous data to support this process, yet traditional single-modal approaches hinder the exploration of complex relationships in multimodal data, falling short of the needs for complex part planning. This paper examines machining features, the foundational units of process planning, and introduces an automatic construction method for a Multimodal Process Knowledge Graph (MPKG) tailored to intelligent process planning, powered by Cross-Modal Transformers. We developed the MF36 dataset, encompassing 36 machining features with 3D models, engineering views, and descriptive texts. A cross-modal framework integrating LERT-CRF and PA-ViT models automates the extraction and fusion of multimodal process knowledge, with PA-ViT’s pooling attention mechanism markedly boosting machining feature recognition accuracy. Experiments demonstrate superior performance over baselines, achieving F1 scores of 0.895 in entity extraction and 0.877 in image recognition. A case study validates the method’s reliability for precise process recommendations, providing fresh insights into advancing intelligent process planning.
About the journal:
The journal, Robotics and Computer-Integrated Manufacturing, focuses on sharing research applications that contribute to the development of new or enhanced robotics, manufacturing technologies, and innovative manufacturing strategies that are relevant to industry. Papers that combine theory and experimental validation are preferred, while review papers on current robotics and manufacturing issues are also considered. However, papers on traditional machining processes, modeling and simulation, supply chain management, and resource optimization are generally not within the scope of the journal, as there are more appropriate journals for these topics. Similarly, papers that are overly theoretical or mathematical will be directed to other suitable journals. The journal welcomes original papers in areas such as industrial robotics, human-robot collaboration in manufacturing, cloud-based manufacturing, cyber-physical production systems, big data analytics in manufacturing, smart mechatronics, machine learning, adaptive and sustainable manufacturing, and other fields involving unique manufacturing technologies.