Yikang Shi, Xin Zhan, Yaqian Li, Zhongqiang Wu, Wenming Zhang, Haibin Li
Expert Systems with Applications, Volume 298, Article 129745. Journal Article. Published 2025-09-18. DOI: 10.1016/j.eswa.2025.129745
Impact factor: 7.5. JCR: Q1 (Computer Science, Artificial Intelligence). Open access: no. Citations: 0.
https://www.sciencedirect.com/science/article/pii/S0957417425033603
Cycle-CFM: An unsupervised framework for robust multimodal anomaly detection in industrial settings
Industrial multimodal anomaly detection faces three pivotal challenges: cross-modal feature drift, noise sensitivity, and modality imbalance. To address these issues, we propose Cycle-Consistent Cross-Modal Feature Mapping (Cycle-CFM), an unsupervised framework that integrates cycle-consistent cross-modal mapping with channel-attention-guided adaptive loss weighting. Cycle-CFM establishes bidirectional feature alignment between RGB and 3D modalities via reversible cycle mappings, yielding consistent representations that are robust to vibration and depth noise. To further mitigate dynamic interference such as illumination variations, we introduce a joint optimization strategy that combines cross-consistency and cycle-consistency losses. Experimental results on our self-constructed SteelDefect-3D-AD dataset demonstrate that Cycle-CFM achieves an AUPRO@1% of 0.371, outperforming state-of-the-art methods by 17–45%. It also attains a pixel-level AUROC (P-AUROC) of 0.991 and an image-level AUROC (I-AUROC) of 0.998. On the public MVTec 3D-AD benchmark, Cycle-CFM reaches a mean P-AUROC of 0.960 and improves accuracy by 37.5% for elongated anomalies. With a runtime of 11.03 FPS and 469.52 MB of parameters, the model demonstrates both effectiveness and deployability for real-time industrial inspection.
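The joint objective described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the mapping architecture, the distance measure, or the weighting scheme, so the linear maps `W_2to3` and `W_3to2`, the cosine distance, and the fixed scalar `cycle_weight` are all assumptions made for illustration. In particular, the channel-attention-guided adaptive weighting is replaced here by a constant.

```python
import math


def matvec(W, x):
    """Apply a linear map W (list of rows) to a feature vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def cosine_distance(a, b):
    """1 - cosine similarity; the choice of distance is an assumption."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b + 1e-12)


def joint_loss(f_rgb, f_3d, W_2to3, W_3to2, cycle_weight=1.0):
    """Cross-consistency plus weighted cycle-consistency for one feature pair.

    W_2to3 and W_3to2 are hypothetical linear stand-ins for the learned
    RGB->3D and 3D->RGB mappings; cycle_weight is a fixed placeholder for
    the adaptive weighting mentioned in the abstract.
    """
    # Cross-consistency: each modality should predict the other one.
    pred_3d = matvec(W_2to3, f_rgb)
    pred_rgb = matvec(W_3to2, f_3d)
    cross = cosine_distance(pred_3d, f_3d) + cosine_distance(pred_rgb, f_rgb)

    # Cycle-consistency: mapping there and back should recover the input.
    cycled_rgb = matvec(W_3to2, pred_3d)
    cycled_3d = matvec(W_2to3, pred_rgb)
    cycle = cosine_distance(cycled_rgb, f_rgb) + cosine_distance(cycled_3d, f_3d)

    return cross + cycle_weight * cycle, cross, cycle


# Usage: a pair of maps that swap the two channels and invert each other.
swap = [[0.0, 1.0], [1.0, 0.0]]
total, cross, cycle = joint_loss([1.0, 2.0], [2.0, 1.0], swap, swap)
```

With identity maps and orthogonal features, the cycle term is zero while the cross term is at its largest in this toy setting, which suggests why the two losses are combined rather than using cycle consistency alone.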
About the journal:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.