Multi-modal background-aware for defect semantic segmentation with limited data
Authors: Dexing Shan, Yunzhou Zhang, Shitong Liu
Journal: Journal of Intelligent Manufacturing (Journal Article; impact factor 5.9; JCR Q1, Computer Science, Artificial Intelligence)
Publication date: 2024-05-18
DOI: 10.1007/s10845-024-02373-8 (https://doi.org/10.1007/s10845-024-02373-8)
Citations: 0
Abstract
Visual defect detection is widely used in intelligent manufacturing for automated product quality inspection. Two main challenges remain in industrial applications: the scarcity of defect samples and the weak texture variation of industrial defects. These problems limit the applicability of industrial defect segmentation based on RGB images alone. To address them, we propose a multi-modal background-aware network (MMBA-Net) for few-shot defect segmentation on 2D+3D data with limited training samples, which can segment texture and structural defects in both seen and unseen domains (objects). To combine the perception capabilities of different imaging modalities, MMBA-Net uses the point cloud to provide spatial information for the RGB images. Furthermore, we observe that background regions are perceptually consistent within an industrial image, which can be leveraged to discriminate between foreground and background regions. To implement this idea, we model correlation learning between multi-modal query samples and multi-modal normal (defect-free) samples as an optimal transport problem, establishing robust background correlations between query and normal samples across modalities. Experiments on real-world industrial product and food datasets demonstrate that the proposed method can perform effective base learning and meta-learning with a small number of defective samples (approximately 15–25 defective training samples) and segment defects effectively in both seen and unseen domains.
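The abstract does not say how the optimal transport problem is set up or solved. As a rough illustration only, the sketch below matches query features to normal (defect-free) features with entropy-regularised optimal transport (Sinkhorn iterations). The cosine-distance cost, the uniform marginals, the regularisation strength `eps`, and the function names `cosine_cost`, `sinkhorn_plan`, and `background_scores` are all assumptions made for this example and are not taken from MMBA-Net.

```python
import math


def cosine_cost(query, normal):
    """Cost matrix C[i][j] = 1 - cosine similarity (assumed cost, not from the paper)."""
    def unit(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    q = [unit(v) for v in query]
    n = [unit(v) for v in normal]
    return [[1.0 - sum(a * b for a, b in zip(qi, nj)) for nj in n] for qi in q]


def sinkhorn_plan(cost, eps=0.1, n_iters=200):
    """Entropy-regularised transport plan with uniform marginals (assumed)."""
    rows, cols = len(cost), len(cost[0])
    a = [1.0 / rows] * rows   # mass carried by each query feature
    b = [1.0 / cols] * cols   # mass carried by each normal feature
    K = [[math.exp(-c / eps) for c in row] for row in cost]  # Gibbs kernel
    u = [1.0] * rows
    v = [1.0] * cols
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(rows)) for j in range(cols)]
    return [[u[i] * K[i][j] * v[j] for j in range(cols)] for i in range(rows)]


def background_scores(query, normal, eps=0.1):
    """Transport-weighted similarity of each query feature to the normal features."""
    cost = cosine_cost(query, normal)
    plan = sinkhorn_plan(cost, eps)
    rows = len(cost)
    # Each row of the plan sums to about 1/rows, so multiplying by rows
    # turns the weighted sum into a weighted average similarity.
    return [
        rows * sum(p * (1.0 - c) for p, c in zip(plan[i], cost[i]))
        for i in range(rows)
    ]


# Toy features: the first two query vectors resemble the normal samples,
# the third does not and should receive the lowest background score.
query_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
normal_feats = [[1.0, 0.0], [1.0, 0.05]]
print(background_scores(query_feats, normal_feats))
```

In this toy setup, a query feature with a low score matches the defect-free features poorly, so it is a candidate foreground (defect) location. A real multi-modal pipeline would operate on learned RGB and point-cloud feature maps, not on hand-written vectors.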