Yunfeng Ma;Min Liu;Shuai Jiang;Xueping Wang;Yuan Bian;Yaonan Wang
DOI: 10.1109/TASE.2025.3554987
Journal: IEEE Transactions on Automation Science and Engineering, vol. 22, pp. 13777-13787
Publication date: 2025-03-26
Publication type: Journal Article
Impact factor: 6.4 (JCR Q1, Automation & Control Systems)
Region category: Computer Science (region 2)
Open access: no
Citation count: 0
Platform: Semanticscholar
URL: https://ieeexplore.ieee.org/document/10942431/
Multi-Context Aggregation Network With Foreground Correction for Automated Few-Shot Defect Segmentation
State-of-the-art defect segmentation methods rely on sufficient training data and struggle to generalize to unseen categories. Few-Shot Semantic Segmentation (FSS) was introduced to address these issues. However, existing FSS models still face two challenges in industrial settings: 1) defects usually present as weak features, resulting in incomplete segmentation; 2) severe background interference often leads to incorrect segmentation. To tackle these problems, we propose the Multi-Context Aggregation Network (MCANet). Specifically, we design a Cross-Layer Multi-Level Feature Aggregation Module (CMAM). CMAM aggregates multi-level defect features that are discretely distributed across different layers and guides the query image to perceive defects at the pixel level, which avoids incomplete segmentation caused by weak features. Additionally, we develop a Foreground Correction Module (FCM), which comprises a dedicated background predictor (BP) and a foreground corrector (FC). BP places more emphasis on learning features from backgrounds than from defects. FC ensembles features efficiently and further suppresses background regions that CMAM misidentifies as defects. Together, BP and FC prevent incorrect segmentation caused by background interference. Extensive experiments demonstrate the effectiveness of our method. We achieve state-of-the-art results on both FSSD-12, a public benchmark FSS dataset for strip steel, and FSS-AEB, an FSS dataset for aero-engine blades. Specifically, with 1 and 5 support images, respectively, we achieve 64.6% and 65.6% mIoU on FSSD-12 and 55.0% and 57.8% mIoU on FSS-AEB.
Note to Practitioners: Surface defect segmentation has long been an active topic in industry. However, existing methods rely on sufficient training data and struggle to generalize to unseen categories, which significantly hinders the automation of defect segmentation. To address this problem, we propose MCANet for automated few-shot defect segmentation. It segments surface defects effectively with limited data, even for unseen categories. Furthermore, MCANet achieves state-of-the-art results on two datasets from real-world industrial scenarios and also delivers significant improvements over widely discussed large vision models. Finally, we integrate MCANet into an automated surface defect inspection platform consisting of an imaging system and a high-performance computing server for real-world performance validation.
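The results above are reported as mIoU (mean intersection over union). As a rough illustration of how such a score is commonly computed in few-shot segmentation, the sketch below accumulates foreground intersection and union per defect class over all test episodes, then averages the per-class IoU. This is an assumption about the evaluation protocol, not a description taken from the paper; the function name `mean_iou`, the episode layout, and the class names `scratch` and `pit` are hypothetical.

```python
from collections import defaultdict


def mean_iou(episodes):
    """Mean foreground IoU over classes.

    episodes: iterable of (class_id, pred_mask, true_mask), where each
    mask is a 2-D list of 0/1 values with identical shapes.
    Assumed protocol: intersection and union are summed per class over
    all episodes, then the per-class IoUs are averaged.
    """
    inter = defaultdict(int)
    union = defaultdict(int)
    for class_id, pred, true in episodes:
        for pred_row, true_row in zip(pred, true):
            for p, t in zip(pred_row, true_row):
                inter[class_id] += 1 if (p and t) else 0
                union[class_id] += 1 if (p or t) else 0
    # Skip classes whose predicted and true masks are both empty.
    ious = [inter[c] / union[c] for c in union if union[c] > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy episodes with hypothetical class names (not from FSSD-12 or FSS-AEB).
episodes = [
    ("scratch", [[1, 1], [0, 0]], [[1, 0], [0, 0]]),  # IoU = 1/2
    ("pit", [[0, 1], [0, 1]], [[0, 1], [0, 1]]),      # IoU = 2/2
]
print(f"mIoU = {mean_iou(episodes):.3f}")  # prints "mIoU = 0.750"
```

Averaging per class rather than per image keeps classes with many test episodes from dominating the score, which matters when defect categories are unevenly represented.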
About the journal:
The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.