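A Multi-Category Anomaly Editing Network with Correlation Exploration and Voxel-level Attention for Unsupervised Surface Anomaly Detection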

IF 13.7 · Zone 1, Computer Science · JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IEEE Transactions on Image Processing · Pub Date: 2025-09-16 · DOI: 10.1109/tip.2025.3607638

Ruifan Zhang, Hai-Miao Hu


Citations: 0

Abstract


A Multi-Category Anomaly Editing Network with Correlation Exploration and Voxel-level Attention for Unsupervised Surface Anomaly Detection

Developing a unified model for surface anomaly detection remains challenging due to significant variations across product categories. Recent feature editing methods, as a branch of image reconstruction, mitigate the over-generalization of auto-encoders, which otherwise causes anomalies to be reconstructed accurately. However, these methods are suited only to texture-category products and generalize poorly to other categories. In this article, we propose a multi-category anomaly editing network with a dual-branch training approach: one branch processes defect-free images (normal branch), while the other handles synthetic anomaly images (anomaly branch). Specifically, the paired samples are first fed into the multi-category anomaly feature editing based auto-encoder (MCAFE-AE) to perform image reconstruction and inpainting. In the normal branch, we propose a dual-entropy constrained deep embedded clustering module (DEC-DECM) to promote a more compact and orderly distribution of normal latent features while avoiding trivial clustering solutions. Based on the clustering results, we further design a patch-based adaptive thresholding (PAT) strategy that calculates, for each local patch, the threshold representing the central boundary of the cluster center, thereby enabling the model to detect anomalies. Then, in the anomaly branch, we propose a multi-category anomaly feature editing module (MCAFEM) to identify anomalies in synthetic images and apply a category-oriented feature editing strategy that transforms detected anomaly features into normal ones, thereby suppressing the reconstruction of anomalies. After image reconstruction and inpainting are complete, the input images from both branches and their respective output images are concatenated and fed into the correlation exploration and voxel-level attention based prediction network (CEVA-Net) for anomaly segmentation. The network integrates our proposed correlation-dependency exploration and voxel-level attention refinement module (CDE-VARM) and generates precise anomaly maps under the guidance of bidirectional-path feature fusion (BPFF) and deep supervised learning (DSL). Extensive experiments on three datasets show that our method achieves state-of-the-art performance.
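The thresholding and feature-editing steps described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not say how PAT computes its thresholds or how MCAFEM edits features, so the rule used here (a per-patch threshold of mean plus `k` standard deviations of the distance to the nearest cluster center, and replacement of flagged features by that center) is an assumption, as are all function and parameter names and the toy data.

```python
import numpy as np

def nearest_center(features, centers):
    """Return (index, distance) of the nearest center for every feature vector.

    features: (N, P, D) array: N images, P patches per image, D channels.
    centers:  (K, D) array of cluster centers (assumed already learned).
    """
    # Pairwise Euclidean distances between features and centers, shape (N, P, K).
    diff = features[:, :, None, :] - centers[None, None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return dist.argmin(axis=-1), dist.min(axis=-1)

def patch_thresholds(normal_features, centers, k=3.0):
    """Hypothetical per-patch rule: mean + k * std of distances on normal data."""
    _, d = nearest_center(normal_features, centers)  # (N, P)
    return d.mean(axis=0) + k * d.std(axis=0)        # (P,)

def edit_features(features, centers, thresholds):
    """Replace features farther than the patch threshold with their nearest center."""
    idx, d = nearest_center(features, centers)
    mask = d > thresholds[None, :]                   # (N, P), True where flagged
    edited = np.where(mask[..., None], centers[idx], features)
    return edited, mask

# Toy data: two cluster centers, 50 normal images with 4 patches of 2-D features.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0]])
normal = centers[rng.integers(0, 2, size=(50, 4))] + 0.1 * rng.standard_normal((50, 4, 2))
thr = patch_thresholds(normal, centers)

query = normal[:1].copy()
query[0, 2] = [4.0, 4.0]  # inject a synthetic anomaly into patch 2
edited, mask = edit_features(query, centers, thr)
```

In this toy setup the injected feature in patch 2 lies far from both centers, so it is flagged and snapped to the nearer center, while unflagged features pass through unchanged. A learned model would operate on encoder feature maps rather than hand-built 2-D vectors.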


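Source journal

IEEE Transactions on Image Processing (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 20.90

Self-citation rate: 6.60%

Articles published: 774

Review time: 7.6 months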

Journal description: The IEEE Transactions on Image Processing delves into groundbreaking theories, algorithms, and structures concerning the generation, acquisition, manipulation, transmission, scrutiny, and presentation of images, video, and multidimensional signals across diverse applications. Topics span mathematical, statistical, and perceptual aspects, encompassing modeling, representation, formation, coding, filtering, enhancement, restoration, rendering, halftoning, search, and analysis of images, video, and multidimensional signals. Pertinent applications range from image and video communications to electronic imaging, biomedical imaging, image and video systems, and remote sensing.