{"title":"基于相关探测和体素级关注的多类别异常编辑网络用于无监督地表异常检测。","authors":"Ruifan Zhang,Hai-Miao Hu","doi":"10.1109/tip.2025.3607638","DOIUrl":null,"url":null,"abstract":"Developing a unified model for surface anomaly detection remains challenging due to significant variations across product categories. Recent feature editing methods, as a branch of image reconstruction, mitigate the over-generalization of auto-encoders that leads to accurate anomaly reconstruction. However, these methods are only suited for texture-category products and have significant limitations in being generalized to other categories. In this article, we propose a multi-category anomaly editing network with a dual-branch training approach: one branch processes defect-free images (normal branch), while the other handles synthetic anomaly images (anomaly branch). Specifically, the paired samples are first fed into the multi-category anomaly feature editing based auto-encoder (MCAFE-AE) to perform image reconstruction and inpainting. In the normal branch, we propose a dual-entropy constrained deep embedded clustering module (DEC-DECM) to promote a more compact and orderly distribution of normal latent features, while avoiding trivial clustering solutions. Based on the clustering results, we further design a patch-based adaptive thresholding (PAT) strategy to adaptively calculate the threshold representing the central boundary of the cluster center for each local patch, thereby enabling the model to detect anomalies. Then, in the anomaly branch, we propose a multi-category anomaly feature editing module (MCAFEM) to identify anomalies in synthetic images and apply a category-oriented feature editing strategy to transform detected anomaly features into normal ones, thereby suppressing the reconstruction of anomalies. After completing the image reconstruction and inpainting, the input images from both branches and their respective output images are concatenated and fed into the correlation exploration and voxel-level attention based prediction network (CEVA-Net) for anomaly segmentation. The network is integrated with our proposed correlation-dependency exploration and voxel-level attention refinement module (CDE-VARM) and generates precise anomaly maps under the guidance of the bidirectional-path feature fusion (BPFF) and deep supervised learning (DSL). Extensive experiments on three datasets show that our method achieves state-of-the-art performance.","PeriodicalId":13217,"journal":{"name":"IEEE Transactions on Image Processing","volume":"83 1","pages":""},"PeriodicalIF":13.7000,"publicationDate":"2025-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A Multi-Category Anomaly Editing Network with Correlation Exploration and Voxel-level Attention for Unsupervised Surface Anomaly Detection.\",\"authors\":\"Ruifan Zhang,Hai-Miao Hu\",\"doi\":\"10.1109/tip.2025.3607638\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Developing a unified model for surface anomaly detection remains challenging due to significant variations across product categories. Recent feature editing methods, as a branch of image reconstruction, mitigate the over-generalization of auto-encoders that leads to accurate anomaly reconstruction. However, these methods are only suited for texture-category products and have significant limitations in being generalized to other categories. In this article, we propose a multi-category anomaly editing network with a dual-branch training approach: one branch processes defect-free images (normal branch), while the other handles synthetic anomaly images (anomaly branch). Specifically, the paired samples are first fed into the multi-category anomaly feature editing based auto-encoder (MCAFE-AE) to perform image reconstruction and inpainting. In the normal branch, we propose a dual-entropy constrained deep embedded clustering module (DEC-DECM) to promote a more compact and orderly distribution of normal latent features, while avoiding trivial clustering solutions. Based on the clustering results, we further design a patch-based adaptive thresholding (PAT) strategy to adaptively calculate the threshold representing the central boundary of the cluster center for each local patch, thereby enabling the model to detect anomalies. Then, in the anomaly branch, we propose a multi-category anomaly feature editing module (MCAFEM) to identify anomalies in synthetic images and apply a category-oriented feature editing strategy to transform detected anomaly features into normal ones, thereby suppressing the reconstruction of anomalies. After completing the image reconstruction and inpainting, the input images from both branches and their respective output images are concatenated and fed into the correlation exploration and voxel-level attention based prediction network (CEVA-Net) for anomaly segmentation. The network is integrated with our proposed correlation-dependency exploration and voxel-level attention refinement module (CDE-VARM) and generates precise anomaly maps under the guidance of the bidirectional-path feature fusion (BPFF) and deep supervised learning (DSL). Extensive experiments on three datasets show that our method achieves state-of-the-art performance.\",\"PeriodicalId\":13217,\"journal\":{\"name\":\"IEEE Transactions on Image Processing\",\"volume\":\"83 1\",\"pages\":\"\"},\"PeriodicalIF\":13.7000,\"publicationDate\":\"2025-09-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Image Processing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1109/tip.2025.3607638\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Image Processing","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1109/tip.2025.3607638","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
A Multi-Category Anomaly Editing Network with Correlation Exploration and Voxel-level Attention for Unsupervised Surface Anomaly Detection.
Developing a unified model for surface anomaly detection remains challenging due to significant variations across product categories. Recent feature editing methods, as a branch of image reconstruction, mitigate the over-generalization of auto-encoders that leads to accurate anomaly reconstruction. However, these methods are only suited for texture-category products and have significant limitations in being generalized to other categories. In this article, we propose a multi-category anomaly editing network with a dual-branch training approach: one branch processes defect-free images (normal branch), while the other handles synthetic anomaly images (anomaly branch). Specifically, the paired samples are first fed into the multi-category anomaly feature editing based auto-encoder (MCAFE-AE) to perform image reconstruction and inpainting. In the normal branch, we propose a dual-entropy constrained deep embedded clustering module (DEC-DECM) to promote a more compact and orderly distribution of normal latent features, while avoiding trivial clustering solutions. Based on the clustering results, we further design a patch-based adaptive thresholding (PAT) strategy to adaptively calculate the threshold representing the central boundary of the cluster center for each local patch, thereby enabling the model to detect anomalies. Then, in the anomaly branch, we propose a multi-category anomaly feature editing module (MCAFEM) to identify anomalies in synthetic images and apply a category-oriented feature editing strategy to transform detected anomaly features into normal ones, thereby suppressing the reconstruction of anomalies. After completing the image reconstruction and inpainting, the input images from both branches and their respective output images are concatenated and fed into the correlation exploration and voxel-level attention based prediction network (CEVA-Net) for anomaly segmentation. The network is integrated with our proposed correlation-dependency exploration and voxel-level attention refinement module (CDE-VARM) and generates precise anomaly maps under the guidance of the bidirectional-path feature fusion (BPFF) and deep supervised learning (DSL). Extensive experiments on three datasets show that our method achieves state-of-the-art performance.
期刊介绍:
The IEEE Transactions on Image Processing delves into groundbreaking theories, algorithms, and structures concerning the generation, acquisition, manipulation, transmission, scrutiny, and presentation of images, video, and multidimensional signals across diverse applications. Topics span mathematical, statistical, and perceptual aspects, encompassing modeling, representation, formation, coding, filtering, enhancement, restoration, rendering, halftoning, search, and analysis of images, video, and multidimensional signals. Pertinent applications range from image and video communications to electronic imaging, biomedical imaging, image and video systems, and remote sensing.