Title: Target Before Shooting: Accurate Anomaly Detection and Localization Under One Millisecond via Cascade Patch Retrieval
Authors: Hanxi Li; Jianfei Hu; Bo Li; Hao Chen; Yongbin Zheng; Chunhua Shen
Journal: IEEE Transactions on Image Processing, vol. 33, pp. 5606-5621
Publication date: 2024-09-11
DOI: 10.1109/TIP.2024.3448263
URL: https://ieeexplore.ieee.org/document/10678861/
Citation count: 0
Abstract
In this work, by re-examining the “matching” nature of Anomaly Detection (AD), we propose a novel AD framework that sets new records in AD accuracy while running at very high speed. In this framework, anomaly detection is solved via a cascade patch retrieval procedure that retrieves the nearest neighbor of each test image patch in a coarse-to-fine fashion. First, given a test sample, the top-K most similar training images are selected by a robust histogram matching process. Second, the nearest neighbor of each test patch is retrieved from similar geometrical locations on those “most similar images”, using a carefully trained local metric. Finally, the anomaly score of each test image patch is calculated from the distance to its “nearest neighbor” and its “non-background” probability. The proposed method is termed “Cascade Patch Retrieval” (CPR). Unlike previous patch-matching-based AD algorithms, CPR selects proper “targets” (reference images and patches) before “shooting” (patch-matching). On the widely used MVTec AD, BTAD and MVTec-3D AD datasets, the proposed algorithm consistently outperforms all compared SOTA methods by remarkable margins across various AD metrics. Furthermore, CPR is extremely efficient: it runs at 113 FPS with the standard setting, while its simplified version requires less than 1 ms per image at the cost of a trivial accuracy drop. The code of CPR is available at https://github.com/flyinghu123/CPR.
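The three-stage cascade described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository linked above for that). All function names, the L1 histogram distance, the Euclidean patch distance standing in for the learned local metric, the square search window, and the product used to combine distance with the non-background probability are assumptions made for illustration only.

```python
import numpy as np


def retrieve_top_k_images(test_hist, train_hists, k):
    """Coarse step: pick the k training images whose histograms are closest.

    Assumption: plain L1 distance stands in for the paper's robust
    histogram matching.
    """
    dists = np.abs(train_hists - test_hist).sum(axis=1)
    return np.argsort(dists)[:k]


def local_nearest_distance(test_feat, ref_feats, radius):
    """Fine step: nearest-neighbor distance for every test patch.

    test_feat: (H, W, C) patch features of the test image.
    ref_feats: (K, H, W, C) patch features of the retrieved references.
    Only reference patches within `radius` of the same location are searched.
    Assumption: Euclidean distance stands in for the trained local metric.
    """
    h, w, c = test_feat.shape
    out = np.empty((h, w))
    for y in range(h):
        for x in range(w):
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            window = ref_feats[:, y0:y1, x0:x1, :].reshape(-1, c)
            out[y, x] = np.linalg.norm(window - test_feat[y, x], axis=1).min()
    return out


def cascade_anomaly_map(test_hist, test_feat, test_fg,
                        train_hists, train_feats, k=2, radius=1):
    """Combine both steps and weight by the non-background probability.

    Assumption: the score is distance * foreground probability; the
    abstract does not state how the two terms are combined.
    """
    idx = retrieve_top_k_images(test_hist, train_hists, k)
    dist = local_nearest_distance(test_feat, train_feats[idx], radius)
    return dist * test_fg
```

Restricting the patch search to a small window on a few pre-selected images is what keeps the candidate set small, which is consistent with the speed the abstract reports; the explicit Python loops here favor readability over speed.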