Kshitij Nikhal, Nkiruka Uzuegbunam, Bridget Kennedy, B. Riggan
{"title":"基于无监督多部分注意力的RGB-IR人脸识别技术","authors":"Kshitij Nikhal, Nkiruka Uzuegbunam, Bridget Kennedy, B. Riggan","doi":"10.1109/CVPRW59228.2023.00039","DOIUrl":null,"url":null,"abstract":"Modern algorithms for RGB-IR facial recognition— a challenging problem where infrared probe images are matched with visible gallery images—leverage precise and accurate guidance from curated (i.e., labeled) data to bridge large spectral differences. However, supervised cross-spectral face recognition methods are often extremely sensitive due to over-fitting to labels, performing well in some settings but not in others. Moreover, when fine-tuning on data from additional settings, supervised cross-spectral face recognition are prone to catastrophic forgetting. Therefore, we propose a novel unsupervised framework for RGB-IR face recognition to minimize the cost and time inefficiencies pertaining to labeling large-scale, multispectral data required to train supervised cross-spectral recognition methods and to alleviate the effect of forgetting by removing over dependence on hard labels to bridge such large spectral differences. The proposed framework integrates an efficient backbone network architecture with part-based attention models, which collectively enhances common information between visible and infrared faces. Then, the framework is optimized using pseudo-labels and a new cross-spectral memory bank loss. This framework is evaluated on the ARL-VTF and TUFTS datasets, achieving 98.55% and 43.28% true accept rate, respectively. 
Additionally, we analyze effects of forgetting and show that our framework is less prone to these effects.","PeriodicalId":355438,"journal":{"name":"2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Mitigating Catastrophic Interference using Unsupervised Multi-Part Attention for RGB-IR Face Recognition\",\"authors\":\"Kshitij Nikhal, Nkiruka Uzuegbunam, Bridget Kennedy, B. Riggan\",\"doi\":\"10.1109/CVPRW59228.2023.00039\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Modern algorithms for RGB-IR facial recognition— a challenging problem where infrared probe images are matched with visible gallery images—leverage precise and accurate guidance from curated (i.e., labeled) data to bridge large spectral differences. However, supervised cross-spectral face recognition methods are often extremely sensitive due to over-fitting to labels, performing well in some settings but not in others. Moreover, when fine-tuning on data from additional settings, supervised cross-spectral face recognition are prone to catastrophic forgetting. Therefore, we propose a novel unsupervised framework for RGB-IR face recognition to minimize the cost and time inefficiencies pertaining to labeling large-scale, multispectral data required to train supervised cross-spectral recognition methods and to alleviate the effect of forgetting by removing over dependence on hard labels to bridge such large spectral differences. The proposed framework integrates an efficient backbone network architecture with part-based attention models, which collectively enhances common information between visible and infrared faces. Then, the framework is optimized using pseudo-labels and a new cross-spectral memory bank loss. 
This framework is evaluated on the ARL-VTF and TUFTS datasets, achieving 98.55% and 43.28% true accept rate, respectively. Additionally, we analyze effects of forgetting and show that our framework is less prone to these effects.\",\"PeriodicalId\":355438,\"journal\":{\"name\":\"2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CVPRW59228.2023.00039\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CVPRW59228.2023.00039","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Mitigating Catastrophic Interference using Unsupervised Multi-Part Attention for RGB-IR Face Recognition
Modern algorithms for RGB-IR facial recognition, a challenging problem in which infrared probe images are matched with visible gallery images, leverage precise and accurate guidance from curated (i.e., labeled) data to bridge large spectral differences. However, supervised cross-spectral face recognition methods are often extremely sensitive due to over-fitting to labels, performing well in some settings but not in others. Moreover, when fine-tuned on data from additional settings, supervised cross-spectral face recognition methods are prone to catastrophic forgetting. Therefore, we propose a novel unsupervised framework for RGB-IR face recognition that minimizes the cost and time inefficiencies of labeling the large-scale multispectral data required to train supervised cross-spectral recognition methods, and that alleviates forgetting by removing over-dependence on hard labels to bridge such large spectral differences. The proposed framework integrates an efficient backbone network architecture with part-based attention models, which collectively enhance common information between visible and infrared faces. The framework is then optimized using pseudo-labels and a new cross-spectral memory bank loss. It is evaluated on the ARL-VTF and TUFTS datasets, achieving 98.55% and 43.28% true accept rate, respectively. Additionally, we analyze the effects of forgetting and show that our framework is less prone to them.
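The abstract names a cross-spectral memory bank loss optimized with pseudo-labels but gives no formulas. Below is a minimal sketch of one plausible form, assuming per-pseudo-identity momentum prototypes kept separately for each spectrum, with queries from one spectrum scored against the opposite spectrum's prototypes via a softmax cross-entropy. All class names, the momentum value, and the temperature are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Unit-normalize features so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

class CrossSpectralMemoryBank:
    """Illustrative memory bank: one prototype per pseudo-identity and spectrum.

    IR queries are scored against the RGB prototypes (and vice versa), so
    minimizing the loss pulls the two spectra toward shared identity features.
    """

    def __init__(self, num_pseudo_ids, feat_dim, momentum=0.2, temperature=0.05):
        self.rgb_bank = l2_normalize(np.random.randn(num_pseudo_ids, feat_dim))
        self.ir_bank = l2_normalize(np.random.randn(num_pseudo_ids, feat_dim))
        self.momentum = momentum
        self.temperature = temperature

    def loss(self, feats, pseudo_labels, query_spectrum="ir"):
        """Cross-entropy of each query against the opposite-spectrum bank."""
        bank = self.rgb_bank if query_spectrum == "ir" else self.ir_bank
        feats = l2_normalize(feats)
        logits = feats @ bank.T / self.temperature      # (batch, num_pseudo_ids)
        logits -= logits.max(axis=1, keepdims=True)     # numerical stability
        log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(len(feats)), pseudo_labels].mean()

    def update(self, feats, pseudo_labels, spectrum="ir"):
        """Momentum update of this spectrum's prototypes with new features."""
        bank = self.ir_bank if spectrum == "ir" else self.rgb_bank
        for f, y in zip(l2_normalize(feats), pseudo_labels):
            bank[y] = l2_normalize((1 - self.momentum) * bank[y] + self.momentum * f)
```

In this sketch the pseudo-labels would come from an upstream clustering step (not shown), and the loss replaces hard ground-truth supervision: identities are defined only by the evolving prototypes, which is what reduces over-dependence on labels.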