{"title":"卷积神经网络在野外的耳朵识别","authors":"Solange Ramos-Cooper, Guillermo Cámara Chávez","doi":"10.1109/CLEI53233.2021.9640083","DOIUrl":null,"url":null,"abstract":"Ear recognition has gained attention in recent years. The possibility of being captured from a distance, contactless, without the cooperation of the subject and not be affected by facial expressions makes ear recognition a captivating choice for surveillance and security applications, and even more in the current COVID-19 pandemic context where modalities like face recognition fail due to mouth and facial covering masks usage. Applying any deep learning (DL) algorithm usually demands a large amount of training data and appropriate network architectures, therefore we introduce a large-scale database and explore fine-tuning pre-trained convolutional neural networks (CNNs) looking for a robust representation of ear images taken under uncontrolled conditions. Taking advantage of the face recognition field, we built an ear dataset based on the VGGFace dataset and use the Mask-RCNN for ear detection. Besides, adapting the VGGFace model to the ear domain leads to a better performance than using a model trained for general image recognition. Experiments on the UERC dataset have shown that fine-tuning from a face recognition model and using a larger dataset leads to a significant improvement of around 9% compared to state-of-the-art methods on the ear recognition field. In addition, we have explored score-level fusion by combining matching scores of the fine-tuning models which leads to an improvement of around 4% more. Open-set and close-set experiments have been performed and evaluated using Rank-1 and Rank-5 recognition rate metrics.","PeriodicalId":6803,"journal":{"name":"2021 XLVII Latin American Computing Conference (CLEI)","volume":"29 1","pages":"1-10"},"PeriodicalIF":0.0000,"publicationDate":"2021-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Ear Recognition In The Wild with Convolutional Neural Networks\",\"authors\":\"Solange Ramos-Cooper, Guillermo Cámara Chávez\",\"doi\":\"10.1109/CLEI53233.2021.9640083\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Ear recognition has gained attention in recent years. The possibility of being captured from a distance, contactless, without the cooperation of the subject and not be affected by facial expressions makes ear recognition a captivating choice for surveillance and security applications, and even more in the current COVID-19 pandemic context where modalities like face recognition fail due to mouth and facial covering masks usage. Applying any deep learning (DL) algorithm usually demands a large amount of training data and appropriate network architectures, therefore we introduce a large-scale database and explore fine-tuning pre-trained convolutional neural networks (CNNs) looking for a robust representation of ear images taken under uncontrolled conditions. Taking advantage of the face recognition field, we built an ear dataset based on the VGGFace dataset and use the Mask-RCNN for ear detection. Besides, adapting the VGGFace model to the ear domain leads to a better performance than using a model trained for general image recognition. Experiments on the UERC dataset have shown that fine-tuning from a face recognition model and using a larger dataset leads to a significant improvement of around 9% compared to state-of-the-art methods on the ear recognition field. 
In addition, we have explored score-level fusion by combining matching scores of the fine-tuning models which leads to an improvement of around 4% more. Open-set and close-set experiments have been performed and evaluated using Rank-1 and Rank-5 recognition rate metrics.\",\"PeriodicalId\":6803,\"journal\":{\"name\":\"2021 XLVII Latin American Computing Conference (CLEI)\",\"volume\":\"29 1\",\"pages\":\"1-10\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-10-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 XLVII Latin American Computing Conference (CLEI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CLEI53233.2021.9640083\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 XLVII Latin American Computing Conference (CLEI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CLEI53233.2021.9640083","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Ear Recognition In The Wild with Convolutional Neural Networks
Ear recognition has gained attention in recent years. Because the ear can be captured at a distance, without contact, without the subject's cooperation, and without being affected by facial expressions, ear recognition is an attractive choice for surveillance and security applications, even more so in the current COVID-19 pandemic context, where modalities such as face recognition fail because of mouth- and face-covering masks. Applying any deep learning (DL) algorithm usually demands a large amount of training data and an appropriate network architecture; we therefore introduce a large-scale database and explore fine-tuning pre-trained convolutional neural networks (CNNs) to obtain a robust representation of ear images taken under uncontrolled conditions. Leveraging progress in the face recognition field, we built an ear dataset from the VGGFace dataset and used Mask R-CNN for ear detection. Moreover, adapting the VGGFace model to the ear domain yields better performance than using a model trained for general image recognition. Experiments on the UERC dataset show that fine-tuning from a face recognition model and using a larger dataset leads to a significant improvement of around 9% over state-of-the-art methods in ear recognition. In addition, we explored score-level fusion, combining the matching scores of the fine-tuned models, which yields a further improvement of around 4%. Open-set and closed-set experiments were performed and evaluated using the Rank-1 and Rank-5 recognition rate metrics.
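The abstract mentions two technical ingredients that a short sketch can make concrete. The first is fine-tuning a pre-trained CNN for ear identification. The sketch below is a minimal, hypothetical illustration in PyTorch: it uses torchvision's ImageNet-pretrained VGG16 as a stand-in for the VGGFace weights used in the paper, and the identity count, frozen-layer cutoff, learning rate, and helper names are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_IDENTITIES = 1000  # hypothetical number of ear identities in the training set

# Load a VGG16 backbone pretrained on ImageNet (stand-in for VGGFace weights).
model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)

# Replace the final classification layer to output ear-identity scores.
in_features = model.classifier[6].in_features
model.classifier[6] = nn.Linear(in_features, NUM_IDENTITIES)

# Freeze the early convolutional blocks; fine-tune the later blocks and the classifier.
for param in model.features[:17].parameters():
    param.requires_grad = False

optimizer = torch.optim.SGD(
    filter(lambda p: p.requires_grad, model.parameters()),
    lr=1e-3, momentum=0.9,
)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    """One fine-tuning step on a batch of ear crops (N x 3 x 224 x 224) and identity labels."""
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```

The second ingredient is score-level fusion of the fine-tuned models' matching scores, evaluated with Rank-1 and Rank-5 recognition rates. The paper does not spell out its fusion rule, so the following sketch assumes a common choice (min-max normalization followed by a weighted sum) and uses hypothetical helper names purely for illustration.

```python
import numpy as np

def min_max_normalize(scores):
    """Scale a probe-vs-gallery score matrix to [0, 1]."""
    lo, hi = scores.min(), scores.max()
    return (scores - lo) / (hi - lo + 1e-12)

def fuse_scores(score_matrices, weights=None):
    """Weighted-sum score-level fusion of matching-score matrices
    (each of shape num_probes x num_gallery), one per fine-tuned model."""
    normalized = [min_max_normalize(s) for s in score_matrices]
    if weights is None:
        weights = np.ones(len(normalized)) / len(normalized)
    return sum(w * s for w, s in zip(weights, normalized))

def rank_k_recognition_rate(fused, probe_labels, gallery_labels, k=1):
    """Fraction of probes whose true identity appears among the top-k gallery matches."""
    topk = np.argsort(-fused, axis=1)[:, :k]
    hits = [probe_labels[i] in gallery_labels[topk[i]] for i in range(len(probe_labels))]
    return float(np.mean(hits))
```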