Region of Interest Synthesis using Image-to-Image Translation for ear recognition
Authors: Yacine Khaldi, Amir Benzaoui
Published in: 2020 International Conference on Advanced Aspects of Software Engineering (ICAASE), 2020-11-28
DOI: 10.1109/ICAASE51408.2020.9380127 (https://doi.org/10.1109/ICAASE51408.2020.9380127)
Citation count: 9
Abstract
Most ear recognition techniques use cropped ear images as they are, including background, hair, parts of the face or neck skin, and even clothing. These non-ear pixels can negatively affect the classification decision. To avoid this, and to ensure that the classifier depends on ear pixels only, we propose using a tight Region-of-Interest (RoI) segmentation of the ear instead. This paper uses Image-to-Image translation to synthesize the ear RoI segmentation and remove irrelevant pixels from input images. Furthermore, parts of the ear that are missing due to occlusion or distortion can also be synthesized. To accomplish this, we used a Pix2Pix Generative Adversarial Network (GAN) trained on the AWE dataset, a challenging ear dataset. Experimental results show that using ear RoI segmentation positively affects the classification process and significantly increases the recognition rate.
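
To make the idea of removing non-ear pixels concrete, the following is a minimal sketch in plain Python. It is not the authors' method: in the paper a Pix2Pix generator synthesizes the RoI image directly, whereas this sketch assumes a binary ear mask is already available and only illustrates what a tight RoI leaves for the classifier. The function name `apply_roi_mask`, the nested-list image layout, and the black fill value are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), assumed 8-bit channels


def apply_roi_mask(
    image: Sequence[Sequence[Pixel]],
    mask: Sequence[Sequence[int]],
    fill: Pixel = (0, 0, 0),
) -> List[List[Pixel]]:
    """Keep pixels where the mask is non-zero; replace all others with `fill`.

    `image` and `mask` are row-major nested sequences of equal height and width.
    This is a hypothetical helper, not part of any library.
    """
    if len(image) != len(mask) or any(
        len(img_row) != len(mask_row) for img_row, mask_row in zip(image, mask)
    ):
        raise ValueError("image and mask must have the same height and width")

    return [
        [pixel if keep else fill for pixel, keep in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


# Toy 2x2 "cropped ear image": two ear pixels, two background pixels.
image = [
    [(200, 150, 120), (30, 30, 30)],
    [(10, 80, 10), (190, 140, 110)],
]
mask = [
    [1, 0],
    [0, 1],
]
roi_only = apply_roi_mask(image, mask)
```

In this toy case `roi_only` keeps the two pixels marked as ear and blanks the other two, so a downstream classifier would see no background, hair, or skin values.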