Optimal latent space for Low-shot Face Recognition
Anvaya Rai, B. Lall, Astha Zalani, Raghawendra Prakash Singh, Shikha Srivastava
2023 IEEE 24th International Conference on Information Reuse and Integration for Data Science (IRI), August 2023
DOI: 10.1109/IRI58017.2023.00029 (https://doi.org/10.1109/IRI58017.2023.00029)
Citations: 0
Abstract
The human ability to learn to classify objects after seeing only a few examples of them has given rise to the field of low-shot learning. The idea is to train a deep learning model to differentiate between same and different pairs, and then to generalise this ability to evaluating new categories. Because the shapes, structure and low-level visual features of human faces are similar in nature, we can use extensive public face datasets to initially train a deep neural network (DNN) to learn generalised features of the human face. We call this face space the Latent Feature Space. We then demonstrate the use of the probabilistic interpretation of principal component analysis (PPCA), together with the Extreme Learning Machine (ELM) algorithm, as an efficient technique for transforming this space to represent novel dataset classes with a limited number of available samples. We avoid any kind of network re-training, while forcing the network to learn a distance function between images rather than classifying them explicitly. The proposed algorithm couples a DNN-based feature representation with low-dimensional manifold extraction to address the low-shot classification and verification problems. We call this low-dimensional subspace the Feature Transformed Latent Space. In addition to improving accuracy, the suggested approach offers significant advantages in memory, computation and speed during classification/verification tasks, while remaining agnostic to occlusion, pose, expression and illumination conditions.
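The abstract gives no implementation details, but the PPCA transform it describes has a standard closed-form maximum-likelihood solution (Tipping & Bishop). The sketch below is a minimal, hypothetical illustration of that component only: it fits PPCA to a matrix of DNN face embeddings, maps embeddings to posterior latent means, and makes a same/different decision from Euclidean distance in the transformed space. The DNN feature extractor and the ELM coupling from the paper are omitted, and all function names and the threshold are illustrative assumptions, not the authors' code.

```python
import numpy as np

def fit_ppca(X, q):
    """Closed-form ML fit of probabilistic PCA (hypothetical sketch).
    X: (n, d) matrix of face embeddings; q: latent dimension.
    Returns the mean, the (d, q) loading matrix W, and the noise variance."""
    mu = X.mean(axis=0)
    Xc = X - mu
    S = Xc.T @ Xc / X.shape[0]                   # sample covariance (d, d)
    evals, evecs = np.linalg.eigh(S)             # eigh returns ascending order
    evals, evecs = evals[::-1], evecs[:, ::-1]   # sort descending
    sigma2 = evals[q:].mean()                    # ML noise variance = mean discarded eigenvalue
    # W = U_q (Lambda_q - sigma^2 I)^{1/2}
    W = evecs[:, :q] * np.sqrt(np.maximum(evals[:q] - sigma2, 0.0))
    return mu, W, sigma2

def to_latent(X, mu, W, sigma2):
    """Posterior mean of the latent variables: M^{-1} W^T (x - mu),
    where M = W^T W + sigma^2 I."""
    M = W.T @ W + sigma2 * np.eye(W.shape[1])
    return np.linalg.solve(M, ((X - mu) @ W).T).T

def verify(z1, z2, threshold=1.0):
    """Same/different decision from Euclidean distance in the
    transformed latent space (threshold is an assumed hyperparameter)."""
    return np.linalg.norm(z1 - z2) < threshold
```

In a full pipeline, `X` would come from a DNN trained on a large public face dataset, and `threshold` would be calibrated on held-out same/different pairs; the low-shot appeal is that fitting PPCA on a handful of novel-class embeddings requires no network re-training.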