Few Shot Model based on Weight Imprinting with Multiple Projection Head
Paulino Cristovao, H. Nakada, Y. Tanimura, H. Asoh
2022 16th International Conference on Ubiquitous Information Management and Communication (IMCOM), published 2022-01-03. DOI: 10.1109/IMCOM53663.2022.9721726
Few-shot learning models based on imprinted weights have achieved excellent results on several benchmarks. In these methods, the network directly sets the final-layer weights for novel classes from the latent representations learned on the training classes. The learned representations therefore yield good accuracy on the training classes, but accuracy on unseen classes can be poor. This paper provides an alternative training technique for imprinted-weight models. We find that adding projection heads yields substantial improvements over the baseline model. Our experiments show that (1) introducing nonlinear projection heads between the feature extractor and the classifier substantially improves generalization; (2) imprinting from the task-specific layer does not generalize well to novel classes, so we instead propose imprinting from the task-agnostic layer; and (3) our design benefits from a large latent dimension. We validate these findings with 5.6% and 4.1% improvements on MNIST using models trained on the Omniglot dataset.
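For readers unfamiliar with the mechanism, the sketch below illustrates weight imprinting combined with a nonlinear projection head, in the spirit of the abstract's findings (1) and (2). It is a minimal PyTorch sketch, not the authors' implementation: the linear backbone, the layer sizes, and the helper names (ImprintNet, imprint_classifier) are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ImprintNet(nn.Module):
    """Backbone -> nonlinear projection head -> base classifier."""

    def __init__(self, in_dim=784, feat_dim=512, proj_dim=128, n_base=64):
        super().__init__()
        # Task-agnostic feature extractor (a stand-in for a conv backbone).
        self.backbone = nn.Sequential(
            nn.Flatten(), nn.Linear(in_dim, feat_dim), nn.ReLU())
        # Nonlinear projection head between the extractor and the classifier,
        # used during base-class training (finding 1).
        self.head = nn.Sequential(
            nn.Linear(feat_dim, proj_dim), nn.ReLU(),
            nn.Linear(proj_dim, proj_dim))
        self.base_classifier = nn.Linear(proj_dim, n_base)

    def forward(self, x):
        # Base-class training path: backbone -> head -> classifier.
        return self.base_classifier(self.head(self.backbone(x)))

    def embed(self, x):
        # Task-agnostic embedding used for imprinting (finding 2):
        # taken before the task-specific projection head.
        return F.normalize(self.backbone(x), dim=-1)


@torch.no_grad()
def imprint_classifier(model, support_x, support_y, n_novel):
    # Each novel class weight is the mean normalized task-agnostic
    # embedding of its few support examples; logits are cosine similarities.
    z = model.embed(support_x)
    w = torch.stack([F.normalize(z[support_y == c].mean(dim=0), dim=0)
                     for c in range(n_novel)])
    return lambda x: model.embed(x) @ w.t()


# Tiny smoke test on random data shaped like flattened 28x28 images.
model = ImprintNet()
support_x = torch.randn(10, 1, 28, 28)            # 5 novel classes, 2 shots each
support_y = torch.arange(5).repeat_interleave(2)
novel_logits = imprint_classifier(model, support_x, support_y, n_novel=5)
print(novel_logits(torch.randn(4, 1, 28, 28)).shape)  # torch.Size([4, 5])
```

Because each imprinted weight is a normalized support embedding, the resulting logit is a cosine similarity, so a novel class can be added without any gradient updates; the paper's point is that this embedding generalizes better when taken before, rather than after, the projection head.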