Scale parallax network for few-shot learning
Authors: Ran Chen, Wen Jiang, Jinbiao Zhu, Jie Geng
Journal: Pattern Recognition, volume 172, Article 112504
Publication date: 2025-09-28
Publication type: Journal Article
DOI: 10.1016/j.patcog.2025.112504
URL: https://www.sciencedirect.com/science/article/pii/S0031320325011677
Impact factor: 7.6 (JCR Q1, Computer Science, Artificial Intelligence)
Open access: no
Citation count: 0

Abstract:
Varying the input image scale allows convolutional networks to extract different features and learn richer image representations. This acts as a form of data augmentation and helps address the challenges of few-shot learning. Previous few-shot learning methods have focused on multi-scale feature fusion, using techniques such as random resizing or feature pyramids, while the differences between features at different scales have largely been overlooked. In contrast, we propose a novel few-shot learning approach, the Scale Parallax Network, which treats images at different resolutions as complementary sources of visual information. We adopt an image-pyramid-based structure to extract multi-scale feature representations and enhance the model's representational capacity. Experimental results demonstrate that our method achieves state-of-the-art performance on the miniImageNet and tieredImageNet datasets.
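The abstract does not describe the architecture, so the following is only a rough illustration of the two ideas it names: an image pyramid and differences between features taken at different scales. It is not the authors' network. The function names (`downsample2x`, `image_pyramid`, `toy_features`, `inter_scale_differences`) are hypothetical, and `toy_features` is a stand-in for a learned convolutional backbone.

```python
import numpy as np


def downsample2x(img):
    """Halve height and width by 2x2 average pooling (odd edges are cropped)."""
    h = img.shape[0] // 2 * 2
    w = img.shape[1] // 2 * 2
    img = img[:h, :w]
    # Group pixels into 2x2 blocks and average each block, per channel.
    return img.reshape(h // 2, 2, w // 2, 2, -1).mean(axis=(1, 3))


def image_pyramid(img, levels=3):
    """Return [full resolution, 1/2 resolution, 1/4 resolution, ...]."""
    pyramid = [img.astype(np.float64)]
    for _ in range(levels - 1):
        pyramid.append(downsample2x(pyramid[-1]))
    return pyramid


def toy_features(img):
    """Assumed placeholder for a CNN backbone: per-channel mean and std."""
    return np.concatenate([img.mean(axis=(0, 1)), img.std(axis=(0, 1))])


def inter_scale_differences(img, levels=3):
    """Feature difference between each pair of adjacent pyramid levels."""
    feats = [toy_features(level) for level in image_pyramid(img, levels)]
    return [feats[i] - feats[i + 1] for i in range(len(feats) - 1)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((84, 84, 3))  # arbitrary H x W x C test image
    for i, diff in enumerate(inter_scale_differences(image)):
        print(f"scale {i} vs scale {i + 1}:", np.round(diff, 4))
```

With this toy feature, the per-channel means are unchanged by average pooling while the standard deviations shrink, so the difference vector isolates the fine detail that is lost at the coarser scale. A learned backbone would yield far richer differences; the sketch only shows where such a signal comes from.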
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.