{"title":"Group-guided prompt learning for vision-language models","authors":"Yufei Zheng , Shengsheng Wang , Yansheng Gao","doi":"10.1016/j.eswa.2025.129846","DOIUrl":null,"url":null,"abstract":"<div><div>Prompt learning has become one of the mainstream approaches for enabling Vision-Language Models (VLMs) to effectively adapt to downstream tasks. Recent approaches enhanced the generalization of models by integrating prior knowledge from large language models (LLMs). However, these approaches overlook the potential value of group knowledge derived from semantic correlations across different classes, which may limit the performance of the model in the face of complex downstream tasks. To overcome this challenge, we propose <strong>Group-guided Prompt Learning (GGPL)</strong>, which integrates group knowledge into the original text prompts through LLMs. Specifically, GGPL uses LLMs to group all classes and integrates the group knowledge into the original text prompts to construct the final text prompts. Furthermore, we introduce a novel <strong>Group Knowledge Alignment (GKA)</strong> module, which aligns the learnable prompt features with the pre-trained features that contain group knowledge, preventing the learnable prompt features from feature shift during the training process and thus reducing overfitting. Experimental results across 11 public datasets demonstrate that the proposed GGPL method achieves significant improvement on various prompt learning approaches, while numerous ablation experiments also demonstrate the effectiveness of the each component of our GGPL method.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"298 ","pages":"Article 129846"},"PeriodicalIF":7.5000,"publicationDate":"2025-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Expert Systems with Applications","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S095741742503461X","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Prompt learning has become one of the mainstream approaches for enabling Vision-Language Models (VLMs) to effectively adapt to downstream tasks. Recent approaches enhanced the generalization of models by integrating prior knowledge from large language models (LLMs). However, these approaches overlook the potential value of group knowledge derived from semantic correlations across different classes, which may limit the performance of the model in the face of complex downstream tasks. To overcome this challenge, we propose Group-guided Prompt Learning (GGPL), which integrates group knowledge into the original text prompts through LLMs. Specifically, GGPL uses LLMs to group all classes and integrates the group knowledge into the original text prompts to construct the final text prompts. Furthermore, we introduce a novel Group Knowledge Alignment (GKA) module, which aligns the learnable prompt features with the pre-trained features that contain group knowledge, preventing the learnable prompt features from feature shift during the training process and thus reducing overfitting. Experimental results across 11 public datasets demonstrate that the proposed GGPL method achieves significant improvement on various prompt learning approaches, while numerous ablation experiments also demonstrate the effectiveness of the each component of our GGPL method.
期刊介绍:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.