Neural Prompt Search.

Yuanhan Zhang, Kaiyang Zhou, Ziwei Liu
{"title":"Neural Prompt Search.","authors":"Yuanhan Zhang, Kaiyang Zhou, Ziwei Liu","doi":"10.1109/TPAMI.2024.3435939","DOIUrl":null,"url":null,"abstract":"<p><p>The size of vision models has grown exponentially over the last few years, especially after the emergence of Vision Transformer. This has motivated the development of parameter-efficient tuning methods, such as learning adapter layers or visual prompt tokens, which allow a tiny portion of model parameters to be trained whereas the vast majority obtained from pre-training are frozen. However, designing a proper tuning method is non-trivial: one might need to try out a lengthy list of design choices, not to mention that each downstream dataset often requires custom designs. In this paper, we view the existing parameter-efficient tuning methods as \"prompt modules\" and propose Neural prOmpt seArcH (NOAH), a novel approach that learns, for large vision models, the optimal design of prompt modules through a neural architecture search algorithm, specifically for each downstream dataset. By conducting extensive experiments on over 20 vision datasets, we demonstrate that NOAH (i) is superior to individual prompt modules, (ii) has good few-shot learning ability, and (iii) is domain-generalizable. The code and models are available at https://github.com/ZhangYuanhan-AI/NOAH.</p>","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on pattern analysis and machine intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TPAMI.2024.3435939","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

The size of vision models has grown exponentially over the last few years, especially after the emergence of Vision Transformer. This has motivated the development of parameter-efficient tuning methods, such as learning adapter layers or visual prompt tokens, which allow a tiny portion of model parameters to be trained whereas the vast majority obtained from pre-training are frozen. However, designing a proper tuning method is non-trivial: one might need to try out a lengthy list of design choices, not to mention that each downstream dataset often requires custom designs. In this paper, we view the existing parameter-efficient tuning methods as "prompt modules" and propose Neural prOmpt seArcH (NOAH), a novel approach that learns, for large vision models, the optimal design of prompt modules through a neural architecture search algorithm, specifically for each downstream dataset. By conducting extensive experiments on over 20 vision datasets, we demonstrate that NOAH (i) is superior to individual prompt modules, (ii) has good few-shot learning ability, and (iii) is domain-generalizable. The code and models are available at https://github.com/ZhangYuanhan-AI/NOAH.
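The abstract frames existing parameter-efficient tuning techniques (adapter layers, low-rank updates, visual prompt tokens) as interchangeable "prompt modules" whose configuration can be searched per dataset. The sketch below is not the official NOAH implementation; it is a minimal illustration of a Transformer block carrying all three module types with dimensions (adapter_dim, lora_rank, prompt_len) that an architecture search could sample. All class and parameter names here are illustrative assumptions, not identifiers from the paper or repository.

```python
# Minimal, hypothetical sketch of a ViT-style block augmented with three
# searchable "prompt modules": a bottleneck Adapter, a LoRA update, and
# visual prompt tokens. Dimensions are the kind of choices a neural
# architecture search (as in NOAH) could optimize per downstream dataset.
import torch
import torch.nn as nn


class Adapter(nn.Module):
    """Bottleneck adapter applied as a residual after a frozen sub-layer."""
    def __init__(self, dim, adapter_dim):
        super().__init__()
        self.down = nn.Linear(dim, adapter_dim)
        self.up = nn.Linear(adapter_dim, dim)
        self.act = nn.GELU()

    def forward(self, x):
        return x + self.up(self.act(self.down(x)))


class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update (B initialized to zero)."""
    def __init__(self, base: nn.Linear, rank):
        super().__init__()
        self.base = base                      # stands in for a frozen pre-trained projection
        for p in self.base.parameters():
            p.requires_grad = False
        self.A = nn.Linear(base.in_features, rank, bias=False)
        self.B = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.B.weight)         # start as an identity perturbation

    def forward(self, x):
        return self.base(x) + self.B(self.A(x))


class PromptedBlock(nn.Module):
    """Transformer block whose prompt-module sizes are search variables."""
    def __init__(self, dim=768, heads=12, adapter_dim=8, lora_rank=4, prompt_len=10):
        super().__init__()
        self.prompt = nn.Parameter(torch.zeros(1, prompt_len, dim))   # VPT tokens
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.proj_lora = LoRALinear(nn.Linear(dim, dim), lora_rank)   # LoRA on a projection
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
        self.adapter = Adapter(dim, adapter_dim)                      # adapter after the MLP

    def forward(self, x):
        # Prepend trainable prompt tokens for this block.
        p = self.prompt.expand(x.size(0), -1, -1)
        x = torch.cat([p, x], dim=1)
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h)
        x = x + self.proj_lora(attn_out)                 # LoRA-augmented residual branch
        x = self.adapter(x + self.mlp(self.norm2(x)))    # adapter wraps the MLP output
        return x[:, p.size(1):]                          # drop prompts before the next block


if __name__ == "__main__":
    block = PromptedBlock()
    tokens = torch.randn(2, 197, 768)    # batch of ViT patch tokens
    print(block(tokens).shape)           # torch.Size([2, 197, 768])
```

In a search setting, only the prompt-module parameters would be trainable while the backbone stays frozen, and a supernet would sample different (adapter_dim, lora_rank, prompt_len) combinations per block to find the best configuration for each dataset.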
