Exploring Universal Intrinsic Task Subspace for Few-Shot Learning via Prompt Tuning

Impact factor: 4.1 · CAS Region 2, Computer Science · Q1, Acoustics
Yujia Qin; Xiaozhi Wang; Yusheng Su; Yankai Lin; Ning Ding; Jing Yi; Weize Chen; Zhiyuan Liu; Juanzi Li; Lei Hou; Peng Li; Maosong Sun; Jie Zhou
{"title":"通过提示调整探索通用固有任务子空间,实现快速学习","authors":"Yujia Qin;Xiaozhi Wang;Yusheng Su;Yankai Lin;Ning Ding;Jing Yi;Weize Chen;Zhiyuan Liu;Juanzi Li;Lei Hou;Peng Li;Maosong Sun;Jie Zhou","doi":"10.1109/TASLP.2024.3430545","DOIUrl":null,"url":null,"abstract":"Why can pre-trained language models (PLMs) learn universal representations and effectively adapt to broad NLP tasks differing a lot superficially? In this work, we empirically find evidence indicating that the adaptations of PLMs to various few-shot tasks can be reparameterized as optimizing only a few free parameters in a unified low-dimensional \n<italic>intrinsic task subspace</i>\n, which may help us understand why PLMs could easily adapt to various NLP tasks with small-scale data. To find such a subspace and examine its universality, we propose an analysis pipeline called \n<italic>intrinsic prompt tuning</i>\n (IPT). Specifically, we resort to the recent success of prompt tuning and decompose the soft prompts of multiple NLP tasks into the same low-dimensional nonlinear subspace, then we learn to adapt the PLM to unseen data or tasks by only tuning parameters in this subspace. In the experiments, we study diverse few-shot NLP tasks and surprisingly find that in a 250-dimensional subspace found with 100 tasks, by only tuning 250 free parameters, we can recover 97% and 83% of the full prompt tuning performance for 100 seen tasks (using different training data) and 20 unseen tasks, respectively, showing great generalization ability of the found intrinsic task subspace. Besides being an analysis tool, IPTcould further help us improve the prompt tuning stability.","PeriodicalId":13332,"journal":{"name":"IEEE/ACM Transactions on Audio, Speech, and Language Processing","volume":"32 ","pages":"3631-3643"},"PeriodicalIF":4.1000,"publicationDate":"2024-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10603438","citationCount":"0","resultStr":"{\"title\":\"Exploring Universal Intrinsic Task Subspace for Few-Shot Learning via Prompt Tuning\",\"authors\":\"Yujia Qin;Xiaozhi Wang;Yusheng Su;Yankai Lin;Ning Ding;Jing Yi;Weize Chen;Zhiyuan Liu;Juanzi Li;Lei Hou;Peng Li;Maosong Sun;Jie Zhou\",\"doi\":\"10.1109/TASLP.2024.3430545\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Why can pre-trained language models (PLMs) learn universal representations and effectively adapt to broad NLP tasks differing a lot superficially? In this work, we empirically find evidence indicating that the adaptations of PLMs to various few-shot tasks can be reparameterized as optimizing only a few free parameters in a unified low-dimensional \\n<italic>intrinsic task subspace</i>\\n, which may help us understand why PLMs could easily adapt to various NLP tasks with small-scale data. To find such a subspace and examine its universality, we propose an analysis pipeline called \\n<italic>intrinsic prompt tuning</i>\\n (IPT). Specifically, we resort to the recent success of prompt tuning and decompose the soft prompts of multiple NLP tasks into the same low-dimensional nonlinear subspace, then we learn to adapt the PLM to unseen data or tasks by only tuning parameters in this subspace. 
In the experiments, we study diverse few-shot NLP tasks and surprisingly find that in a 250-dimensional subspace found with 100 tasks, by only tuning 250 free parameters, we can recover 97% and 83% of the full prompt tuning performance for 100 seen tasks (using different training data) and 20 unseen tasks, respectively, showing great generalization ability of the found intrinsic task subspace. Besides being an analysis tool, IPTcould further help us improve the prompt tuning stability.\",\"PeriodicalId\":13332,\"journal\":{\"name\":\"IEEE/ACM Transactions on Audio, Speech, and Language Processing\",\"volume\":\"32 \",\"pages\":\"3631-3643\"},\"PeriodicalIF\":4.1000,\"publicationDate\":\"2024-07-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10603438\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE/ACM Transactions on Audio, Speech, and Language Processing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10603438/\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ACOUSTICS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE/ACM Transactions on Audio, Speech, and Language Processing","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10603438/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ACOUSTICS","Score":null,"Total":0}
Citations: 0

Abstract

Why can pre-trained language models (PLMs) learn universal representations and effectively adapt to broad NLP tasks that differ greatly on the surface? In this work, we empirically find evidence indicating that the adaptations of PLMs to various few-shot tasks can be reparameterized as optimizing only a few free parameters in a unified low-dimensional intrinsic task subspace, which may help explain why PLMs can easily adapt to various NLP tasks with small-scale data. To find such a subspace and examine its universality, we propose an analysis pipeline called intrinsic prompt tuning (IPT). Specifically, we build on the recent success of prompt tuning and decompose the soft prompts of multiple NLP tasks into the same low-dimensional nonlinear subspace; we then adapt the PLM to unseen data or tasks by tuning only the parameters in this subspace. In the experiments, we study diverse few-shot NLP tasks and surprisingly find that, in a 250-dimensional subspace found with 100 tasks, tuning only 250 free parameters recovers 97% and 83% of the full prompt tuning performance for 100 seen tasks (using different training data) and 20 unseen tasks, respectively, showing the strong generalization ability of the found intrinsic task subspace. Besides serving as an analysis tool, IPT could further help improve prompt tuning stability.
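
The abstract describes IPT as decomposing task-specific soft prompts into a shared low-dimensional nonlinear subspace, so that adapting to a new task means tuning only the few intrinsic parameters and decoding them back into a full soft prompt. The sketch below illustrates that idea only under stated assumptions: the class name IntrinsicPromptDecoder, the MLP decoder architecture, and all dimensions are illustrative choices, not the authors' exact implementation.

```python
# Minimal sketch of the intrinsic prompt tuning (IPT) idea: a small intrinsic
# vector is decoded by a shared nonlinear projection into a full soft prompt,
# and only the intrinsic vector is tuned for a new task. All names and sizes
# below are assumptions for illustration.
import torch
import torch.nn as nn


class IntrinsicPromptDecoder(nn.Module):
    """Maps a low-dimensional intrinsic vector to a soft prompt."""

    def __init__(self, intrinsic_dim=250, prompt_len=100, hidden_dim=768):
        super().__init__()
        self.prompt_len = prompt_len
        self.hidden_dim = hidden_dim
        # Shared nonlinear projection: in IPT this would be trained on many
        # seen tasks and then frozen when adapting to new tasks.
        self.decoder = nn.Sequential(
            nn.Linear(intrinsic_dim, 512),
            nn.Tanh(),
            nn.Linear(512, prompt_len * hidden_dim),
        )

    def forward(self, intrinsic_vector):
        # (intrinsic_dim,) -> (prompt_len, hidden_dim) soft prompt embeddings,
        # which would be prepended to the frozen PLM's input embeddings.
        flat = self.decoder(intrinsic_vector)
        return flat.view(self.prompt_len, self.hidden_dim)


# Adapting to an unseen task: freeze the decoder (and the PLM), and optimize
# only the 250 free parameters of the intrinsic vector.
decoder = IntrinsicPromptDecoder()
for p in decoder.parameters():
    p.requires_grad = False

intrinsic_vector = nn.Parameter(torch.zeros(250))
optimizer = torch.optim.Adam([intrinsic_vector], lr=1e-3)

soft_prompt = decoder(intrinsic_vector)
print(soft_prompt.shape)  # torch.Size([100, 768])

# Placeholder objective standing in for the task loss computed by the PLM;
# gradients flow only into the intrinsic vector.
loss = soft_prompt.pow(2).mean()
loss.backward()
optimizer.step()
```

This mirrors the parameter counts quoted in the abstract (a 250-dimensional intrinsic vector per task), but the actual decoder architecture and training procedure should be taken from the paper itself.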
Source journal
IEEE/ACM Transactions on Audio, Speech, and Language Processing (Acoustics; Engineering, Electrical & Electronic)
CiteScore: 11.30
Self-citation rate: 11.10%
Articles published: 217
Journal description: The IEEE/ACM Transactions on Audio, Speech, and Language Processing covers audio, speech and language processing and the sciences that support them. In audio processing: transducers, room acoustics, active sound control, human audition, analysis/synthesis/coding of music, and consumer audio. In speech processing: areas such as speech analysis, synthesis, coding, speech and speaker recognition, speech production and perception, and speech enhancement. In language processing: speech and text analysis, understanding, generation, dialog management, translation, summarization, question answering and document indexing and retrieval, as well as general language modeling.