Shaping pre-trained language models for task-specific embedding generation via consistency calibration

Impact Factor: 6.0 · CAS Tier 1, Computer Science · JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Jianqi Gao, Hang Yu, Yiu-ming Cheung, Jian Cao, Raymond Chi-Wing Wong, Yonggang Zhang
{"title":"通过一致性校准为特定任务嵌入生成塑造预训练语言模型","authors":"Jianqi Gao ,&nbsp;Hang Yu ,&nbsp;Yiu-ming Cheung ,&nbsp;Jian Cao ,&nbsp;Raymond Chi-Wing Wong ,&nbsp;Yonggang Zhang","doi":"10.1016/j.neunet.2025.107754","DOIUrl":null,"url":null,"abstract":"<div><div>Pre-trained language models (PLMs) have shown significant success in various downstream tasks by providing initial parameters for task-specific fine-tuning. An inherent challenge of this approach is that adapting solely to downstream tasks may lead to the forgetting of pre-trained knowledge, resulting in limited fine-tuning performance on downstream tasks. To tackle this challenge, we propose a novel approach called EGO-PLM, where PLMs serve as task-specific <u>e</u>mbedding <u>g</u>enerat<u>o</u>r. The underlying insight of EGO-PLM is to align the fine-tuning tasks for PLMs with those utilized during the pre-training phase. Within this framework, we design a task-agnostic pre-defined task that is similar to the pre-training phase and a task-specific embedding generator to adapt to specific tasks, enabling the specific task can be trained jointly with the pre-defined task. To alleviate task conflicts between pre-defined and task-specific tasks and make sure the generated embedding are task-specific, we propose <em>co</em>nsistency <em>ca</em>libration (CoCa), which aligns the pre-defined objectives with the task-specific ones. Specifically, CoCa identifies inconsistencies between the pre-defined and task-specific objectives in an adversarial manner, subsequently calibrating these disparities through adversarial training. We validate the effectiveness of EGO-PLM using <strong>8</strong> datasets across <strong>6</strong> task categories, demonstrating consistent and substantial improvements compared to state-of-the-art baselines.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"191 ","pages":"Article 107754"},"PeriodicalIF":6.0000,"publicationDate":"2025-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Shaping pre-trained language models for task-specific embedding generation via consistency calibration\",\"authors\":\"Jianqi Gao ,&nbsp;Hang Yu ,&nbsp;Yiu-ming Cheung ,&nbsp;Jian Cao ,&nbsp;Raymond Chi-Wing Wong ,&nbsp;Yonggang Zhang\",\"doi\":\"10.1016/j.neunet.2025.107754\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Pre-trained language models (PLMs) have shown significant success in various downstream tasks by providing initial parameters for task-specific fine-tuning. An inherent challenge of this approach is that adapting solely to downstream tasks may lead to the forgetting of pre-trained knowledge, resulting in limited fine-tuning performance on downstream tasks. To tackle this challenge, we propose a novel approach called EGO-PLM, where PLMs serve as task-specific <u>e</u>mbedding <u>g</u>enerat<u>o</u>r. The underlying insight of EGO-PLM is to align the fine-tuning tasks for PLMs with those utilized during the pre-training phase. Within this framework, we design a task-agnostic pre-defined task that is similar to the pre-training phase and a task-specific embedding generator to adapt to specific tasks, enabling the specific task can be trained jointly with the pre-defined task. 
To alleviate task conflicts between pre-defined and task-specific tasks and make sure the generated embedding are task-specific, we propose <em>co</em>nsistency <em>ca</em>libration (CoCa), which aligns the pre-defined objectives with the task-specific ones. Specifically, CoCa identifies inconsistencies between the pre-defined and task-specific objectives in an adversarial manner, subsequently calibrating these disparities through adversarial training. We validate the effectiveness of EGO-PLM using <strong>8</strong> datasets across <strong>6</strong> task categories, demonstrating consistent and substantial improvements compared to state-of-the-art baselines.</div></div>\",\"PeriodicalId\":49763,\"journal\":{\"name\":\"Neural Networks\",\"volume\":\"191 \",\"pages\":\"Article 107754\"},\"PeriodicalIF\":6.0000,\"publicationDate\":\"2025-06-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neural Networks\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0893608025006343\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Networks","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0893608025006343","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Pre-trained language models (PLMs) have shown significant success in various downstream tasks by providing initial parameters for task-specific fine-tuning. An inherent challenge of this approach is that adapting solely to downstream tasks may lead to forgetting of pre-trained knowledge, resulting in limited fine-tuning performance on those tasks. To tackle this challenge, we propose a novel approach called EGO-PLM, in which PLMs serve as task-specific embedding generators. The underlying insight of EGO-PLM is to align the fine-tuning tasks for PLMs with those used during the pre-training phase. Within this framework, we design a task-agnostic pre-defined task that resembles the pre-training phase and a task-specific embedding generator that adapts to specific tasks, so that the specific task can be trained jointly with the pre-defined one. To alleviate conflicts between the pre-defined and task-specific tasks and to ensure that the generated embeddings are task-specific, we propose consistency calibration (CoCa), which aligns the pre-defined objectives with the task-specific ones. Specifically, CoCa identifies inconsistencies between the pre-defined and task-specific objectives in an adversarial manner and then calibrates these disparities through adversarial training. We validate the effectiveness of EGO-PLM on 8 datasets across 6 task categories, demonstrating consistent and substantial improvements over state-of-the-art baselines.
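The abstract gives no implementation details, so the sketch below only illustrates the joint-training idea under explicit assumptions: masked language modeling stands in for the pre-defined, pre-training-like task, a linear classifier over a mean-pooled embedding stands in for the task-specific task, and a simple FGSM-style perturbation of the generated embedding stands in for the adversarial consistency-calibration step. The names `EgoPLMSketch` and `joint_loss` are hypothetical and are not taken from the paper.

```python
# Minimal sketch of EGO-PLM-style joint training (assumptions throughout):
# MLM as the pre-defined task, a linear classifier over a mean-pooled
# embedding as the task-specific task, and an FGSM-style adversarial
# perturbation as a stand-in for consistency calibration (CoCa).
import torch
import torch.nn as nn
import torch.nn.functional as F


class EgoPLMSketch(nn.Module):
    """Toy stand-in for a PLM backbone used as a task-specific embedding generator."""

    def __init__(self, vocab_size=30522, hidden=256, num_labels=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        layer = nn.TransformerEncoderLayer(hidden, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.mlm_head = nn.Linear(hidden, vocab_size)  # pre-defined (MLM-like) task
        self.cls_head = nn.Linear(hidden, num_labels)  # task-specific task

    def forward(self, input_ids):
        return self.encoder(self.embed(input_ids))  # (batch, seq_len, hidden)


def joint_loss(model, input_ids, mlm_labels, cls_labels, lam=1.0, eps=1e-2):
    h = model(input_ids)
    pooled = h.mean(dim=1)  # generated sentence embedding

    # Pre-defined objective: predict masked tokens, as in pre-training.
    mlm_loss = F.cross_entropy(
        model.mlm_head(h).flatten(0, 1), mlm_labels.flatten(), ignore_index=-100
    )
    # Task-specific objective computed on the generated embedding.
    cls_loss = F.cross_entropy(model.cls_head(pooled), cls_labels)

    # Stand-in for CoCa: find a small perturbation of the embedding that
    # increases the task-specific loss, then train the model to remain
    # consistent under that perturbation (adversarial calibration).
    (grad,) = torch.autograd.grad(cls_loss, pooled, retain_graph=True)
    delta = eps * grad.sign().detach()
    coca_loss = F.cross_entropy(model.cls_head(pooled + delta), cls_labels)

    return mlm_loss + cls_loss + lam * coca_loss


# Usage with random data, just to show the shapes involved.
model = EgoPLMSketch()
input_ids = torch.randint(0, 30522, (4, 16))
mlm_labels = input_ids.clone()
mlm_labels[:, ::2] = -100  # supervise only a subset of token positions
cls_labels = torch.randint(0, 2, (4,))
joint_loss(model, input_ids, mlm_labels, cls_labels).backward()
```

The single backward pass through the summed loss is what "joint training" means here; how the actual method weights the two objectives or constructs the adversarial calibration is not specified in the abstract.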
Source journal
Neural Networks (Engineering & Technology – Computer Science: Artificial Intelligence)
CiteScore: 13.90
Self-citation rate: 7.70%
Articles per year: 425
Review time: 67 days
Journal description: Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.