Jianqi Gao, Hang Yu, Yiu-ming Cheung, Jian Cao, Raymond Chi-Wing Wong, Yonggang Zhang
Neural Networks, Volume 191, Article 107754. DOI: 10.1016/j.neunet.2025.107754. Published 2025-06-21. https://www.sciencedirect.com/science/article/pii/S0893608025006343
Shaping pre-trained language models for task-specific embedding generation via consistency calibration
Pre-trained language models (PLMs) have shown significant success in various downstream tasks by providing initial parameters for task-specific fine-tuning. An inherent challenge of this approach is that adapting solely to downstream tasks may lead to the forgetting of pre-trained knowledge, resulting in limited fine-tuning performance on downstream tasks. To tackle this challenge, we propose a novel approach called EGO-PLM, where PLMs serve as task-specific embedding generators. The underlying insight of EGO-PLM is to align the fine-tuning tasks for PLMs with those utilized during the pre-training phase. Within this framework, we design a task-agnostic pre-defined task that is similar to the pre-training phase and a task-specific embedding generator that adapts to specific tasks, enabling the specific task to be trained jointly with the pre-defined task. To alleviate conflicts between the pre-defined and task-specific tasks and ensure the generated embeddings are task-specific, we propose consistency calibration (CoCa), which aligns the pre-defined objectives with the task-specific ones. Specifically, CoCa identifies inconsistencies between the pre-defined and task-specific objectives in an adversarial manner, subsequently calibrating these disparities through adversarial training. We validate the effectiveness of EGO-PLM using 8 datasets across 6 task categories, demonstrating consistent and substantial improvements compared to state-of-the-art baselines.
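The abstract describes joint training of a pre-defined (pre-training-like) objective with a task-specific one, plus an adversarial consistency term. As the paper's actual losses and architecture are not given here, the following is only a hypothetical toy sketch of that idea: all function names, loss forms, and the FGSM-style perturbation are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def embed(x, W):
    """Hypothetical embedding generator: a single linear map for illustration."""
    return x @ W

def predefined_loss(z, z_ref):
    """Stand-in for the pre-defined, pre-training-like objective:
    keep embeddings close to reference (pre-trained) embeddings."""
    return float(np.mean((z - z_ref) ** 2))

def task_loss(z, y, V):
    """Stand-in for the task-specific objective: squared error of a linear head."""
    return float(np.mean((z @ V - y) ** 2))

def consistency_gap(x, W, z_ref, y, V, eps=0.1):
    """Adversarially probe inconsistency between the two objectives:
    perturb inputs in the direction that increases the pre-defined loss,
    then measure how much the task loss shifts (a CoCa-like penalty)."""
    # Gradient of the pre-defined loss w.r.t. x, used for an FGSM-style sign step.
    g = 2 * (embed(x, W) - z_ref) @ W.T / x.shape[0]
    x_adv = x + eps * np.sign(g)
    return abs(task_loss(embed(x_adv, W), y, V) - task_loss(embed(x, W), y, V))

# Toy data: 32 inputs of dim 8, embeddings of dim 4, scalar targets.
x = rng.normal(size=(32, 8))
W = rng.normal(size=(8, 4)) * 0.1
V = rng.normal(size=(4, 1)) * 0.1
z_ref = rng.normal(size=(32, 4))
y = rng.normal(size=(32, 1))

z = embed(x, W)
# Joint objective: task loss + weighted pre-defined loss + consistency penalty.
total = (task_loss(z, y, V)
         + 0.5 * predefined_loss(z, z_ref)
         + 0.5 * consistency_gap(x, W, z_ref, y, V))
print(total)
```

In an actual training loop, `total` would be minimized over the generator parameters, with the adversarial probe recomputed each step; the weights 0.5 are arbitrary placeholders.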
Journal overview:
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.