Learning how to transfer: A lifelong domain knowledge distillation framework for continual MRC

Intelligent Systems with Applications. Pub Date: 2025-03-08. DOI: 10.1016/j.iswa.2025.200497

Songze Li, Zhijing Wu, Runmin Cao, Xiaohan Zhang, Yifan Wang, Hua Xu, Kai Gao

Volume 26, Article 200497. Full text: https://www.sciencedirect.com/science/article/pii/S2667305325000237

Citations: 0

Abstract

Machine Reading Comprehension (MRC) has attracted wide attention in recent years. It reflects how well a machine understands human language. Benefiting from increasingly large-scale benchmarks and pre-trained language models, many MRC models have achieved remarkable success and even exceeded human performance. However, real-world MRC systems need to learn incrementally from a continuous data stream over time without access to previously seen data, a setting called Continual MRC. Learning a new domain incrementally without catastrophically forgetting previous knowledge is a major challenge. This paper proposes MK-MRC (an extension of MA-MRC), a continual MRC framework with uncertainty-aware fixed Memory and lifelong domain Knowledge distillation. MK-MRC is a memory-replay-based method: a fixed-size memory buffer stores a small number of samples from previous domains and is updated with an uncertainty-aware strategy when new domain data arrives. For incremental learning, MK-MRC exploits the domain adaptation and transfer relationships between the memory and the new domain data through several domain knowledge distillation strategies.
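The fixed-size memory with uncertainty-aware updating described above can be illustrated with a minimal sketch. The abstract does not specify how uncertainty is measured or which samples are retained, so everything here is an assumption made for illustration: uncertainty is taken to be the entropy of the model's predicted distribution, and the buffer keeps the most uncertain samples seen so far. The names `FixedMemory` and `predictive_entropy` are hypothetical and are not taken from the paper.

```python
import heapq
import math
from typing import Any, List, Sequence, Tuple


def predictive_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a predicted distribution; higher means more uncertain."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


class FixedMemory:
    """Fixed-size replay buffer (assumed policy: keep the highest-entropy samples)."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        # Min-heap of (uncertainty, insertion index, sample); the index breaks
        # ties so that the samples themselves are never compared.
        self._heap: List[Tuple[float, int, Any]] = []
        self._counter = 0

    def update(self, samples: Sequence[Any], probs: Sequence[Sequence[float]]) -> None:
        """Offer samples from a newly arrived domain; the buffer size never exceeds capacity."""
        for sample, dist in zip(samples, probs):
            entry = (predictive_entropy(dist), self._counter, sample)
            self._counter += 1
            if len(self._heap) < self.capacity:
                heapq.heappush(self._heap, entry)
            elif entry[0] > self._heap[0][0]:
                # Evict the least uncertain stored sample in favour of the new one.
                heapq.heapreplace(self._heap, entry)

    def samples(self) -> List[Any]:
        """Stored samples, most uncertain first."""
        return [s for _, _, s in sorted(self._heap, reverse=True)]


memory = FixedMemory(capacity=2)
memory.update(["a", "b", "c"], [[1.0, 0.0], [0.5, 0.5], [0.9, 0.1]])
print(memory.samples())
```

Using a min-heap keeps each update at logarithmic cost in the buffer size. A real system would likely also balance the buffer across domains, which this sketch does not attempt.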

Compared with MA-MRC, MK-MRC introduces additional strategies to strengthen continual learning, such as data augmentation and task-related knowledge distillation. Experimental results show that MK-MRC consistently improves over strong baselines and shows substantial incremental learning ability without catastrophic forgetting under four continual span-extractive and multiple-choice MRC settings.
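As a rough illustration of the knowledge distillation idea mentioned in the abstract, the sketch below computes a standard temperature-softened distillation loss between a teacher (for example, the model trained on previous domains) and a student (the model being trained on the new domain). This is the generic formulation of distillation, not the paper's specific strategies, which the abstract does not detail. The function names and the default temperature are assumptions chosen for illustration.

```python
import math
from typing import List, Sequence


def log_softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_norm = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_norm for z in scaled]


def distillation_loss(
    teacher_logits: Sequence[float],
    student_logits: Sequence[float],
    temperature: float = 2.0,
) -> float:
    """KL(teacher || student) on softened distributions, scaled by temperature squared."""
    log_p = log_softmax(teacher_logits, temperature)  # teacher
    log_q = log_softmax(student_logits, temperature)  # student
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return temperature ** 2 * kl


# For span-extractive MRC, one plausible use (an assumption) is to apply the
# loss separately to the start-position and end-position logits and sum them.
teacher_start = [2.0, 0.5, -1.0]
student_start = [1.5, 0.7, -0.5]
print(distillation_loss(teacher_start, student_start))
```

In training, such a term would typically be added to the ordinary task loss with a weighting coefficient, so that the student fits the new domain while staying close to the teacher's predictions on replayed memory samples.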

Source journal

Intelligent Systems with Applications

CiteScore

5.60

Self-citation rate

0.00%

Number of articles published