基于持续学习的指令调整框架（Multi-LoRA），用于通用信息提取

IF 7.2 1区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Knowledge-Based Systems Pub Date : 2024-11-22 DOI:10.1016/j.knosys.2024.112750

Yu Jin, Jie Liu, Shaowei Chen

{"title":"基于持续学习的指令调整框架（Multi-LoRA），用于通用信息提取","authors":"Yu Jin, Jie Liu, Shaowei Chen","doi":"10.1016/j.knosys.2024.112750","DOIUrl":null,"url":null,"abstract":"<div><div>Universal information extraction (Universal IE) aims to develop one model capable of solving multiple IE target tasks. Previous works have enhanced extraction performance of target tasks through auxiliary tasks. However, there are still limitations in terms of learning strategies. From one aspect, joint learning-based universal IE approaches, which simply mix auxiliary tasks with target tasks, fail to enable the model to master basic knowledge from auxiliary tasks before learning target tasks. From another aspect, continual learning-based universal IE approaches, which sequentially update all the model parameters on auxiliary tasks and target tasks, tend to cause catastrophic forgetting. In this study, we design a multi-LoRA continual learning-based instruction fine-tuning framework for universal IE. Specifically, we design unique LoRA modules for learning auxiliary tasks and target tasks. We first freeze pre-trained weights and update additional parameters on auxiliary tasks through one LoRA module. Subsequently, we keep the weights frozen and further adjust parameters through another LoRA module to adapt the model to the target tasks. Finally, we merge the frozen weights with learned weights, thereby enabling the model to better leverage the acquired abilities during the inference phase. Therefore, our model masters basic extraction abilities before learning target tasks and does not forget this basic knowledge during the target learning process. Moreover, we regard extraction, classification, and recognition as basic abilities and further design auxiliary tasks based on these basic abilities. Experimental results on 37 datasets across 3 tasks show that our approach reaches state-of-the-art performance.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"308 ","pages":"Article 112750"},"PeriodicalIF":7.2000,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Multi-LoRA continual learning based instruction tuning framework for universal information extraction\",\"authors\":\"Yu Jin, Jie Liu, Shaowei Chen\",\"doi\":\"10.1016/j.knosys.2024.112750\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Universal information extraction (Universal IE) aims to develop one model capable of solving multiple IE target tasks. Previous works have enhanced extraction performance of target tasks through auxiliary tasks. However, there are still limitations in terms of learning strategies. From one aspect, joint learning-based universal IE approaches, which simply mix auxiliary tasks with target tasks, fail to enable the model to master basic knowledge from auxiliary tasks before learning target tasks. From another aspect, continual learning-based universal IE approaches, which sequentially update all the model parameters on auxiliary tasks and target tasks, tend to cause catastrophic forgetting. In this study, we design a multi-LoRA continual learning-based instruction fine-tuning framework for universal IE. Specifically, we design unique LoRA modules for learning auxiliary tasks and target tasks. We first freeze pre-trained weights and update additional parameters on auxiliary tasks through one LoRA module. Subsequently, we keep the weights frozen and further adjust parameters through another LoRA module to adapt the model to the target tasks. Finally, we merge the frozen weights with learned weights, thereby enabling the model to better leverage the acquired abilities during the inference phase. Therefore, our model masters basic extraction abilities before learning target tasks and does not forget this basic knowledge during the target learning process. Moreover, we regard extraction, classification, and recognition as basic abilities and further design auxiliary tasks based on these basic abilities. Experimental results on 37 datasets across 3 tasks show that our approach reaches state-of-the-art performance.</div></div>\",\"PeriodicalId\":49939,\"journal\":{\"name\":\"Knowledge-Based Systems\",\"volume\":\"308 \",\"pages\":\"Article 112750\"},\"PeriodicalIF\":7.2000,\"publicationDate\":\"2024-11-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Knowledge-Based Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0950705124013844\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Knowledge-Based Systems","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0950705124013844","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

通用信息提取（Universal IE）旨在开发一种能够解决多种信息提取目标任务的模型。以往的研究通过辅助任务提高了目标任务的提取性能。然而，在学习策略方面仍存在局限性。一方面，基于联合学习的通用 IE 方法只是将辅助任务与目标任务混合在一起，无法让模型在学习目标任务之前掌握辅助任务的基本知识。另一方面，基于持续学习的通用 IE 方法会依次更新辅助任务和目标任务的所有模型参数，容易造成灾难性遗忘。在本研究中，我们为通用智能教育设计了一个基于持续学习的多 LoRA 指令微调框架。具体来说，我们为学习辅助任务和目标任务设计了独特的 LoRA 模块。我们首先冻结预先训练的权重，并通过一个 LoRA 模块更新辅助任务的附加参数。随后，我们保持权重冻结，并通过另一个 LoRA 模块进一步调整参数，使模型适应目标任务。最后，我们将冻结的权重与学习到的权重合并，从而使模型在推理阶段更好地利用已获得的能力。因此，我们的模型在学习目标任务之前就掌握了基本的提取能力，并且在目标学习过程中不会遗忘这些基本知识。此外，我们将提取、分类和识别视为基本能力，并在这些基本能力的基础上进一步设计辅助任务。在 3 个任务的 37 个数据集上的实验结果表明，我们的方法达到了最先进的性能。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Multi-LoRA continual learning based instruction tuning framework for universal information extraction

Universal information extraction (Universal IE) aims to develop one model capable of solving multiple IE target tasks. Previous works have enhanced extraction performance of target tasks through auxiliary tasks. However, there are still limitations in terms of learning strategies. From one aspect, joint learning-based universal IE approaches, which simply mix auxiliary tasks with target tasks, fail to enable the model to master basic knowledge from auxiliary tasks before learning target tasks. From another aspect, continual learning-based universal IE approaches, which sequentially update all the model parameters on auxiliary tasks and target tasks, tend to cause catastrophic forgetting. In this study, we design a multi-LoRA continual learning-based instruction fine-tuning framework for universal IE. Specifically, we design unique LoRA modules for learning auxiliary tasks and target tasks. We first freeze pre-trained weights and update additional parameters on auxiliary tasks through one LoRA module. Subsequently, we keep the weights frozen and further adjust parameters through another LoRA module to adapt the model to the target tasks. Finally, we merge the frozen weights with learned weights, thereby enabling the model to better leverage the acquired abilities during the inference phase. Therefore, our model masters basic extraction abilities before learning target tasks and does not forget this basic knowledge during the target learning process. Moreover, we regard extraction, classification, and recognition as basic abilities and further design auxiliary tasks based on these basic abilities. Experimental results on 37 datasets across 3 tasks show that our approach reaches state-of-the-art performance.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Knowledge-Based Systems 工程技术-计算机：人工智能

CiteScore

14.80

自引率

12.50%

发文量

1245

审稿时长

7.8 months

期刊介绍： Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on knowledge-based and other artificial intelligence techniques-based systems. The journal aims to support human prediction and decision-making through data science and computation techniques, provide a balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.