{"title":"中文财经文本情绪分析的有效领域自适应微调框架","authors":"Guofeng Yan, Kuashuai Peng, Yongfeng Wang, Hengliang Tan, Jiao Du, Heng Wu","doi":"10.1007/s10489-025-06578-z","DOIUrl":null,"url":null,"abstract":"<div><p>Given the prevalence of pre-trained language models (PLMs) within the field of natural language processing, it has become evident that the conventional two-stage approach of <i>‘pre-training’</i>-then-<i>‘fine-tuning’</i> consistently yields commendable outcomes. Nevertheless, most publicly accessible PLMs are pre-trained on extensive, general-purpose datasets, thereby failing to address the substantial domain dissimilarity between the source and target data. This discrepancy has significant implications for the adaptability of PLMs to specific domains. To address this issue, our study proposes AdaFT, an efficient domain-adaptive fine-tuning framework that seeks to enhance the traditional fine-tuning process, thus bridging the gap between the source and target domains. This is particularly beneficial for enabling PLMs to better align with the specialized context of the Chinese financial domain. In contrast to the standard two-stage paradigm, AdaFT incorporates two additional stages: <i> ’multi-task further pre-training’</i> and <i> ’multi-model parameter fusion.’</i> In the first phase, the PLM undergoes a rapid, multi-task, parallel learning process, which effectively augments its proficiency in Chinese financial domain-related tasks. In the subsequent stage, we introduce an adaptive multi-model parameter fusion (AdaMFusion) strategy to amalgamate the knowledge acquired from the extended pre-training. To efficiently allocate weights for AdaMFusion, we have developed a local search algorithm with a decreasing step length, i.e., Local Search with Decreasing Step size (LSDS). The combination of AdaMFusion and LSDS algorithm strikes a balance between efficiency and performance, making it suitable for most scenarios. We also find that the optimal weighting factor assigned to a model to be fused is positively correlated with the performance improvement of that model on the target task after further pre-training. We demonstrate that further pre-training is generally effective, and further pre-training on domain-relevant corpora is more effective than on task-relevant corpora. Our extensive experiments, utilizing BERT (Bidirectional Encoder Representations from Transformers) as an illustrative example, indicate that Chinese BERT-base trained under the AdaFT framework attains an accuracy rate of 94.95% in the target task, marking a substantial 3.12% enhancement when compared to the conventional fine-tuning approach. 
Furthermore, our results demonstrate that AdaFT remains effective when applied to BERT-based variants, such as Chinese ALBERT-base.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 10","pages":""},"PeriodicalIF":3.4000,"publicationDate":"2025-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"AdaFT: An efficient domain-adaptive fine-tuning framework for sentiment analysis in chinese financial texts\",\"authors\":\"Guofeng Yan, Kuashuai Peng, Yongfeng Wang, Hengliang Tan, Jiao Du, Heng Wu\",\"doi\":\"10.1007/s10489-025-06578-z\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Given the prevalence of pre-trained language models (PLMs) within the field of natural language processing, it has become evident that the conventional two-stage approach of <i>‘pre-training’</i>-then-<i>‘fine-tuning’</i> consistently yields commendable outcomes. Nevertheless, most publicly accessible PLMs are pre-trained on extensive, general-purpose datasets, thereby failing to address the substantial domain dissimilarity between the source and target data. This discrepancy has significant implications for the adaptability of PLMs to specific domains. To address this issue, our study proposes AdaFT, an efficient domain-adaptive fine-tuning framework that seeks to enhance the traditional fine-tuning process, thus bridging the gap between the source and target domains. This is particularly beneficial for enabling PLMs to better align with the specialized context of the Chinese financial domain. In contrast to the standard two-stage paradigm, AdaFT incorporates two additional stages: <i> ’multi-task further pre-training’</i> and <i> ’multi-model parameter fusion.’</i> In the first phase, the PLM undergoes a rapid, multi-task, parallel learning process, which effectively augments its proficiency in Chinese financial domain-related tasks. In the subsequent stage, we introduce an adaptive multi-model parameter fusion (AdaMFusion) strategy to amalgamate the knowledge acquired from the extended pre-training. To efficiently allocate weights for AdaMFusion, we have developed a local search algorithm with a decreasing step length, i.e., Local Search with Decreasing Step size (LSDS). The combination of AdaMFusion and LSDS algorithm strikes a balance between efficiency and performance, making it suitable for most scenarios. We also find that the optimal weighting factor assigned to a model to be fused is positively correlated with the performance improvement of that model on the target task after further pre-training. We demonstrate that further pre-training is generally effective, and further pre-training on domain-relevant corpora is more effective than on task-relevant corpora. Our extensive experiments, utilizing BERT (Bidirectional Encoder Representations from Transformers) as an illustrative example, indicate that Chinese BERT-base trained under the AdaFT framework attains an accuracy rate of 94.95% in the target task, marking a substantial 3.12% enhancement when compared to the conventional fine-tuning approach. 
Furthermore, our results demonstrate that AdaFT remains effective when applied to BERT-based variants, such as Chinese ALBERT-base.</p></div>\",\"PeriodicalId\":8041,\"journal\":{\"name\":\"Applied Intelligence\",\"volume\":\"55 10\",\"pages\":\"\"},\"PeriodicalIF\":3.4000,\"publicationDate\":\"2025-04-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Applied Intelligence\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://link.springer.com/article/10.1007/s10489-025-06578-z\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied Intelligence","FirstCategoryId":"94","ListUrlMain":"https://link.springer.com/article/10.1007/s10489-025-06578-z","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
Given the prevalence of pre-trained language models (PLMs) within the field of natural language processing, it has become evident that the conventional two-stage approach of ‘pre-training’ then ‘fine-tuning’ consistently yields commendable outcomes. Nevertheless, most publicly accessible PLMs are pre-trained on extensive, general-purpose datasets, thereby failing to address the substantial domain dissimilarity between the source and target data. This discrepancy has significant implications for the adaptability of PLMs to specific domains. To address this issue, our study proposes AdaFT, an efficient domain-adaptive fine-tuning framework that seeks to enhance the traditional fine-tuning process, thus bridging the gap between the source and target domains. This is particularly beneficial for enabling PLMs to better align with the specialized context of the Chinese financial domain. In contrast to the standard two-stage paradigm, AdaFT incorporates two additional stages: ‘multi-task further pre-training’ and ‘multi-model parameter fusion’. In the first stage, the PLM undergoes a rapid, multi-task, parallel learning process, which effectively augments its proficiency in Chinese financial domain-related tasks. In the subsequent stage, we introduce an adaptive multi-model parameter fusion (AdaMFusion) strategy to amalgamate the knowledge acquired from the extended pre-training. To efficiently allocate weights for AdaMFusion, we have developed a local search algorithm with a decreasing step length, i.e., Local Search with Decreasing Step size (LSDS). The combination of AdaMFusion and the LSDS algorithm strikes a balance between efficiency and performance, making it suitable for most scenarios. We also find that the optimal weighting factor assigned to a model being fused is positively correlated with that model's performance improvement on the target task after further pre-training. We demonstrate that further pre-training is generally effective, and that further pre-training on domain-relevant corpora is more effective than on task-relevant corpora. Our extensive experiments, using BERT (Bidirectional Encoder Representations from Transformers) as an illustrative example, indicate that Chinese BERT-base trained under the AdaFT framework attains an accuracy of 94.95% on the target task, a substantial 3.12% improvement over the conventional fine-tuning approach. Furthermore, our results demonstrate that AdaFT remains effective when applied to BERT-based variants such as Chinese ALBERT-base.
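The abstract names the two AdaFT-specific components without giving implementation detail. As a rough, hedged illustration of one plausible reading, the Python sketch below treats AdaMFusion as a parameter-wise weighted average of several further-pre-trained checkpoints, and LSDS as a greedy local search over the fusion weights whose step size shrinks whenever no single-coordinate move improves the target-task score. The dict-of-floats stand-in for a state dict, the `evaluate` callback, and the step schedule are all assumptions made for illustration, not the authors' released implementation.

```python
"""Toy sketch (assumptions, not the paper's code): AdaMFusion-style weighted
parameter fusion plus an LSDS-like (Local Search with Decreasing Step size)
search over the fusion weights."""
from typing import Callable, Dict, List

# Stand-in for a full PyTorch state_dict: parameter name -> scalar value.
StateDict = Dict[str, float]


def fuse(state_dicts: List[StateDict], weights: List[float]) -> StateDict:
    """Parameter-wise weighted average of several further-pre-trained models."""
    total = sum(weights) or 1.0          # guard against an all-zero weight vector
    norm = [w / total for w in weights]  # normalize so fusion is a convex combination
    return {
        name: sum(w * sd[name] for w, sd in zip(norm, state_dicts))
        for name in state_dicts[0]
    }


def lsds_search(
    state_dicts: List[StateDict],
    evaluate: Callable[[StateDict], float],
    step: float = 0.2,
    min_step: float = 0.025,
) -> List[float]:
    """Greedy local search over fusion weights; the step size halves when stuck."""
    n = len(state_dicts)
    weights = [1.0 / n] * n                          # start from a uniform fusion
    best = evaluate(fuse(state_dicts, weights))
    while step >= min_step:
        improved = False
        for i in range(n):
            for delta in (step, -step):
                cand = list(weights)
                cand[i] = max(0.0, cand[i] + delta)  # keep weights non-negative
                score = evaluate(fuse(state_dicts, cand))
                if score > best:
                    best, weights, improved = score, cand, True
        if not improved:
            step /= 2.0                              # decreasing step size
    return weights


if __name__ == "__main__":
    # Two toy "models" with a single shared parameter; the best fusion is 0.7.
    models = [{"w": 0.5}, {"w": 1.0}]
    dev_score = lambda sd: -abs(sd["w"] - 0.7)       # toy stand-in for dev-set accuracy
    w = lsds_search(models, dev_score)
    print("fusion weights:", w, "-> fused parameter:", fuse(models, w))
```

In a real setting the state dicts would come from the multi-task further-pre-trained BERT checkpoints and `evaluate` would measure held-out accuracy on the Chinese financial sentiment task; the halving schedule simply mirrors the "decreasing step length" the abstract describes.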
Journal description:
With a focus on research in artificial intelligence and neural networks, this journal addresses issues involving solutions of real-life manufacturing, defense, management, government and industrial problems which are too complex to be solved through conventional approaches and require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance.
The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.