Aspect-based sentiment classification with BERT and AI feedback
Authors: Lingling Xu, Weiming Wang
Journal: Natural Language Processing Journal, volume 10, Article 100136
Published: 2025-02-21
DOI: 10.1016/j.nlp.2025.100136
URL: https://www.sciencedirect.com/science/article/pii/S2949719125000123
Citations: 0
Abstract
Data augmentation has been widely employed in low-resource aspect-based sentiment classification (ABSC) tasks to alleviate data sparsity and improve model performance. Previous data augmentation approaches rely on back translation, synonym replacement, or generative language models such as T5, whereas the generative power of large language models has rarely been explored. Large language models like GPT-3.5-turbo are trained on extensive datasets and corpora to capture semantic and contextual relationships between words and sentences. To exploit this, we propose Masked Aspect Term Prediction (MATP), a novel data augmentation method that uses the world knowledge and generative capacity of large language models to generate new aspect terms via word masking. By incorporating AI feedback from large language models, MATP increases the diversity and richness of aspect terms. Experimental results on the ABSC datasets with BERT as the backbone model show that adding the augmented datasets leads to significant improvements over baseline models, validating the effectiveness of the proposed data augmentation strategy that combines AI feedback.
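The abstract gives no implementation details for MATP, so the following is only a minimal sketch of the masking-and-substitution idea, not the authors' method. It assumes that the aspect term is replaced by a `[MASK]` placeholder, that some generator proposes replacement terms, and that the original sentiment label carries over to the new sentence. The names `mask_aspect`, `augment`, and `toy_propose` are hypothetical; `toy_propose` is a fixed stand-in for a large language model call, and the AI feedback step is not modeled here.

```python
from typing import Callable, List, Tuple

MASK = "[MASK]"  # assumed placeholder token; the paper's actual prompt format is not given


def mask_aspect(sentence: str, aspect: str) -> str:
    """Replace the first occurrence of the aspect term with the mask token."""
    if aspect not in sentence:
        raise ValueError(f"aspect term {aspect!r} not found in sentence")
    return sentence.replace(aspect, MASK, 1)


def augment(
    sentence: str,
    aspect: str,
    polarity: str,
    propose: Callable[[str], List[str]],
) -> List[Tuple[str, str, str]]:
    """Build new (sentence, aspect, polarity) triples from proposed aspect terms.

    Assumption: swapping the aspect term leaves the sentiment label unchanged.
    """
    masked = mask_aspect(sentence, aspect)
    seen = {aspect.lower()}  # skip the original term and case-insensitive repeats
    augmented = []
    for candidate in propose(masked):
        candidate = candidate.strip()
        if not candidate or candidate.lower() in seen:
            continue
        seen.add(candidate.lower())
        augmented.append((masked.replace(MASK, candidate, 1), candidate, polarity))
    return augmented


def toy_propose(masked_sentence: str) -> List[str]:
    """Hypothetical stand-in for an LLM that fills the masked slot."""
    return ["pasta", "pizza", "Pasta", "sushi "]


examples = augment(
    "The pizza was great but the service was slow.",
    "pizza",
    "positive",
    toy_propose,
)
for new_sentence, new_aspect, label in examples:
    print(new_aspect, label, new_sentence, sep=" | ")
```

In a real pipeline, `propose` would wrap a call to a large language model, and a separate feedback step could filter or re-rank the candidates before the augmented triples are added to the BERT training set.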