Neural Data Augmentation for Legal Overruling Task: Small Deep Learning Models vs. Large Language Models

IF 2.6 | CAS Tier 4 (Computer Science) | JCR Q3, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
{"title":"法律推翻任务的神经数据增强:小型深度学习模型与大型语言模型的对比","authors":"","doi":"10.1007/s11063-024-11574-4","DOIUrl":null,"url":null,"abstract":"<h3>Abstract</h3> <p>Deep learning models produce impressive results in any natural language processing applications when given a better learning strategy and trained with large labeled datasets. However, the annotation of massive training data is far too expensive, especially in the legal domain, due to the need for trained legal professionals. Data augmentation solves the problem of learning without labeled big data. In this paper, we employ pre-trained language models and prompt engineering to generate large-scale pseudo-labeled data for the legal overruling task using 100 data samples. We train small recurrent and convolutional deep-learning models using this data and fine-tune a few other transformer models. We then evaluate the effectiveness of the models, both with and without data augmentation, using the benchmark dataset and analyze the results. We also test the performance of these models with the state-of-the-art GPT-3 model under few-shot setting. Our experimental findings demonstrate that data augmentation results in better model performance in the legal overruling task than models trained without augmentation. Furthermore, our best-performing deep learning model trained on augmented data outperforms the few-shot GPT-3 by 18% in the F1-score. Additionally, our results highlight that the small neural networks trained with augmented data achieve outcomes comparable to those of other large language models.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"8 1","pages":""},"PeriodicalIF":2.6000,"publicationDate":"2024-03-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Neural Data Augmentation for Legal Overruling Task: Small Deep Learning Models vs. Large Language Models\",\"authors\":\"\",\"doi\":\"10.1007/s11063-024-11574-4\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<h3>Abstract</h3> <p>Deep learning models produce impressive results in any natural language processing applications when given a better learning strategy and trained with large labeled datasets. However, the annotation of massive training data is far too expensive, especially in the legal domain, due to the need for trained legal professionals. Data augmentation solves the problem of learning without labeled big data. In this paper, we employ pre-trained language models and prompt engineering to generate large-scale pseudo-labeled data for the legal overruling task using 100 data samples. We train small recurrent and convolutional deep-learning models using this data and fine-tune a few other transformer models. We then evaluate the effectiveness of the models, both with and without data augmentation, using the benchmark dataset and analyze the results. We also test the performance of these models with the state-of-the-art GPT-3 model under few-shot setting. Our experimental findings demonstrate that data augmentation results in better model performance in the legal overruling task than models trained without augmentation. Furthermore, our best-performing deep learning model trained on augmented data outperforms the few-shot GPT-3 by 18% in the F1-score. 
Additionally, our results highlight that the small neural networks trained with augmented data achieve outcomes comparable to those of other large language models.</p>\",\"PeriodicalId\":51144,\"journal\":{\"name\":\"Neural Processing Letters\",\"volume\":\"8 1\",\"pages\":\"\"},\"PeriodicalIF\":2.6000,\"publicationDate\":\"2024-03-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neural Processing Letters\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s11063-024-11574-4\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Processing Letters","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s11063-024-11574-4","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0


Abstract

Deep learning models produce impressive results in many natural language processing applications when given a suitable learning strategy and trained on large labeled datasets. However, annotating massive training data is far too expensive, especially in the legal domain, because it requires trained legal professionals. Data augmentation addresses the problem of learning without large labeled datasets. In this paper, we employ pre-trained language models and prompt engineering to generate large-scale pseudo-labeled data for the legal overruling task from 100 data samples. We train small recurrent and convolutional deep-learning models on this data and fine-tune a few other transformer models. We then evaluate the effectiveness of the models, both with and without data augmentation, on the benchmark dataset and analyze the results. We also compare the performance of these models with the state-of-the-art GPT-3 model in a few-shot setting. Our experimental findings demonstrate that models trained with data augmentation perform better on the legal overruling task than models trained without it. Furthermore, our best-performing deep learning model trained on augmented data outperforms few-shot GPT-3 by 18% in F1-score. Additionally, our results highlight that small neural networks trained with augmented data achieve outcomes comparable to those of other large language models.
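The core augmentation step the abstract describes, prompting a pre-trained language model with a handful of labeled seed sentences so that it emits new sentences carrying the prompted label, can be sketched roughly as follows. This is a minimal illustration only: the model name, prompt template, seed sentences, label names, and sampling settings are assumptions chosen for demonstration, not the authors' actual configuration.

```python
# Minimal sketch of prompt-based pseudo-label generation for the overruling task.
# The model choice, prompt wording, and sampling parameters below are illustrative
# assumptions, not the setup used in the paper.
from transformers import pipeline

# Any locally available causal LM could stand in here.
generator = pipeline("text-generation", model="gpt2-large")

# A couple of the ~100 gold samples would serve as in-context seed examples.
SEED_EXAMPLES = [
    ("We overrule Smith v. Jones to the extent it conflicts with today's holding.",
     "overruling"),
    ("The plaintiff moved for summary judgment on the negligence claim.",
     "non-overruling"),
]


def build_prompt(target_label: str) -> str:
    """Compose a few-shot prompt asking the LM to emit one more sentence of target_label."""
    shots = "\n".join(f"Sentence ({lab}): {txt}" for txt, lab in SEED_EXAMPLES)
    return f"{shots}\nSentence ({target_label}):"


def generate_pseudo_labeled(target_label: str, n: int = 5) -> list[tuple[str, str]]:
    """Sample n synthetic sentences and attach the prompted class as a pseudo-label."""
    prompt = build_prompt(target_label)
    outputs = generator(
        prompt,
        max_new_tokens=40,
        num_return_sequences=n,
        do_sample=True,
        temperature=0.9,
    )
    pairs = []
    for out in outputs:
        # Keep only the newly generated continuation, up to the first line break.
        text = out["generated_text"][len(prompt):].strip().split("\n")[0]
        if text:
            pairs.append((text, target_label))
    return pairs


augmented = generate_pseudo_labeled("overruling") + generate_pseudo_labeled("non-overruling")
```

Pseudo-labeled pairs produced this way would then be pooled with the original gold samples to train the small recurrent and convolutional classifiers that the paper, in turn, compares against few-shot GPT-3.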

Source journal: Neural Processing Letters (Engineering & Technology - Computer Science: Artificial Intelligence)
CiteScore: 4.90
Self-citation rate: 12.90%
Articles published: 392
Review time: 2.8 months
Journal description: Neural Processing Letters is an international journal publishing research results and innovative ideas on all aspects of artificial neural networks. Coverage includes theoretical developments, biological models, new formal modes, learning, applications, software and hardware developments, and prospective research. The journal promotes fast exchange of information in the community of neural network researchers and users. The resurgence of interest in the field of artificial neural networks since the beginning of the 1980s is coupled to tremendous research activity in specialized or multidisciplinary groups. Research, however, is not possible without good communication between people and the exchange of information, especially in a field covering such different areas; fast communication is also a key aspect, and this is the reason for Neural Processing Letters.