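Improving the Accuracy and Effectiveness of Text Classification Based on the Integration of the Bert Model and a Recurrent Neural Network (RNN_Bert_Based)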

JCR: Q1 (Mathematics)

Applied Sciences | Pub Date: 2024-09-18 | DOI: 10.3390/app14188388

Chanthol Eang, Seungjae Lee


Citations: 0


Improving the Accuracy and Effectiveness of Text Classification Based on the Integration of the Bert Model and a Recurrent Neural Network (RNN_Bert_Based)

This paper proposes a new robust model for text classification on the Stanford Sentiment Treebank v2 (SST-2) dataset in terms of model accuracy. We developed a Recurrent Neural Network Bert based (RNN_Bert_based) model designed to improve classification accuracy on the SST-2 dataset. This dataset consists of movie review sentences, each labeled with either positive or negative sentiment, making it a binary classification task. Recurrent Neural Networks (RNNs) are effective for text classification because they capture the sequential nature of language, which is crucial for understanding context and meaning. Bert excels in text classification by providing bidirectional context, generating contextual embeddings, and leveraging pre-training on large corpora. This allows Bert to capture nuanced meanings and relationships within the text effectively. Combining Bert with RNNs can be highly effective for text classification. Bert’s bidirectional context and rich embeddings provide a deep understanding of the text, while RNNs capture sequential patterns and long-range dependencies. Together, they leverage the strengths of both architectures, leading to improved performance on complex classification tasks. Next, we also developed an integration of the Bert model and a K-Nearest Neighbor based (KNN_Bert_based) method as a comparative scheme for our proposed work. Based on the results of experimentation, our proposed model outperforms traditional text classification models as well as existing models in terms of accuracy.
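The abstract does not specify the architecture, so the following is only a minimal sketch of the two ideas it names: a recurrent head that reads per-token contextual embeddings in order and outputs a positive/negative probability, and a nearest-neighbor vote over sentence embeddings as the comparison scheme. Everything here is an assumption for illustration: the embeddings are small random or hand-written stand-ins rather than real Bert outputs, the weights are untrained, and the names `rnn_head`, `knn_predict`, and `cosine` are hypothetical, not taken from the paper.

```python
import math
import random


def rnn_head(token_embeddings, W_xh, W_hh, w_out, b_out=0.0):
    """Simple (Elman) RNN over a token-embedding sequence; returns P(positive)."""
    hidden_size = len(W_hh)
    h = [0.0] * hidden_size
    for x in token_embeddings:
        # h_t = tanh(W_xh x_t + W_hh h_{t-1})
        h = [
            math.tanh(
                sum(W_xh[j][i] * x[i] for i in range(len(x)))
                + sum(W_hh[j][k] * h[k] for k in range(hidden_size))
            )
            for j in range(hidden_size)
        ]
    # The final hidden state summarizes the sentence; a sigmoid gives a binary score.
    logit = sum(w_out[j] * h[j] for j in range(hidden_size)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_predict(query, examples, k=3):
    """Majority label among the k labeled sentence embeddings closest to the query."""
    ranked = sorted(examples, key=lambda ex: cosine(query, ex[0]), reverse=True)
    votes = [label for _, label in ranked[:k]]
    return max(set(votes), key=votes.count)


# Stand-in data (assumption): 5 tokens with 4-dimensional embeddings, 3 hidden units.
rng = random.Random(0)
EMB, HID = 4, 3
W_xh = [[rng.uniform(-0.5, 0.5) for _ in range(EMB)] for _ in range(HID)]
W_hh = [[rng.uniform(-0.5, 0.5) for _ in range(HID)] for _ in range(HID)]
w_out = [rng.uniform(-0.5, 0.5) for _ in range(HID)]
tokens = [[rng.uniform(-1, 1) for _ in range(EMB)] for _ in range(5)]
p_positive = rnn_head(tokens, W_xh, W_hh, w_out)

# Hand-written 2-dimensional "sentence embeddings" with labels 1 (positive) / 0 (negative).
train = [([1.0, 0.1], 1), ([0.9, 0.2], 1), ([0.1, 1.0], 0),
         ([0.2, 0.9], 0), ([0.8, 0.3], 1)]
label = knn_predict([0.95, 0.15], train, k=3)
```

In a real pipeline the stand-in vectors would be replaced by the hidden states of a pre-trained Bert encoder, and the recurrent weights would be learned on the SST-2 training split; the sketch only shows how the two classification heads consume embeddings.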


Source journal: Applied Sciences (Mathematics-Applied Mathematics)

CiteScore: 6.40

Self-citation rate: 0.00%

Publication volume: (not listed)

Review time: 11 weeks

Journal description: APPS is an international journal covering a wide spectrum of pure and applied mathematics in science and technology, with particular attention to papers presented at Carpato-Balkan meetings. The Editorial Board of APPS takes an active role in selecting and refereeing papers to ensure the quality of contemporary mathematics and its applications. APPS is abstracted in Zentralblatt für Mathematik. The journal uses double-blind peer review.