Pratiyush Guleria, Jaroslav Frnda, Parvathaneni Naga Srinivasu
Array, volume 27, Article 100467. Journal Article. Published 2025-08-14.
DOI: 10.1016/j.array.2025.100467
URL: https://www.sciencedirect.com/science/article/pii/S2590005625000943
Impact factor: 4.5; JCR: Q2 (Computer Science, Theory & Methods)
NLP based text classification using TF-IDF enabled fine-tuned long short-term memory: An empirical analysis
The rapid proliferation of information through digital transformation and the widespread use of social networking platforms have significantly increased the speed of information dissemination across urban and rural areas alike. While these platforms have become vital channels for sharing news, advertisements, and crucial updates, they also make it difficult to verify the authenticity of information in real time. To address this issue, this study proposes a novel Convolutional Neural Network (CNN)-Long Short-Term Memory (LSTM) model for classifying fake news articles. The research uses a dataset covering diverse categories, including government news, Middle East news, US news, left-wing news, and political content. After preprocessing, features were extracted using the Term Frequency-Inverse Document Frequency (TF-IDF) technique, and word embeddings were generated for a richer semantic representation. The combined CNN-LSTM model leverages the strengths of both architectures: the CNN captures local patterns and the LSTM captures long-range dependencies within the data. The experimental results demonstrate that the fine-tuned CNN-LSTM model outperforms all prior approaches across the various categories. Notably, it achieves the highest accuracy (AC), ranging from 0.57 to 0.68, which highlights its superior classification performance and indicates that the prior approaches handle multiple categories less effectively.
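The TF-IDF feature extraction step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which TF-IDF variant, tokenizer, or library the authors used, so this assumes the textbook unsmoothed form (term frequency times the natural log of N over document frequency) and whitespace tokenization. The function name `tfidf` and the three toy headlines are hypothetical and are not taken from the paper's dataset.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumed (textbook, unsmoothed) definition:
      tf  = occurrences of the term in the document / tokens in the document
      idf = ln(N / df), where df = number of documents containing the term
    """
    # Naive lowercase + whitespace tokenization (an assumption for this sketch).
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical toy headlines, for illustration only.
docs = [
    "senate passes budget bill",
    "aliens endorse budget bill",
    "senate debates trade policy",
]
weights = tfidf(docs)
```

In this sketch a term that appears in only one document (such as "passes") receives a larger weight than a term shared by several documents (such as "budget"), and a term present in every document receives a weight of zero. Production libraries typically apply smoothing and normalization, so their values would differ from these.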