Named Entity Recognition Based Neural Network Framework for Stock Trend Prediction Using Latent Dirichlet Allocation
Authors: Manas Ranjan Prusty, Apoorv Kumar Sinha, Sanskriti Sanjay Kumar Singh, Shreyas Sai, Vijayakumar Kedalu Poornachary, Subhra Rani Patra
Journal: Arabian Journal for Science and Engineering, Volume 50, Issue 19, pp. 16135-16148 (Impact Factor 2.9; JCR Q2, Multidisciplinary Sciences)
Published: 2025-03-17
DOI: 10.1007/s13369-025-10090-4
URL: https://link.springer.com/article/10.1007/s13369-025-10090-4
Citations: 0
Abstract
Stock price prediction is an extensively researched topic because accurate forecasts of stock trends are decisive in investment. As prominent market participants increasingly publish opinions about particular stocks online, it becomes necessary to study these sentiments in detail to inform predictions. Internet articles that examine the factors affecting stock values yield natural-language text, which this study treats as a reliable source of features. The motivation is that conglomerates and businesses can tangibly understand the effect of articles that mobilize public opinion and steer it in a particular direction. The aim of this study is to apply named entity recognition (NER) within a neural network framework for stock trend prediction, using latent Dirichlet allocation (LDA) on natural text drawn from internet articles. LDA is used to identify the words that occur most frequently and contribute the most information to the corpus, depending on each topic's importance. The model then adopts the K × K words with the most decisive impact on the constructed target class and uses them to alter the generated sparse density matrix. The proposed NER-based neural network was fitted on a real-world dataset, and its performance compared well with state-of-the-art models developed by other researchers. However, because the model does not use BERT tokenizers, it cannot be evaluated against the FinBERT model, and the preprocessed data is therefore fed to a pruned recurrent neural network whose training is halted by a simple callback function. The final result was a strong tetrachoric correlation of 0.81 between the testing target class and the predicted target class. The model thus offers a different approach to natural language processing for stock prediction, particularly for text data with high sparse density.
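The abstract reports its headline result as a tetrachoric correlation between the actual and predicted binary target classes but does not say how that statistic was computed. As a minimal sketch only, the following uses the classical cosine-pi approximation on the 2 × 2 agreement table rather than an exact maximum-likelihood estimate. The function name `tetrachoric_approx` and the 0/1 label encoding are assumptions made for illustration and are not taken from the paper.

```python
import math
from typing import Sequence


def tetrachoric_approx(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Cosine-pi approximation of the tetrachoric correlation.

    Hypothetical helper: labels are assumed to be encoded as 0 or 1.
    This is an approximation, not the exact bivariate-normal estimate.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Cells of the 2 x 2 table: a and d are agreements, b and c disagreements.
    a = b = c = d = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            a += 1
        elif t == 1 and p == 0:
            b += 1
        elif t == 0 and p == 1:
            c += 1
        elif t == 0 and p == 0:
            d += 1
        else:
            raise ValueError("labels must be 0 or 1")

    concordant = a * d
    discordant = b * c
    if concordant == 0 and discordant == 0:
        raise ValueError("correlation is undefined for this table")
    if discordant == 0:
        # The odds ratio is infinite, so the approximation tends to +1.
        return 1.0

    # r is approximately cos(pi / (1 + sqrt(ad / bc)))
    return math.cos(math.pi / (1.0 + math.sqrt(concordant / discordant)))


# Illustrative labels only, not data from the paper.
actual = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 1]
r = tetrachoric_approx(actual, predicted)
```

For this toy table (three agreements and one disagreement in each class) the approximation gives cos(π/4), about 0.71. An exact tetrachoric estimate would instead fit the correlation of an underlying bivariate normal distribution to the same table, so values from this sketch should be read as rough.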
About the journal:
King Fahd University of Petroleum & Minerals (KFUPM) partnered with Springer to publish the Arabian Journal for Science and Engineering (AJSE).
AJSE, which has been published by KFUPM since 1975, is a recognized national, regional and international journal that provides a great opportunity for the dissemination of research advances from the Kingdom of Saudi Arabia, MENA and the world.