Feature Selection for Stock Movement Direction Prediction Using Sparse Support Vector Machine
Maoxuan Miao, Jinran Wu, Fengjing Cai, Liya Fu, Shurong Zheng, You-Gan Wang
Applied Stochastic Models in Business and Industry, Volume 41, Issue 3. Published 2025-05-19.
DOI: 10.1002/asmb.70011
https://onlinelibrary.wiley.com/doi/10.1002/asmb.70011
Impact factor: 1.3 (JCR Q3, Mathematics, Interdisciplinary Applications)
Citations: 0
Abstract
In financial markets, accurate stock price movement prediction can significantly enhance investors' profits. However, the stock price is a highly complex dynamic system with considerable fluctuations, and the accuracy of direction prediction can be improved by selecting appropriate technical indicators. In this work, we propose a novel sparse support vector machine (SVM) that combines recursive feature elimination (RFE) and ReliefF using a weight parameter. Unlike traditional RFE-based SVMs, our approach constructs a nested feature subset structure, $F_1 \subset F_2 \subset \cdots \subset F_p$, using a new filter algorithm that combines backward sacrifice and ReliefF by weighting. This new filter algorithm can capture relevant features and feature interactions simultaneously and is crucial in preventing valuable features from being removed at each iteration. Moreover, the ReliefF algorithm combined with RFE can identify more discriminative feature subsets by reordering the features so that valuable ones are ranked higher than valueless ones, and by removing valueless features sequentially through iterative processes. Our experimental results on nine stock datasets from the liquor and spirits concept demonstrate that our proposed method outperforms baseline sparse SVMs and SVM models in terms of accuracy and F-test, while also producing a desirable number of features and automatically eliminating redundancy among technical indicators. We also show that, on most stock datasets, the ReliefF algorithm combined with RFE can effectively identify discriminative feature subsets for both linear and Gaussian kernel SVMs, and that our proposed filter method can prevent valuable features from being removed at each iteration. In addition, our experimental findings reveal that feature subsets generated by technical indicators are more discriminative, while feature subsets generated by technical indicator subsets mapped to a certain higher-dimensional space are less discriminative.
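The abstract does not state the scoring rule, so the following is only a rough sketch of the general idea and not the authors' algorithm. It assumes that each feature's elimination score is a convex combination, controlled by a hypothetical weight `alpha`, of the squared linear-SVM weight (the usual RFE criterion) and a Relief-style score. To stay self-contained it swaps in a simplified Relief (one nearest hit and one nearest miss per sample) for ReliefF and a linear SVM fitted by subgradient descent; all function names and the synthetic data are illustrative assumptions.

```python
import numpy as np

def linear_svm_weights(X, y, lam=0.01, epochs=200, lr=0.1):
    """Fit a linear soft-margin SVM by batch subgradient descent; return w."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        active = y * (X @ w + b) < 1  # samples inside or violating the margin
        w -= lr * (lam * w - (y[active, None] * X[active]).sum(axis=0) / n)
        b -= lr * (-y[active].sum() / n)
    return w

def relief_scores(X, y):
    """Simplified Relief: one nearest hit and miss per sample (L1 distance)."""
    n, d = X.shape
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Z = (X - X.min(axis=0)) / span  # min-max scale each feature to [0, 1]
    scores = np.zeros(d)
    for i in range(n):
        dist = np.abs(Z - Z[i]).sum(axis=1)
        dist[i] = np.inf  # never pick the sample itself
        same = y == y[i]
        hit = np.argmin(np.where(same, dist, np.inf))
        miss = np.argmin(np.where(~same, dist, np.inf))
        # Reward features that differ across classes, penalise within-class spread.
        scores += np.abs(Z[i] - Z[miss]) - np.abs(Z[i] - Z[hit])
    return scores / n

def _unit(v):
    """Scale a score vector by its largest absolute value."""
    m = np.abs(v).max()
    return v / m if m > 0 else v

def nested_subsets(X, y, alpha=0.5):
    """Backward elimination; returns [F_1, ..., F_p] with F_1 ⊂ ... ⊂ F_p."""
    remaining = list(range(X.shape[1]))
    subsets = []
    while remaining:
        subsets.append(sorted(remaining))
        Xs = X[:, remaining]
        w = linear_svm_weights(Xs, y)
        # Assumed combination rule: alpha weights the RFE term against Relief.
        score = alpha * _unit(w ** 2) + (1 - alpha) * _unit(relief_scores(Xs, y))
        remaining.pop(int(np.argmin(score)))  # drop the lowest-scoring feature
    return subsets[::-1]

# Synthetic illustration (not stock data): only feature 0 carries the label.
rng = np.random.default_rng(0)
y = rng.choice([-1.0, 1.0], size=200)
X = rng.normal(size=(200, 5))
X[:, 0] = 1.5 * y + 0.5 * rng.normal(size=200)
subsets = nested_subsets(X, y, alpha=0.5)
for k, F in enumerate(subsets, start=1):
    print(f"F_{k} = {F}")
```

Setting `alpha=1` reduces the sketch to plain SVM-RFE and `alpha=0` to a purely Relief-driven ranking; in practice the weight and the final subset size would be chosen by validation accuracy.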
About the journal:
ASMBI - Applied Stochastic Models in Business and Industry (formerly Applied Stochastic Models and Data Analysis) was first published in 1985, publishing contributions in the interface between stochastic modelling, data analysis and their applications in business, finance, insurance, management and production. In 2007 ASMBI became the official journal of the International Society for Business and Industrial Statistics (www.isbis.org). The main objective is to publish papers, both technical and practical, presenting new results which solve real-life problems or have great potential in doing so. Mathematical rigour, innovative stochastic modelling and sound applications are the key ingredients of papers to be published, after a very selective review process.
The journal is very open to new ideas, like Data Science and Big Data stemming from problems in business and industry or uncertainty quantification in engineering, as well as more traditional ones, like reliability, quality control, design of experiments, managerial processes, supply chains and inventories, insurance, econometrics, and financial modelling (provided the papers are related to real problems). The journal is also interested in papers addressing the effects of business and industrial decisions on the environment, healthcare, and social life. State-of-the-art computational methods are very welcome as well, when combined with sound applications and innovative models.