Feature Selection for Stock Movement Direction Prediction Using Sparse Support Vector Machine

Impact Factor: 1.3; Zone 4 (Mathematics); JCR Q3; Category: Mathematics, Interdisciplinary Applications

Applied Stochastic Models in Business and Industry. Pub Date: 2025-05-19. DOI: 10.1002/asmb.70011

Maoxuan Miao, Jinran Wu, Fengjing Cai, Liya Fu, Shurong Zheng, You-Gan Wang

Volume 41, Issue 3. Article page: https://onlinelibrary.wiley.com/doi/10.1002/asmb.70011

Citations: 0

Abstract

In financial markets, accurate prediction of stock price movements can significantly enhance investors' profits. However, stock prices form a highly complex dynamic system with considerable fluctuations, and the accuracy of direction prediction can be improved by selecting appropriate technical indicators. In this work, we propose a novel sparse support vector machine (SVM) that combines recursive feature elimination (RFE) and ReliefF using a weight parameter. Unlike traditional RFE-based SVMs, our approach constructs a nested feature subset structure, $F_{1} \subset F_{2} \subset \cdots \subset F_{p}$, using a new filter algorithm that combines backward sacrifice and ReliefF by weighting. This new filter algorithm captures relevant features and feature interactions simultaneously and is crucial in preventing valuable features from being removed at each iteration. Moreover, the ReliefF algorithm combined with RFE can identify more discriminative feature subsets by reordering the features so that valuable ones are ranked higher than valueless ones, and by removing valueless features sequentially over the iterations. Our experimental results on nine stock datasets from the liquor and spirits concept demonstrate that the proposed method outperforms baseline sparse SVMs and SVM models in terms of accuracy and F-test, while also producing a desirable number of features and automatically eliminating redundancy among technical indicators. We also show that, on most stock datasets, the ReliefF algorithm combined with RFE can effectively identify discriminative feature subsets for both linear and Gaussian kernel SVMs, and that the proposed filter method can prevent valuable features from being removed at each iteration. In addition, our experimental findings reveal that feature subsets built directly from technical indicators are more discriminative, whereas feature subsets built from technical indicator subsets mapped to a higher-dimensional space are less discriminative.
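To make the idea of a weighted RFE and ReliefF ranking concrete, the following is a minimal sketch and not the authors' implementation. It assumes a linear SVM without bias trained by Pegasos-style subgradient descent, a simplified two-class ReliefF with `k` nearest hits and misses, and a hypothetical weight `alpha` that blends the two min-max-normalized scores. The abstract does not give the actual combination rule, so the blending formula, all function names, and the toy data are illustrative assumptions. At each iteration the feature with the lowest blended score is removed, and the reversed removal order yields nested subsets $F_{1} \subset F_{2} \subset \cdots \subset F_{p}$.

```python
import random


def svm_weights(X, y, lam=0.01, epochs=50, seed=0):
    """Linear SVM (no bias) trained with Pegasos-style subgradient descent."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w, t = [0.0] * d, 0
    for _ in range(epochs):
        for i in rng.sample(range(n), n):  # one shuffled pass over the data
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # regularisation shrink
            if margin < 1.0:  # hinge loss active: step toward the sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def relief_scores(X, y, k=3):
    """Simplified two-class ReliefF: reward features that differ on the k
    nearest misses and agree on the k nearest hits.
    Assumes every class has at least two samples."""
    n, d = len(X), len(X[0])
    span = [max(r[f] for r in X) - min(r[f] for r in X) or 1.0 for f in range(d)]

    def diff(f, a, b):
        return abs(X[a][f] - X[b][f]) / span[f]

    scores = [0.0] * d
    for i in range(n):
        by_dist = sorted((j for j in range(n) if j != i),
                         key=lambda j: sum(diff(f, i, j) for f in range(d)))
        hits = [j for j in by_dist if y[j] == y[i]][:k]
        misses = [j for j in by_dist if y[j] != y[i]][:k]
        for f in range(d):
            miss_term = sum(diff(f, i, j) for j in misses) / len(misses)
            hit_term = sum(diff(f, i, j) for j in hits) / len(hits)
            scores[f] += (miss_term - hit_term) / n
    return scores


def minmax(values):
    """Rescale scores to [0, 1] so the two criteria are comparable."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]


def nested_subsets(X, y, alpha=0.5, k=3):
    """Backward elimination with a blended score (alpha is hypothetical):
    alpha = 1 uses only squared SVM weights, alpha = 0 only ReliefF."""
    active = list(range(len(X[0])))
    removed = []  # least valuable feature first
    while len(active) > 1:
        Xa = [[row[f] for f in active] for row in X]
        svm = minmax([w * w for w in svm_weights(Xa, y)])
        rel = minmax(relief_scores(Xa, y, k))
        blended = [alpha * s + (1.0 - alpha) * r for s, r in zip(svm, rel)]
        worst = min(range(len(active)), key=lambda j: blended[j])
        removed.append(active.pop(worst))
    ranking = [active[0]] + removed[::-1]  # most valuable feature first
    return [sorted(ranking[:m]) for m in range(1, len(ranking) + 1)]


def make_toy_data(n=40, seed=1):
    """Toy data: feature 0 tracks the label, features 1 and 2 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        X.append([label + rng.uniform(-0.2, 0.2),
                  rng.uniform(-1, 1), rng.uniform(-1, 1)])
        y.append(label)
    return X, y


X, y = make_toy_data()
subsets = nested_subsets(X, y, alpha=0.5)
for m, subset in enumerate(subsets, start=1):
    print(f"F_{m} = {subset}")
```

In practice the columns of `X` would be technical indicators and `y` the up or down direction of the next price move. The choice of `alpha` and of the final subset size would be tuned on held-out data, which this sketch does not cover.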


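Source journal

Applied Stochastic Models in Business and Industry (Mathematics: Mathematics, Interdisciplinary Applications)

CiteScore: 2.70
Self-citation rate: 0.00%
Articles published: not listed
Review time: >12 weeks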

About the journal: ASMBI - Applied Stochastic Models in Business and Industry (formerly Applied Stochastic Models and Data Analysis) was first published in 1985, publishing contributions at the interface between stochastic modelling, data analysis and their applications in business, finance, insurance, management and production. In 2007 ASMBI became the official journal of the International Society for Business and Industrial Statistics (www.isbis.org). The main objective is to publish papers, both technical and practical, presenting new results which solve real-life problems or have great potential to do so. Mathematical rigour, innovative stochastic modelling and sound applications are the key ingredients of papers to be published, after a very selective review process. The journal is very open to new ideas, like Data Science and Big Data stemming from problems in business and industry or uncertainty quantification in engineering, as well as more traditional ones, like reliability, quality control, design of experiments, managerial processes, supply chains and inventories, insurance, econometrics, and financial modelling (provided the papers are related to real problems). The journal is also interested in papers addressing the effects of business and industrial decisions on the environment, healthcare, and social life. State-of-the-art computational methods are very welcome as well, when combined with sound applications and innovative models.