{"title":"改进的高维回归模型与矩阵近似应用于支持向量机的比较案例研究","authors":"M. Roozbeh, S. Babaie-Kafaki, Z. Aminifard","doi":"10.1080/10556788.2021.2022144","DOIUrl":null,"url":null,"abstract":"Nowadays, high-dimensional data appear in many practical applications such as biosciences. In the regression analysis literature, the well-known ordinary least-squares estimation may be misleading when the full ranking of the design matrix is missed. As a popular issue, outliers may corrupt normal distribution of the residuals. Thus, since not being sensitive to the outlying data points, robust estimators are frequently applied in confrontation with the issue. Ill-conditioning in high-dimensional data is another common problem in modern regression analysis under which applying the least-squares estimator is hardly possible. So, it is necessary to deal with estimation methods to tackle these problems. As known, a successful approach for high-dimension cases is the penalized scheme with the aim of obtaining a subset of effective explanatory variables that predict the response as the best, while setting the other parameters to zero. Here, we develop several penalized mixed-integer nonlinear programming models to be used in high-dimension regression analysis. The given matrix approximations have simple structures, decreasing computational cost of the models. Moreover, the models are effectively solvable by metaheuristic algorithms. 
Numerical tests are made to shed light on performance of the proposed methods on simulated and real world high-dimensional data sets.","PeriodicalId":124811,"journal":{"name":"Optimization Methods and Software","volume":"22 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":"{\"title\":\"Improved high-dimensional regression models with matrix approximations applied to the comparative case studies with support vector machines\",\"authors\":\"M. Roozbeh, S. Babaie-Kafaki, Z. Aminifard\",\"doi\":\"10.1080/10556788.2021.2022144\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Nowadays, high-dimensional data appear in many practical applications such as biosciences. In the regression analysis literature, the well-known ordinary least-squares estimation may be misleading when the full ranking of the design matrix is missed. As a popular issue, outliers may corrupt normal distribution of the residuals. Thus, since not being sensitive to the outlying data points, robust estimators are frequently applied in confrontation with the issue. Ill-conditioning in high-dimensional data is another common problem in modern regression analysis under which applying the least-squares estimator is hardly possible. So, it is necessary to deal with estimation methods to tackle these problems. As known, a successful approach for high-dimension cases is the penalized scheme with the aim of obtaining a subset of effective explanatory variables that predict the response as the best, while setting the other parameters to zero. Here, we develop several penalized mixed-integer nonlinear programming models to be used in high-dimension regression analysis. The given matrix approximations have simple structures, decreasing computational cost of the models. Moreover, the models are effectively solvable by metaheuristic algorithms. 
Numerical tests are made to shed light on performance of the proposed methods on simulated and real world high-dimensional data sets.\",\"PeriodicalId\":124811,\"journal\":{\"name\":\"Optimization Methods and Software\",\"volume\":\"22 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-02-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"7\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Optimization Methods and Software\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1080/10556788.2021.2022144\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Optimization Methods and Software","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1080/10556788.2021.2022144","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Improved high-dimensional regression models with matrix approximations applied to the comparative case studies with support vector machines
Nowadays, high-dimensional data appear in many practical applications, such as the biosciences. In the regression analysis literature, the well-known ordinary least-squares estimator can be misleading when the design matrix is not of full rank. Outliers are another common issue: they can corrupt the normality of the residuals. Robust estimators, being insensitive to outlying data points, are therefore frequently applied to address this problem. Ill-conditioning in high-dimensional data is a further difficulty in modern regression analysis, under which applying the least-squares estimator is hardly possible. It is therefore necessary to develop estimation methods that tackle these problems. A successful approach in high-dimensional settings is the penalized scheme, which aims to select the subset of explanatory variables that best predicts the response while setting the remaining coefficients to zero. Here, we develop several penalized mixed-integer nonlinear programming models for high-dimensional regression analysis. The given matrix approximations have simple structures, decreasing the computational cost of the models. Moreover, the models can be solved effectively by metaheuristic algorithms. Numerical tests shed light on the performance of the proposed methods on simulated and real-world high-dimensional data sets.
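To make the penalized idea concrete, the following is a minimal sketch, not the authors' mixed-integer nonlinear programming models: an L1-penalized least-squares (lasso-style) fit via cyclic coordinate descent on simulated high-dimensional data with more predictors than observations. The penalty drives most coefficients exactly to zero, recovering a small subset of effective explanatory variables. All problem sizes, the noise level, and the penalty value `lam` are illustrative assumptions.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used in lasso coordinate descent."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for min_b 0.5*||y - X b||^2 + lam*||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)     # per-column squared norms ||x_j||^2
    r = y - X @ b                     # current residual
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]                       # remove j-th contribution
            b[j] = soft_threshold(X[:, j] @ r, lam) / col_sq[j]
            r -= X[:, j] * b[j]                       # add back updated one
    return b

rng = np.random.default_rng(0)
n, p = 50, 200                        # p >> n: high-dimensional setting
X = rng.standard_normal((n, p))
true_b = np.zeros(p)
true_b[:5] = [3.0, -2.0, 1.5, 2.5, -1.0]   # only 5 effective variables
y = X @ true_b + 0.1 * rng.standard_normal(n)

b_hat = lasso_cd(X, y, lam=5.0)
print("nonzero coefficients:", np.count_nonzero(b_hat))
```

The convex L1 penalty is used here only because it admits a simple closed-form coordinate update; the paper's best-subset-style formulations instead introduce binary selection variables, yielding MINLP models that the authors solve with metaheuristic algorithms.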