用于 QSAR 研究中正则化和变量选择的稳健混合算法

Christian N. Nwaeme, A. F. Lukman
{"title":"用于 QSAR 研究中正则化和变量选择的稳健混合算法","authors":"Christian N. Nwaeme, A. F. Lukman","doi":"10.46481/jnsps.2023.1708","DOIUrl":null,"url":null,"abstract":"This study introduces a robust hybrid sparse learning approach for regularization and variable selection. This approach comprises two distinct steps. In the initial step, we segment the original dataset into separate training and test sets and standardize the training data using its mean and standard deviation. We then employ either the LASSO or sparse LTS algorithm to analyze the training set, facilitating the selection of variables with non-zero coefficients as essential features for the new dataset. Secondly, the new dataset is divided into training and test sets. The training set is further divided into k folds and evaluated using a combination of Random Forest, Ridge, Lasso, and Support Vector Regression machine learning algorithms. We introduce novel hybrid methods and juxtapose their performance against existing techniques. To validate the efficacy of our proposed methods, we conduct a comprehensive simulation study and apply them to a real-life QSAR analysis. The findings unequivocally demonstrate the superior performance of our proposed estimator, with particular distinction accorded to SLTS+LASSO. In summary, the twostep robust hybrid sparse learning approach offers an effective regularization and variable selection applicable to a wide spectrum of real-world problems.","PeriodicalId":342917,"journal":{"name":"Journal of the Nigerian Society of Physical Sciences","volume":"47 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Robust hybrid algorithms for regularization and variable selection in QSAR studies\",\"authors\":\"Christian N. Nwaeme, A. F. Lukman\",\"doi\":\"10.46481/jnsps.2023.1708\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This study introduces a robust hybrid sparse learning approach for regularization and variable selection. This approach comprises two distinct steps. In the initial step, we segment the original dataset into separate training and test sets and standardize the training data using its mean and standard deviation. We then employ either the LASSO or sparse LTS algorithm to analyze the training set, facilitating the selection of variables with non-zero coefficients as essential features for the new dataset. Secondly, the new dataset is divided into training and test sets. The training set is further divided into k folds and evaluated using a combination of Random Forest, Ridge, Lasso, and Support Vector Regression machine learning algorithms. We introduce novel hybrid methods and juxtapose their performance against existing techniques. To validate the efficacy of our proposed methods, we conduct a comprehensive simulation study and apply them to a real-life QSAR analysis. The findings unequivocally demonstrate the superior performance of our proposed estimator, with particular distinction accorded to SLTS+LASSO. In summary, the twostep robust hybrid sparse learning approach offers an effective regularization and variable selection applicable to a wide spectrum of real-world problems.\",\"PeriodicalId\":342917,\"journal\":{\"name\":\"Journal of the Nigerian Society of Physical Sciences\",\"volume\":\"47 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-11-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of the Nigerian Society of Physical Sciences\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.46481/jnsps.2023.1708\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of the Nigerian Society of Physical Sciences","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.46481/jnsps.2023.1708","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

本研究介绍了一种用于正则化和变量选择的稳健混合稀疏学习方法。这种方法包括两个不同的步骤。第一步,我们将原始数据集分割成独立的训练集和测试集,并使用平均值和标准差对训练数据进行标准化。然后,我们采用 LASSO 或稀疏 LTS 算法对训练集进行分析,以便选择系数不为零的变量作为新数据集的基本特征。其次,将新数据集分为训练集和测试集。训练集进一步分为 k 个褶皱,并使用随机森林、Ridge、Lasso 和支持向量回归等机器学习算法组合进行评估。我们引入了新颖的混合方法,并将其性能与现有技术进行对比。为了验证我们提出的方法的有效性,我们进行了全面的模拟研究,并将其应用到实际的 QSAR 分析中。研究结果毫不含糊地证明了我们所提出的估计器的卓越性能,其中 SLTS+LASSO 尤为突出。总之,两步稳健混合稀疏学习方法提供了有效的正则化和变量选择,适用于广泛的实际问题。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Robust hybrid algorithms for regularization and variable selection in QSAR studies
This study introduces a robust hybrid sparse learning approach for regularization and variable selection. This approach comprises two distinct steps. In the initial step, we segment the original dataset into separate training and test sets and standardize the training data using its mean and standard deviation. We then employ either the LASSO or sparse LTS algorithm to analyze the training set, facilitating the selection of variables with non-zero coefficients as essential features for the new dataset. Secondly, the new dataset is divided into training and test sets. The training set is further divided into k folds and evaluated using a combination of Random Forest, Ridge, Lasso, and Support Vector Regression machine learning algorithms. We introduce novel hybrid methods and juxtapose their performance against existing techniques. To validate the efficacy of our proposed methods, we conduct a comprehensive simulation study and apply them to a real-life QSAR analysis. The findings unequivocally demonstrate the superior performance of our proposed estimator, with particular distinction accorded to SLTS+LASSO. In summary, the twostep robust hybrid sparse learning approach offers an effective regularization and variable selection applicable to a wide spectrum of real-world problems.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信