{"title":"非参数高斯尺度混合误差的惩罚最大似然估计","authors":"Seo-Young Park , Byungtae Seo","doi":"10.1016/j.csda.2025.108206","DOIUrl":null,"url":null,"abstract":"<div><div>The penalized least squares and maximum likelihood methods have been successfully employed for simultaneous parameter estimation and variable selection. However, outlying observations can severely affect the quality of the estimator and selection performance. Although some robust methods for variable selection have been proposed in the literature, they often lose substantial efficiency. This is primarily attributed to the excessive dependence on choosing additional tuning parameters or modifying the original objective functions as tools to enhance robustness. In response to these challenges, we use a nonparametric Gaussian scale mixture distribution for the regression error distribution. This approach allows the error distributions in the model to achieve great flexibility and provides data-adaptive robustness. Our proposed estimator exhibits desirable theoretical properties, including sparsity and oracle properties. In the estimation process, we employ a combination of expectation-maximization and gradient-based algorithms for the parametric and nonparametric components, respectively. Through comprehensive numerical studies, encompassing simulation studies and real data analysis, we substantiate the robust performance of the proposed method.</div></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":"211 ","pages":"Article 108206"},"PeriodicalIF":1.5000,"publicationDate":"2025-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Penalized maximum likelihood estimation with nonparametric Gaussian scale mixture errors\",\"authors\":\"Seo-Young Park , Byungtae Seo\",\"doi\":\"10.1016/j.csda.2025.108206\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>The penalized least squares and maximum likelihood methods have been successfully employed for simultaneous parameter estimation and variable selection. However, outlying observations can severely affect the quality of the estimator and selection performance. Although some robust methods for variable selection have been proposed in the literature, they often lose substantial efficiency. This is primarily attributed to the excessive dependence on choosing additional tuning parameters or modifying the original objective functions as tools to enhance robustness. In response to these challenges, we use a nonparametric Gaussian scale mixture distribution for the regression error distribution. This approach allows the error distributions in the model to achieve great flexibility and provides data-adaptive robustness. Our proposed estimator exhibits desirable theoretical properties, including sparsity and oracle properties. In the estimation process, we employ a combination of expectation-maximization and gradient-based algorithms for the parametric and nonparametric components, respectively. 
Through comprehensive numerical studies, encompassing simulation studies and real data analysis, we substantiate the robust performance of the proposed method.</div></div>\",\"PeriodicalId\":55225,\"journal\":{\"name\":\"Computational Statistics & Data Analysis\",\"volume\":\"211 \",\"pages\":\"Article 108206\"},\"PeriodicalIF\":1.5000,\"publicationDate\":\"2025-05-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computational Statistics & Data Analysis\",\"FirstCategoryId\":\"100\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0167947325000829\",\"RegionNum\":3,\"RegionCategory\":\"数学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computational Statistics & Data Analysis","FirstCategoryId":"100","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0167947325000829","RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
Penalized maximum likelihood estimation with nonparametric Gaussian scale mixture errors
Abstract:
The penalized least squares and maximum likelihood methods have been successfully employed for simultaneous parameter estimation and variable selection. However, outlying observations can severely degrade both estimation quality and selection performance. Although several robust variable selection methods have been proposed in the literature, they often lose substantial efficiency, largely because they rely on additional tuning parameters or on modifications of the original objective function to achieve robustness. In response to these challenges, we model the regression error with a nonparametric Gaussian scale mixture distribution. This approach gives the error distribution great flexibility and provides data-adaptive robustness. The proposed estimator exhibits desirable theoretical properties, including sparsity and the oracle property. For estimation, we combine an expectation-maximization algorithm for the parametric component with a gradient-based algorithm for the nonparametric component. Through comprehensive numerical studies, encompassing simulations and real data analysis, we substantiate the robust performance of the proposed method.
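The abstract does not give the algorithmic details, so the following Python sketch is only a rough, hypothetical illustration of the general idea: it replaces the nonparametric mixing distribution with a fixed grid of candidate error scales, uses a lasso penalty, and alternates an EM-style E-step (posterior scale responsibilities for each residual) with a coordinate-descent M-step for the resulting weighted penalized least squares problem. The function name pml_gsm_lasso, the scale grid, and all tuning values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used in the lasso coordinate update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def pml_gsm_lasso(X, y, scales, lam, n_iter=200, tol=1e-6):
    """EM-style sketch: lasso-penalized regression with Gaussian scale mixture
    errors, the mixing distribution approximated by a fixed grid of scales
    (a simplification of the paper's nonparametric component)."""
    n, p = X.shape
    scales = np.asarray(scales, dtype=float)
    beta = np.zeros(p)
    mix_w = np.full(scales.size, 1.0 / scales.size)   # mixing weights pi_k

    for _ in range(n_iter):
        resid = y - X @ beta
        # E-step: posterior probability that residual i came from scale k
        dens = np.exp(-0.5 * (resid[:, None] / scales) ** 2) / scales  # (n, K)
        dens = dens * mix_w + 1e-300                                   # avoid 0/0
        resp = dens / dens.sum(axis=1, keepdims=True)
        # Per-observation weight E[sigma^{-2} | resid_i]
        w = (resp / scales ** 2).sum(axis=1)

        # M-step for the mixing weights
        mix_w = resp.mean(axis=0)

        # M-step for beta: one pass of coordinate descent on the weighted lasso
        # objective 0.5 * sum_i w_i (y_i - x_i' beta)^2 + n * lam * ||beta||_1
        beta_old = beta.copy()
        for j in range(p):
            partial = resid + X[:, j] * beta[j]       # residual without feature j
            num = soft_threshold(np.sum(w * X[:, j] * partial), n * lam)
            beta[j] = num / np.sum(w * X[:, j] ** 2)
            resid = partial - X[:, j] * beta[j]

        if np.max(np.abs(beta - beta_old)) < tol:
            break

    return beta, mix_w

# Toy usage: heavy-tailed (t-distributed) noise with a few active coefficients.
rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))
beta_true = np.array([2.0, -1.5, 0.0, 0.0, 1.0] + [0.0] * 5)
y = X @ beta_true + rng.standard_t(df=2, size=n)
beta_hat, pi_hat = pml_gsm_lasso(X, y, scales=np.geomspace(0.5, 10.0, 8), lam=0.05)
print(np.round(beta_hat, 2))
```

The data-adaptive robustness described in the abstract is visible in the E-step of this sketch: observations with large residuals are assigned mostly to large-scale mixture components, so their effective weight E[sigma^{-2} | resid_i] in the weighted least squares update shrinks, down-weighting outliers without any user-chosen robustness constant.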
Journal description:
Computational Statistics and Data Analysis (CSDA), an Official Publication of the network Computational and Methodological Statistics (CMStatistics) and of the International Association for Statistical Computing (IASC), is an international journal dedicated to the dissemination of methodological research and applications in the areas of computational statistics and data analysis. The journal consists of four refereed sections which are divided into the following subject areas:
I) Computational Statistics - Manuscripts dealing with: 1) the explicit impact of computers on statistical methodology (e.g., Bayesian computing, bioinformatics, computer graphics, computer intensive inferential methods, data exploration, data mining, expert systems, heuristics, knowledge based systems, machine learning, neural networks, numerical and optimization methods, parallel computing, statistical databases, statistical systems), and 2) the development, evaluation and validation of statistical software and algorithms. Software and algorithms can be submitted with manuscripts and will be stored together with the online article.
II) Statistical Methodology for Data Analysis - Manuscripts dealing with novel and original data analytical strategies and methodologies applied in biostatistics (design and analytic methods for clinical trials, epidemiological studies, statistical genetics, or genetic/environmental interactions), chemometrics, classification, data exploration, density estimation, design of experiments, environmetrics, education, image analysis, marketing, model-free data exploration, pattern recognition, psychometrics, statistical physics, image processing, robust procedures.
[...]
III) Special Applications - [...]
IV) Annals of Statistical Data Science [...]