An iterative matrix uncertainty selector for high-dimensional generalized linear models with measurement errors.

Impact factor 1.9 · CAS Zone 3 (Medicine) · JCR Q3 (Health Care Sciences & Services)

Statistical Methods in Medical Research. Pub Date: 2025-06-01. Epub Date: 2025-03-19. DOI: 10.1177/09622802251316963

Betrand Fesuh Nono, Georges Nguefack-Tsague, Martin Kegnenlezom, Eugène-Patrice N Nguéma

Pages: 1114-1129. Publication type: Journal Article.


Abstract

Measurement error is a prevalent issue in high-dimensional generalized linear regression that existing regularization techniques may inadequately address. Most require estimating error distributions, which can be computationally prohibitive or unrealistic. We introduce an error-distribution-free approach for variable selection called the Iterative Matrix Uncertainty Selector (IMUS). IMUS employs the matrix uncertainty selector framework for linear models, which is known for its selection consistency properties. It features an efficient iterative algorithm that is easily implemented for any generalized linear model within the exponential family. Empirically, we demonstrate that IMUS performs well in simulations and on three microarray gene expression datasets, achieving effective covariate selection with smoother convergence and clearer elbow criteria than other error-distribution-free methods. Notably, simulation studies in logistic and Poisson regression showed that IMUS exhibited smoother convergence and clearer elbow criteria, performing comparably to the Generalized Matrix Uncertainty Selector (GMUS) and Generalized Matrix Uncertainty Lasso (GMUL) in covariate selection. In many scenarios, IMUS had smaller estimation errors than GMUL and GMUS, measured by both the 1- and 2-norms. In applications to three microarray datasets with noisy measurements, GMUS faced convergence issues, while GMUL converged but lacked well-defined elbows for two datasets. In contrast, IMUS converged with well-defined elbows for all datasets, providing a potentially effective solution for high-dimensional regression problems involving measurement errors.
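As background for the abstract above, the following is a minimal sketch of the linear-model matrix uncertainty selector that IMUS is said to build on. It is not the IMUS algorithm, whose iterative steps the abstract does not spell out. The sketch assumes the common two-parameter form, minimize ||b||_1 subject to ||W'(y - Wb)/n||_inf <= lam + delta * ||b||_1, and solves a linear-programming relaxation in which ||b||_1 is replaced by the sum of the split variables b+ and b-. The function name `mus_linear`, the tuning values, and the simulated data are all illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog


def mus_linear(W, y, lam, delta):
    """LP relaxation of a linear matrix uncertainty selector (illustrative).

    W is the error-prone design matrix, y the response, and lam, delta are
    hypothetical tuning constants. Writing b = b_plus - b_minus with both
    parts non-negative, the objective and the constraint bound both use
    sum(b_plus + b_minus) in place of ||b||_1.
    """
    n, p = W.shape
    G = W.T @ W / n                   # Gram matrix of the noisy covariates
    c = W.T @ y / n                   # covariate-response correlations
    D = delta * np.ones((p, p))       # each row computes delta * sum(.)
    # Upper side:  c - G b <= lam + delta * sum(b_plus + b_minus)
    # Lower side: -(c - G b) <= lam + delta * sum(b_plus + b_minus)
    A_ub = np.block([[-G - D, G - D],
                     [G - D, -G - D]])
    b_ub = np.concatenate([lam - c, lam + c])
    res = linprog(np.ones(2 * p), A_ub=A_ub, b_ub=b_ub,
                  bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p] - res.x[p:]


# Small simulated example (all settings are assumptions for illustration).
rng = np.random.default_rng(0)
n, p = 40, 60
X = rng.normal(size=(n, p))                  # true, unobserved covariates
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]             # sparse true coefficients
y = X @ beta_true + 0.1 * rng.normal(size=n)
W = X + 0.2 * rng.normal(size=(n, p))        # covariates observed with error

beta_hat = mus_linear(W, y, lam=0.1, delta=0.1)
err1 = np.linalg.norm(beta_hat - beta_true, 1)   # 1-norm estimation error
err2 = np.linalg.norm(beta_hat - beta_true, 2)   # 2-norm estimation error
print("selected:", np.flatnonzero(np.abs(beta_hat) > 1e-6))
print("errors (1-norm, 2-norm):", err1, err2)
```

The two error norms computed at the end correspond to the 1- and 2-norm estimation errors mentioned in the abstract. In practice delta would be varied over a grid and the number of selected covariates inspected for an elbow; the single value used here is only a placeholder.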


Source journal

Statistical Methods in Medical Research (Medicine: Mathematical & Computational Biology)

CiteScore: 4.10

Self-citation rate: 4.30%

Articles published: 127

Review time: >12 weeks

Journal description: Statistical Methods in Medical Research is a peer-reviewed scholarly journal and is the leading vehicle for articles in all the main areas of medical statistics and an essential reference for all medical statisticians. This unique journal is devoted solely to statistics and medicine and aims to keep professionals abreast of the many powerful statistical techniques now available to the medical profession. This journal is a member of the Committee on Publication Ethics (COPE).