Heterogeneous quantile regression for longitudinal data with subgroup structures
Authors: Zhaohan Hou, Lei Wang
Journal: Computational Statistics & Data Analysis
Published: 2024-01-29
DOI: 10.1016/j.csda.2024.107928
URL: https://www.sciencedirect.com/science/article/pii/S0167947324000124
Subgroup analysis for modeling longitudinal data with heterogeneity across all individuals has drawn attention in modern statistical learning. In this paper, we focus on a heterogeneous quantile regression model and propose to achieve variable selection, heterogeneous subgrouping and parameter estimation simultaneously by using smoothed generalized estimating equations in conjunction with a multi-directional separation penalty. The proposed method allows individuals to be divided into multiple subgroups for different heterogeneous covariates, so that estimation efficiency can be gained by incorporating the individual correlation structure and sharing information within subgroups. A data-driven procedure based on a modified BIC is applied to estimate the number of subgroups. Theoretical properties of the oracle estimator, given the underlying true subpopulation information, are first provided, and the proposed estimator is then shown to be equivalent to the oracle estimator under some conditions. The finite-sample performance of the proposed estimators is studied through simulations, and an application to an AIDS dataset is also presented.
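To make the phrase "smoothed generalized estimating equations" concrete, the following is a minimal sketch of the general idea, not the authors' estimator. Quantile regression at level tau is built on the check loss, whose score involves the discontinuous indicator 1{u < 0}; a common smoothing device replaces that indicator with a smooth function of u/h. The Gaussian CDF surrogate, the bandwidth h, the working-independence form of the score, and all function names here are assumptions chosen for illustration. The subgrouping penalty and the modified BIC are not shown.

```python
import math
import numpy as np

def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def smoothed_indicator(u, h):
    """Smooth surrogate for 1{u < 0}: Phi(-u / h), with Phi the standard
    normal CDF. The Gaussian choice and bandwidth h are illustrative."""
    u = np.asarray(u, dtype=float)
    erf = np.vectorize(math.erf)
    return 0.5 * (1.0 + erf(-u / (h * math.sqrt(2.0))))

def smoothed_score(beta, X, y, tau, h):
    """Smoothed estimating function under working independence (assumed):
    sum_i x_i * (tau - Phi(-(y_i - x_i' beta) / h))."""
    resid = y - X @ beta
    return X.T @ (tau - smoothed_indicator(resid, h))

# Intercept-only example: the median of a symmetric sample solves the equation.
X = np.ones((4, 1))
y = np.array([1.0, 2.0, 3.0, 4.0])
score_at_median = smoothed_score(np.array([2.5]), X, y, tau=0.5, h=0.5)
```

Because the smoothed score is differentiable in beta, it can be handed to a standard root finder or Newton-type iteration, which is the practical motivation for smoothing; as h shrinks, the surrogate approaches the original indicator.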
About the journal:
Computational Statistics and Data Analysis (CSDA), an Official Publication of the network Computational and Methodological Statistics (CMStatistics) and of the International Association for Statistical Computing (IASC), is an international journal dedicated to the dissemination of methodological research and applications in the areas of computational statistics and data analysis. The journal consists of four refereed sections which are divided into the following subject areas:
I) Computational Statistics - Manuscripts dealing with: 1) the explicit impact of computers on statistical methodology (e.g., Bayesian computing, bioinformatics, computer graphics, computer-intensive inferential methods, data exploration, data mining, expert systems, heuristics, knowledge-based systems, machine learning, neural networks, numerical and optimization methods, parallel computing, statistical databases, statistical systems), and 2) the development, evaluation and validation of statistical software and algorithms. Software and algorithms can be submitted with manuscripts and will be stored together with the online article.
II) Statistical Methodology for Data Analysis - Manuscripts dealing with novel and original data analytical strategies and methodologies applied in biostatistics (design and analytic methods for clinical trials, epidemiological studies, statistical genetics, or genetic/environmental interactions), chemometrics, classification, data exploration, density estimation, design of experiments, environmetrics, education, image analysis, marketing, model-free data exploration, pattern recognition, psychometrics, statistical physics, image processing, robust procedures.
[...]
III) Special Applications - [...]
IV) Annals of Statistical Data Science [...]