Pursuing sparsity and homogeneity for multi-source high-dimensional current status data
Authors: Xin Ye, Yanyan Liu
Journal: Journal of Statistical Planning and Inference, volume 239, Article 106293
Published: 2025-04-23
DOI: 10.1016/j.jspi.2025.106293
URL: https://www.sciencedirect.com/science/article/pii/S037837582500031X
Citation count: 0
Abstract
Nowadays, current status data with high-dimensional predictors are prevalent in observational studies. However, for a single study, the high dimensionality and the presence of censoring pose substantial challenges to statistical analysis with limited sample size. Although integrative analysis has been widely regarded as an effective strategy to improve the estimation, the source-level heterogeneity has to be carefully addressed. In this paper, we propose an integrative analysis method for multi-source high-dimensional current status data, which can simultaneously identify the homogeneity/heterogeneity structure and select important variables. We prove that the proposed approach attains consistency in estimation, sparsity recovery, and the pursuit of homogeneity. Extensive simulation studies have been carried out to assess the finite sample performance of the proposed method. A real data analysis of multi-source ovarian cancer recurrence studies further demonstrates its practical applicability.
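To make the data structure concrete, here is a minimal sketch of what current status data look like: the event time itself is never observed, only a single monitoring time and an indicator of whether the event had already occurred by then. Everything in the block is an illustrative assumption and not the authors' model or simulation design: the function name `simulate_current_status`, the exponential event times with a log-linear rate, the uniform monitoring times on [0, 2], and the example coefficient vectors are all hypothetical.

```python
import math
import random


def simulate_current_status(n, beta, seed=0):
    """Simulate n current status observations (c, delta, x).

    Assumed (hypothetical) model: the latent event time T is exponential
    with rate exp(x . beta). Only the monitoring time c and the indicator
    delta = 1 if T <= c else 0 are returned; T itself is discarded.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in beta]          # covariates
        rate = math.exp(sum(b * xi for b, xi in zip(beta, x)))
        t = rng.expovariate(rate)                        # latent, unobserved
        c = rng.uniform(0.0, 2.0)                        # monitoring time
        delta = 1 if t <= c else 0                       # current status
        data.append((c, delta, x))
    return data


# Hypothetical multi-source setup: sparse coefficient vectors in which
# sources A and B share the same coefficients (homogeneous), while source C
# differs in the first coefficient (heterogeneous).
source_betas = {
    "A": [0.8, 0.0, 0.0, -0.5, 0.0],
    "B": [0.8, 0.0, 0.0, -0.5, 0.0],
    "C": [-0.8, 0.0, 0.0, -0.5, 0.0],
}
sources = {
    name: simulate_current_status(100, beta, seed=i)
    for i, (name, beta) in enumerate(source_betas.items())
}
```

In this toy setup, sparsity means that most entries of each coefficient vector are zero, and homogeneity means that some sources share identical coefficients. An integrative method of the kind described in the abstract would aim to recover both patterns from `(c, delta, x)` alone.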
About the journal:
The Journal of Statistical Planning and Inference offers itself as a multifaceted and all-inclusive bridge between classical aspects of statistics and probability, and the emerging interdisciplinary aspects that have a potential of revolutionizing the subject. While we maintain our traditional strength in statistical inference, design, classical probability, and large sample methods, we also have a far more inclusive and broadened scope to keep up with the new problems that confront us as statisticians, mathematicians, and scientists.
We publish high-quality articles in all branches of statistics, probability, discrete mathematics, machine learning, and bioinformatics. We also especially welcome well-written and up-to-date review articles on fundamental themes of statistics, probability, machine learning, and general biostatistics. Thoughtful letters to the editors, interesting problems in need of a solution, and short notes carrying an element of elegance or beauty are equally welcome.