{"title":"On Construction of Nonregular Two-Level Factorial Designs With Maximum Generalized Resolutions","authors":"Chenlu Shi, Boxin Tang","doi":"10.5705/ss.202021.0024","DOIUrl":"https://doi.org/10.5705/ss.202021.0024","url":null,"abstract":"","PeriodicalId":49478,"journal":{"name":"Statistica Sinica","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135182934","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Nonparametric Bayesian Two-Level Clustering for Subject-Level Single-Cell Expression Data","authors":"Qiuyu Wu, Xiangyu Luo","doi":"10.5705/ss.202020.0337","DOIUrl":"https://doi.org/10.5705/ss.202020.0337","url":null,"abstract":"The advent of single-cell sequencing opens new avenues for personalized treatment. In this paper, we address a two-level clustering problem of simultaneous subject subgroup discovery (subject level) and cell type detection (cell level) for single-cell expression data from multiple subjects. However, current statistical approaches either cluster cells without considering the subject heterogeneity or group subjects without using the single-cell information. To bridge the gap between cell clustering and subject grouping, we develop a nonparametric Bayesian model, Subject and Cell clustering for Single-Cell expression data (SCSC) model, to achieve subject and cell grouping simultaneously. SCSC does not need to prespecify the subject subgroup number or the cell type number. It automatically induces subject subgroup structures and matches cell types across subjects. Moreover, it directly models the single-cell raw count data by deliberately considering the data's dropouts, library sizes, and over-dispersion. A blocked Gibbs sampler is proposed for the posterior inference. Simulation studies and the application to a multi-subject iPSC scRNA-seq dataset validate the ability of SCSC to simultaneously cluster subjects and cells.","PeriodicalId":49478,"journal":{"name":"Statistica Sinica","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135181000","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HIGH-DIMENSIONAL FACTOR REGRESSION FOR HETEROGENEOUS SUBPOPULATIONS.","authors":"Peiyao Wang, Quefeng Li, Dinggang Shen, Yufeng Liu","doi":"10.5705/ss.202020.0145","DOIUrl":"10.5705/ss.202020.0145","url":null,"abstract":"In modern scientific research, data heterogeneity is commonly observed owing to the abundance of complex data. We propose a factor regression model for data with heterogeneous subpopulations. The proposed model can be represented as a decomposition of heterogeneous and homogeneous terms. The heterogeneous term is driven by latent factors in different subpopulations. The homogeneous term captures common variation in the covariates and shares common regression coefficients across subpopulations. Our proposed model attains a good balance between a global model and a group-specific model. The global model ignores the data heterogeneity, while the group-specific model fits each subgroup separately. We prove the estimation and prediction consistency for our proposed estimators, and show that it has better convergence rates than those of the group-specific and global models. We show that the extra cost of estimating latent factors is asymptotically negligible and the minimax rate is still attainable. We further demonstrate the robustness of our proposed method by studying its prediction error under a mis-specified group-specific model. Finally, we conduct simulation studies and analyze a data set from the Alzheimer's Disease Neuroimaging Initiative and an aggregated microarray data set to further demonstrate the competitiveness and interpretability of our proposed factor regression model.","PeriodicalId":49478,"journal":{"name":"Statistica Sinica","volume":"33 1","pages":"27-53"},"PeriodicalIF":1.4,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10583735/pdf/nihms-1892524.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49684205","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Empirical Likelihood Using External Summary Information","authors":"Lyu Ni, Junchao Shao, Jinyi Wang, Lei Wang","doi":"10.5705/ss.202023.0056","DOIUrl":"https://doi.org/10.5705/ss.202023.0056","url":null,"abstract":"Statistical analysis in modern scientific research nowadays has opportunities to utilize external summary information from similar studies to gain efficiency. However, the population generating data for current study, referred to as internal population, is typically different from the external population for summary information, although they share some common characteristics that make efficiency improvement possible. The existing population heterogeneity is a challenging issue especially when we have only summary statistics but not individual-level external data. In this paper, we apply an empirical likelihood approach to estimating internal population distribution, with external summary information utilized as constraints for efficiency gain under population heterogeneity. We show that our approach produces an asymptotically more efficient estimator of internal population distribution compared with the customary empirical likelihood without using any external information, under the condition that the external information is based on a dataset with size larger than that","PeriodicalId":49478,"journal":{"name":"Statistica Sinica","volume":"1 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"70939888","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Interval estimation for operating characteristic of continuous biomarkers with controlled sensitivity or specificity.","authors":"Yijian Huang, Isaac Parakati, Dattatraya H Patil, Martin G Sanda","doi":"10.5705/ss.202021.0020","DOIUrl":"10.5705/ss.202021.0020","url":null,"abstract":"The receiver operating characteristic (ROC) curve provides a comprehensive performance assessment of a continuous biomarker over the full threshold spectrum. Nevertheless, a medical test often dictates to operate at a certain high level of sensitivity or specificity. A diagnostic accuracy metric directly targeting the clinical utility is specificity at the controlled sensitivity level, or vice versa. While the empirical point estimation is readily adopted in practice, the nonparametric interval estimation is challenged by the fact that the variance involves density functions due to estimated threshold. In addition, even with a fixed threshold, many standard confidence intervals including the Wald interval for binomial proportion could have erratic behaviors. In this article, we are motivated by the superior performance of the score interval for binomial proportion and propose a novel extension for the biomarker problem. Meanwhile, we develop exact bootstrap and establish consistency of the bootstrap variance estimator. Both single-biomarker evaluation and two-biomarker comparison are investigated. Extensive simulation studies were conducted, demonstrating competitive performance of our proposals. An illustration with aggressive prostate cancer diagnosis is provided.","PeriodicalId":49478,"journal":{"name":"Statistica Sinica","volume":"33 1","pages":"193-214"},"PeriodicalIF":1.4,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10181819/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9485519","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High-Dimensional Behaviour of Some Two-Sample Tests Based on Ball Divergence","authors":"Bilol Banerjee, A. Ghosh","doi":"10.5705/ss.202023.0069","DOIUrl":"https://doi.org/10.5705/ss.202023.0069","url":null,"abstract":"","PeriodicalId":49478,"journal":{"name":"Statistica Sinica","volume":"1 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"70939954","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint Modeling of Change-Point Identification and Dependent Dynamic Community Detection","authors":"Diqing Li, Yubai Yuan, Xinsheng Zhang, Annie Qu","doi":"10.5705/ss.202021.0182","DOIUrl":"https://doi.org/10.5705/ss.202021.0182","url":null,"abstract":"The field of dynamic network analysis has recently seen a surge of interest in community detection and evolution. However, existing methods for dynamic community detection do not consider dependencies between edges, which could lead to a loss of information when detecting community structures. In this study, we investigate the problem of identifying a change-point with abrupt changes in the community structure of a network. To do so, we propose an approximate likelihood approach for the change-point estimator and for identifying node membership that integrates marginal information and dependencies of network connectivities. We propose an expectation-maximization-type algorithm that maximizes the approximate likelihood jointly over change-point and community membership evolution. From a theoretical viewpoint, we establish estimation consistency under the regularity condition, and show that the proposed estimators achieve a higher convergence rate than those of their marginal likelihood counterparts, which do not incorporate dependencies between edges. We demonstrate the validity of the proposed method by applying it to the ADHD-200 data set to detect brain functional community changes over time.","PeriodicalId":49478,"journal":{"name":"Statistica Sinica","volume":"1 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"70937729","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Identifiability of Copula Models for Dependent Competing Risks Data With Exponentially Distributed Margins","authors":"Antai Wang","doi":"10.5705/ss.202020.0520","DOIUrl":"https://doi.org/10.5705/ss.202020.0520","url":null,"abstract":"","PeriodicalId":49478,"journal":{"name":"Statistica Sinica","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135182938","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Construction Method for Maximin L1-Distance Latin Hypercube Designs","authors":"Ru Yuan, Yuhao Yin, Hongquan Xu, Min-Qian Liu","doi":"10.5705/ss.202022.0263","DOIUrl":"https://doi.org/10.5705/ss.202022.0263","url":null,"abstract":"A Construction Method for Maximin L1-Distance Latin Hypercube Designs","PeriodicalId":49478,"journal":{"name":"Statistica Sinica","volume":"1 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"70938819","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}