Annals of Applied Statistics: Latest Articles

ANOPOW FOR REPLICATED NONSTATIONARY TIME SERIES IN EXPERIMENTS.
IF 1.8, Zone 4, Mathematics
Annals of Applied Statistics Pub Date: 2024-03-01 Epub Date: 2024-01-31 DOI: 10.1214/23-aoas1791
Volume 18, Issue 1, pp. 328-349
Zeda Li, Yu Ryan Yue, Scott A Bruce
Abstract: We propose a novel analysis of power (ANOPOW) model for analyzing replicated nonstationary time series commonly encountered in experimental studies. Based on a locally stationary ANOPOW Cramér spectral representation, the proposed model can be used to compare the second-order time-varying frequency patterns among different groups of time series and to estimate group effects as functions of both time and frequency. Formulated in a Bayesian framework, independent two-dimensional second-order random walk (RW2D) priors are assumed on each of the time-varying functional effects for flexible and adaptive smoothing. A piecewise stationary approximation of the nonstationary time series is used to obtain localized estimates of time-varying spectra. Posterior distributions of the time-varying functional group effects are then obtained via integrated nested Laplace approximations (INLA) at a low computational cost. The large-sample distribution of local periodograms can be appropriately utilized to improve estimation accuracy since INLA allows modeling of data with various types of distributions. The usefulness of the proposed model is illustrated through two real data applications: analyses of seismic signals and pupil diameter time series in children with attention deficit hyperactivity disorder. Simulation studies, Supplementary Materials (Li, Yue and Bruce, 2023a), and R code (Li, Yue and Bruce, 2023b) for this article are also available.
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10906746/pdf/
Citations: 0
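The piecewise stationary approximation mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' R code: the function name `local_periodograms`, the non-overlapping block scheme, and the toy series are assumptions made purely for illustration, and the Bayesian smoothing step (RW2D priors, INLA) is not shown.

```python
import cmath
import math


def local_periodograms(x, block_len):
    """Split x into non-overlapping blocks and return one periodogram per block.

    Each periodogram is evaluated at the Fourier frequencies k / block_len
    for k = 1, ..., block_len // 2 (cycles per sample). Leftover samples at
    the end of the series are dropped.
    """
    n_blocks = len(x) // block_len
    result = []
    for b in range(n_blocks):
        block = x[b * block_len:(b + 1) * block_len]
        mean = sum(block) / block_len
        centered = [v - mean for v in block]
        pgram = []
        for k in range(1, block_len // 2 + 1):
            # Naive discrete Fourier transform at frequency k / block_len.
            dft = sum(
                v * cmath.exp(-2j * math.pi * k * t / block_len)
                for t, v in enumerate(centered)
            )
            pgram.append(abs(dft) ** 2 / block_len)
        result.append(pgram)
    return result


# Toy nonstationary series: the dominant frequency changes halfway through.
series = [math.cos(2 * math.pi * 2 * t / 32) for t in range(32)] + \
         [math.cos(2 * math.pi * 8 * t / 32) for t in range(32)]
pgrams = local_periodograms(series, block_len=32)

# Index (1-based) of the strongest Fourier frequency in each block.
peaks = [max(range(len(p)), key=p.__getitem__) + 1 for p in pgrams]
```

Each block is treated as approximately stationary, so the sequence of block periodograms gives a rough time-frequency picture that a smoothing model could then refine.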
LAND-USE FILTERING FOR NONSTATIONARY SPATIAL PREDICTION OF COLLECTIVE EFFICACY IN AN URBAN ENVIRONMENT.
IF 1.8, Zone 4, Mathematics
Annals of Applied Statistics Pub Date: 2024-03-01 Epub Date: 2024-01-31 DOI: 10.1214/23-aoas1813
Volume 18, Issue 1, pp. 794-818
J Brandon Carter, Christopher R Browning, Bethany Boettner, Nicolo Pinchak, Catherine A Calder
Abstract: Collective efficacy, the capacity of communities to exert social control toward the realization of their shared goals, is a foundational concept in the urban sociology and neighborhood effects literature. Traditionally, empirical studies of collective efficacy use large sample surveys to estimate collective efficacy of different neighborhoods within an urban setting. Such studies have demonstrated an association between collective efficacy and local variation in community violence, educational achievement, and health. Unlike traditional collective efficacy measurement strategies, the Adolescent Health and Development in Context (AHDC) Study implemented a new approach, obtaining spatially-referenced, place-based ratings of collective efficacy from a representative sample of individuals residing in Columbus, OH. In this paper we introduce a novel nonstationary spatial model for interpolation of the AHDC collective efficacy ratings across the study area, which leverages administrative data on land use. Our constructive model specification strategy involves dimension expansion of a latent spatial process and the use of a filter defined by the land-use partition of the study region to connect the latent multivariate spatial process to the observed ordinal ratings of collective efficacy. Careful consideration is given to the issues of parameter identifiability, computational efficiency of an MCMC algorithm for model fitting, and fine-scale spatial prediction of collective efficacy.
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11146085/pdf/
Citations: 0
RETROSPECTIVE VARYING COEFFICIENT ASSOCIATION ANALYSIS OF LONGITUDINAL BINARY TRAITS: APPLICATION TO THE IDENTIFICATION OF GENETIC LOCI ASSOCIATED WITH HYPERTENSION.
IF 1.3, Zone 4, Mathematics
Annals of Applied Statistics Pub Date: 2024-03-01 Epub Date: 2024-01-31 DOI: 10.1214/23-aoas1798
Volume 18, Issue 1, pp. 487-505
Gang Xu, Amei Amei, Weimiao Wu, Yunqing Liu, Linchuan Shen, Edwin C Oh, Zuoheng Wang
Abstract: Many genetic studies contain rich information on longitudinal phenotypes that require powerful analytical tools for optimal analysis. Genetic analysis of longitudinal data that incorporates temporal variation is important for understanding the genetic architecture and biological variation of complex diseases. Most of the existing methods assume that the contribution of genetic variants is constant over time and fail to capture the dynamic pattern of disease progression. However, the relative influence of genetic variants on complex traits fluctuates over time. In this study, we propose a retrospective varying coefficient mixed model association test, RVMMAT, to detect time-varying genetic effect on longitudinal binary traits. We model dynamic genetic effect using smoothing splines, estimate model parameters by maximizing a double penalized quasi-likelihood function, design a joint test using a Cauchy combination method, and evaluate statistical significance via a retrospective approach to achieve robustness to model misspecification. Through simulations we illustrated that the retrospective varying-coefficient test was robust to model misspecification under different ascertainment schemes and gained power over the association methods assuming constant genetic effect. We applied RVMMAT to a genome-wide association analysis of longitudinal measure of hypertension in the Multi-Ethnic Study of Atherosclerosis. Pathway analysis identified two important pathways related to G-protein signaling and DNA damage. Our results demonstrated that RVMMAT could detect biologically relevant loci and pathways in a genome scan and provided insight into the genetic architecture of hypertension.
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10994004/pdf/
Citations: 0
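The abstract above mentions a joint test built with a Cauchy combination method. The following is a minimal sketch of the generic Cauchy combination rule for merging p-values, not the RVMMAT implementation; the function name `cauchy_combination` and the default equal weights are assumptions for illustration. Weights are taken to be nonnegative and to sum to one, and p-values are assumed to lie strictly between 0 and 1.

```python
import math


def cauchy_combination(p_values, weights=None):
    """Combine p-values by mapping each to a standard Cauchy variate.

    The statistic is T = sum_i w_i * tan((0.5 - p_i) * pi); the combined
    p-value is the upper tail of a standard Cauchy distribution at T,
    which equals 0.5 - atan(T) / pi.
    """
    if weights is None:
        # Equal weights are an illustrative default, not taken from the paper.
        weights = [1.0 / len(p_values)] * len(p_values)
    stat = sum(w * math.tan((0.5 - p) * math.pi)
               for w, p in zip(weights, p_values))
    return 0.5 - math.atan(stat) / math.pi


# Combining identical p-values returns that same p-value.
same = cauchy_combination([0.05, 0.05, 0.05])

# One very small p-value dominates the combination.
mixed = cauchy_combination([0.001, 0.5, 0.9])
```

Because the tangent transform is heavy-tailed, the smallest p-values drive the result, which is one reason this rule is convenient for joint tests over several components.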
COMPOSITE SCORES FOR TRANSPLANT CENTER EVALUATION: A NEW INDIVIDUALIZED EMPIRICAL NULL METHOD.
IF 1.3, Zone 4, Mathematics
Annals of Applied Statistics Pub Date: 2024-03-01 Epub Date: 2024-01-31 DOI: 10.1214/23-aoas1809
Volume 18, Issue 1, pp. 729-748
Nicholas Hartman, Joseph M Messana, Jian Kang, Abhijit S Naik, Tempie H Shearon, Kevin He
Abstract: Risk-adjusted quality measures are used to evaluate healthcare providers with respect to national norms while controlling for factors beyond their control. Existing healthcare provider profiling approaches typically assume that the between-provider variation in these measures is entirely due to meaningful differences in quality of care. However, in practice, much of the between-provider variation will be due to trivial fluctuations in healthcare quality, or unobservable confounding risk factors. If these additional sources of variation are not accounted for, conventional methods will disproportionately identify larger providers as outliers, even though their departures from the national norms may not be "extreme" or clinically meaningful. Motivated by efforts to evaluate the quality of care provided by transplant centers, we develop a composite evaluation score based on a novel individualized empirical null method, which robustly accounts for overdispersion due to unobserved risk factors, models the marginal variance of standardized scores as a function of the effective sample size, and only requires the use of publicly-available center-level statistics. The evaluations of United States kidney transplant centers based on the proposed composite score are substantially different from those based on conventional methods. Simulations show that the proposed empirical null approach more accurately classifies centers in terms of quality of care, compared to existing methods.
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11395314/pdf/
Citations: 0
A RIEMANN MANIFOLD MODEL FRAMEWORK FOR LONGITUDINAL CHANGES IN PHYSICAL ACTIVITY PATTERNS.
IF 1.8, Zone 4, Mathematics
Annals of Applied Statistics Pub Date: 2023-12-01 Epub Date: 2023-10-30 DOI: 10.1214/23-aoas1758
Volume 17, Issue 4, pp. 3216-3240
Jingjing Zou, Tuo Lin, Chongzhi Di, John Bellettiere, Marta M Jankowska, Sheri J Hartman, Dorothy D Sears, Andrea Z LaCroix, Cheryl L Rock, Loki Natarajan
Abstract: Physical activity (PA) is significantly associated with many health outcomes. The wide usage of wearable accelerometer-based activity trackers in recent years has provided a unique opportunity for in-depth research on PA and its relations with health outcomes and interventions. Past analysis of activity tracker data relies heavily on aggregating minute-level PA records into day-level summary statistics in which important information of PA temporal/diurnal patterns is lost. In this paper we propose a novel functional data analysis approach based on Riemann manifolds for modeling PA and its longitudinal changes. We model smoothed minute-level PA of a day as one-dimensional Riemann manifolds and longitudinal changes in PA in different visits as deformations between manifolds. The variability in changes of PA among a cohort of subjects is characterized via variability in the deformation. Functional principal component analysis is further adopted to model the deformations, and PC scores are used as a proxy in modeling the relation between changes in PA and health outcomes and/or interventions. We conduct comprehensive analyses on data from two clinical trials: Reach for Health (RfH) and Metabolism, Exercise and Nutrition at UCSD (MENU), focusing on the effect of interventions on longitudinal changes in PA patterns and how different modes of changes in PA influence weight loss, respectively. The proposed approach reveals unique modes of changes, including overall enhanced PA, boosted morning PA, and shifts of active hours specific to each study cohort. The results bring new insights into the study of longitudinal changes in PA and health and have the potential to facilitate designing of effective health interventions and guidelines.
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11149895/pdf/
Citations: 0
PAIRWISE NONLINEAR DEPENDENCE ANALYSIS OF GENOMIC DATA.
IF 1.3, Zone 4, Mathematics
Annals of Applied Statistics Pub Date: 2023-12-01 Epub Date: 2023-10-30 DOI: 10.1214/23-aoas1745
Volume 17, Issue 4, pp. 2924-2943
Siqi Xiang, Wan Zhang, Siyao Liu, Katherine A Hoadley, Charles M Perou, Kai Zhang, J S Marron
Abstract: In The Cancer Genome Atlas (TCGA) data set, there are many interesting nonlinear dependencies between pairs of genes that reveal important relationships and subtypes of cancer. Such genomic data analysis requires a rapid, powerful and interpretable detection process, especially in a high-dimensional environment. We study the nonlinear patterns among the expression of pairs of genes from TCGA using a powerful tool called Binary Expansion Testing. We find many nonlinear patterns, some of which are driven by known cancer subtypes, some of which are novel.
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10688600/pdf/
Citations: 0
Binned multinomial logistic regression for integrative cell-type annotation.
IF 1.3, Zone 4, Mathematics
Annals of Applied Statistics Pub Date: 2023-12-01 DOI: 10.1214/23-aoas1769
Volume 17, Issue 4, pp. 3426-3449
Keshav Motwani, Rhonda Bacher, Aaron J Molstad
Abstract: Categorizing individual cells into one of many known cell type categories, also known as cell type annotation, is a critical step in the analysis of single-cell genomics data. The current process of annotation is time-intensive and subjective, which has led to different studies describing cell types with labels of varying degrees of resolution. While supervised learning approaches have provided automated solutions to annotation, there remains a significant challenge in fitting a unified model for multiple datasets with inconsistent labels. In this article, we propose a new multinomial logistic regression estimator which can be used to model cell type probabilities by integrating multiple datasets with labels of varying resolution. To compute our estimator, we solve a nonconvex optimization problem using a blockwise proximal gradient descent algorithm. We show through simulation studies that our approach estimates cell type probabilities more accurately than competitors in a wide variety of scenarios. We apply our method to ten single-cell RNA-seq datasets and demonstrate its utility in predicting fine resolution cell type labels on unlabeled data as well as refining cell type labels on data with existing coarse resolution annotations. Finally, we demonstrate that our method can lead to novel scientific insights in the context of a differential expression analysis comparing peripheral blood gene expression before and after treatment with interferon-β. An R package implementing the method is available at https://github.com/keshav-motwani/IBMR and the collection of datasets we analyze is available at https://github.com/keshav-motwani/AnnotatedPBMC.
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11981643/pdf/
Citations: 0
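One way to read "labels of varying resolution" in the abstract above is that a coarse label stands for a bin of fine-resolution cell types, so its probability is the sum of the fine-type probabilities in that bin. The sketch below illustrates only that idea and is not the IBMR package's API: the function names, the four cell-type labels, and the scores are hypothetical, and the penalized estimation step is not shown.

```python
import math


def softmax(scores):
    """Convert real-valued scores into multinomial probabilities."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def binned_log_likelihood(scores, bin_members):
    """Log-probability that the true fine type falls in the given bin.

    `bin_members` lists the indices of the fine-resolution categories that
    are consistent with the observed (possibly coarse) label.
    """
    probs = softmax(scores)
    return math.log(sum(probs[k] for k in bin_members))


# Hypothetical fine-resolution categories and linear-predictor scores.
FINE_TYPES = ["CD4 T", "CD8 T", "B", "NK"]
scores = [2.0, 1.0, 0.0, -1.0]

fine_ll = binned_log_likelihood(scores, [0])       # label "CD4 T"
coarse_ll = binned_log_likelihood(scores, [0, 1])  # coarse label "T cell"
```

A cell annotated only as "T cell" then contributes the log of the summed CD4 T and CD8 T probabilities, so datasets with coarse and fine annotations can share one set of fine-resolution parameters.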
TARGETING UNDERREPRESENTED POPULATIONS IN PRECISION MEDICINE: A FEDERATED TRANSFER LEARNING APPROACH.
IF 1.3, Zone 4, Mathematics
Annals of Applied Statistics Pub Date: 2023-12-01 Epub Date: 2023-10-30 DOI: 10.1214/23-AOAS1747
Volume 17, Issue 4, pp. 2970-2992
Sai Li, Tianxi Cai, Rui Duan
Abstract: The limited representation of minorities and disadvantaged populations in large-scale clinical and genomics research poses a significant barrier to translating precision medicine research into practice. Prediction models are likely to underperform in underrepresented populations due to heterogeneity across populations, thereby exacerbating known health disparities. To address this issue, we propose FETA, a two-way data integration method that leverages a federated transfer learning approach to integrate heterogeneous data from diverse populations and multiple healthcare institutions, with a focus on a target population of interest having limited sample sizes. We show that FETA achieves performance comparable to the pooled analysis, where individual-level data is shared across institutions, with only a small number of communications across participating sites. Our theoretical analysis and simulation study demonstrate how FETA's estimation accuracy is influenced by communication budgets, privacy restrictions, and heterogeneity across populations. We apply FETA to multisite data from the electronic Medical Records and Genomics (eMERGE) Network to construct genetic risk prediction models for extreme obesity. Compared to models trained using target data only, source data only, and all data without accounting for population-level differences, FETA shows superior predictive performance. FETA has the potential to improve estimation and prediction accuracy in underrepresented populations and reduce the gap in model performance across populations.
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11417462/pdf/
Citations: 0
ADDRESSING SELECTION BIAS AND MEASUREMENT ERROR IN COVID-19 CASE COUNT DATA USING AUXILIARY INFORMATION.
IF 1.3, Zone 4, Mathematics
Annals of Applied Statistics Pub Date: 2023-12-01 Epub Date: 2023-10-30 DOI: 10.1214/23-aoas1744
Volume 17, Issue 4, pp. 2903-2923
Walter Dempsey
Abstract: Coronavirus case-count data has influenced government policies and drives most epidemiological forecasts. Limited testing is cited as the key driver behind minimal information on the COVID-19 pandemic. While expanded testing is laudable, measurement error and selection bias are the two greatest problems limiting our understanding of the COVID-19 pandemic; neither can be fully addressed by increased testing capacity. In this paper, we demonstrate their impact on estimation of point prevalence and the effective reproduction number. We show that estimates based on the millions of molecular tests in the US have the same mean square error as a small simple random sample. To address this, a procedure is presented that combines case-count data and random samples over time to estimate selection propensities based on key covariate information. We then combine these selection propensities with epidemiological forecast models to construct a doubly robust estimation method that accounts for both measurement error and selection bias. This method is then applied to estimate Indiana's active infection prevalence using case-count, hospitalization, and death data with demographic information, a statewide random molecular sample collected from April 25-29th, and Delphi's COVID-19 Trends and Impact Survey. We end with a series of recommendations based on the proposed methodology.
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11210953/pdf/
Citations: 0
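The abstract above combines selection propensities with model-based predictions in a doubly robust estimator. As a rough illustration of the general idea only, the sketch below implements the textbook augmented inverse-propensity-weighted (AIPW) estimator of a population mean; it is not the paper's procedure, it ignores measurement error, and the function name and toy numbers are hypothetical.

```python
def doubly_robust_mean(outcomes, selected, propensities, predictions):
    """Augmented inverse-propensity-weighted estimate of a population mean.

    outcomes[i]     observed outcome for unit i (ignored when not selected)
    selected[i]     True if unit i was tested / sampled
    propensities[i] assumed probability that unit i is selected
    predictions[i]  outcome-model prediction for unit i

    The estimate is consistent if either the propensities or the
    predictions are correctly specified.
    """
    n = len(selected)
    total = 0.0
    for y, s, pi, m in zip(outcomes, selected, propensities, predictions):
        total += m                 # model-based prediction for every unit
        if s:
            total += (y - m) / pi  # weighted residual correction
    return total / n


# Toy population of four units, two of which were tested (hypothetical data).
outcomes = [1, None, 0, None]
selected = [True, False, True, False]
propensities = [0.5, 0.5, 0.5, 0.5]
predictions = [0.6, 0.6, 0.2, 0.2]

estimate = doubly_robust_mean(outcomes, selected, propensities, predictions)
```

The prediction term supplies a value for every unit, while the weighted residuals of the selected units correct the predictions where observations exist.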
GENERALIZED MATRIX DECOMPOSITION REGRESSION: ESTIMATION AND INFERENCE FOR TWO-WAY STRUCTURED DATA.
IF 1.3, Zone 4, Mathematics
Annals of Applied Statistics Pub Date: 2023-12-01 Epub Date: 2023-10-30 DOI: 10.1214/23-aoas1746
Volume 17, Issue 4, pp. 2944-2969
Yue Wang, Ali Shojaie, Timothy Randolph, Parker Knight, Jing Ma
Abstract: Motivated by emerging applications in ecology, microbiology, and neuroscience, this paper studies high-dimensional regression with two-way structured data. To estimate the high-dimensional coefficient vector, we propose the generalized matrix decomposition regression (GMDR) to efficiently leverage auxiliary information on row and column structures. GMDR extends the principal component regression (PCR) to two-way structured data, but unlike PCR, GMDR selects the components that are most predictive of the outcome, leading to more accurate prediction. For inference on regression coefficients of individual variables, we propose the generalized matrix decomposition inference (GMDI), a general high-dimensional inferential framework for a large family of estimators that include the proposed GMDR estimator. GMDI provides more flexibility for incorporating relevant auxiliary row and column structures. As a result, GMDI does not require the true regression coefficients to be sparse, but constrains the coordinate system representing the regression coefficients according to the column structure. GMDI also allows dependent and heteroscedastic observations. We study the theoretical properties of GMDI in terms of both the type-I error rate and power and demonstrate the effectiveness of GMDR and GMDI in simulation studies and an application to human microbiome data.
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10751029/pdf/
Citations: 0