D. Pellin, L. Biasco, Serena Scala, C. Di Serio, E. Wit
{"title":"Tracking hematopoietic stem cell evolution in a Wiskott–Aldrich clinical trial","authors":"D. Pellin, L. Biasco, Serena Scala, C. Di Serio, E. Wit","doi":"10.1214/22-aoas1686","DOIUrl":"https://doi.org/10.1214/22-aoas1686","url":null,"abstract":"Hematopoietic Stem Cells (HSC) are the cells that give rise to all other blood cells and, as such, they are crucial in the healthy development of individuals. Wiskott-Aldrich Syndrome (WAS) is a severe disorder affecting the regulation of hematopoietic cells and is caused by mutations in the WASP gene. We consider data from a revolutionary gene therapy clinical trial, where HSC harvested from 3 WAS patients’ bone marrow have been edited and corrected using viral vectors. Upon re-infusion into the patient, the HSC multiply and differentiate into other cell types. The aim is to unravel the cell multiplication and cell differentiation process, which has until now remained elusive. This paper models the replenishment of blood lineages resulting from corrected HSC via a multivariate, density-dependent Markov process and develops an inferential procedure to estimate the dynamic parameters given a set of temporally sparsely observed trajectories. Starting from the master equation, we derive a system of non-linear differential equations for the evolution of the first- and second-order moments over time. We use these moment equations in a generalized method-of-moments framework to perform inference. The performance of our proposal has been evaluated by considering different sampling scenarios and measurement errors of various strengths using a simulation study. We also compared it to another state-of-the-art approach and found that our method is statistically more efficient. 
By applying our method to the WAS gene therapy data we found strong evidence for a myeloid-based developmental","PeriodicalId":188068,"journal":{"name":"The Annals of Applied Statistics","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121624235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mixed-frequency extreme value regression: Estimating the effect of mesoscale convective systems on extreme rainfall intensity","authors":"D. Dupuis, L. Trapin","doi":"10.1214/22-aoas1675","DOIUrl":"https://doi.org/10.1214/22-aoas1675","url":null,"abstract":"","PeriodicalId":188068,"journal":{"name":"The Annals of Applied Statistics","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129771677","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jung-Yeon Won, Michael R. Elliott, Emma V. Sanchez-Vaznaugh, Brisa N. Sánchez
{"title":"Integrating multiple built environment data sources","authors":"Jung-Yeon Won, Michael R. Elliott, Emma V. Sanchez-Vaznaugh, Brisa N. Sánchez","doi":"10.1214/22-aoas1692","DOIUrl":"https://doi.org/10.1214/22-aoas1692","url":null,"abstract":"","PeriodicalId":188068,"journal":{"name":"The Annals of Applied Statistics","volume":"201 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133319752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A tensor decomposition model for longitudinal microbiome studies","authors":"Siyuan Ma, Hongzhe Li","doi":"10.1214/22-aoas1661","DOIUrl":"https://doi.org/10.1214/22-aoas1661","url":null,"abstract":"Longitudinal microbiome studies can help delineate true biological signals from the high interindividual variability that is common in microbiome data. However, there are few methods available for unsupervised dimension reduction of time course microbial abundance observations. Existing methods do not fully observe the distribution characteristics of such data types, namely, zero-inflation, compositionality, and overdispersion. We present a tensor decomposition model and a semiparametric quasi-likelihood estimation method for the decomposition of longitudinal microbiome data, by generalizing existing approaches in tensor decomposition of Gaussian data. Optimization is performed through projected gradient descent additionally allowing interpretability constraints. We show through simulation studies our method is able to recover low rank structures from microbiome time course data, better than existing approaches. Lastly, we apply our method to two existing longitudinal microbiome studies, to detect global microbial changes associated with dietary and pharmaceutical effects, as well as infant birth modes.","PeriodicalId":188068,"journal":{"name":"The Annals of Applied Statistics","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129663135","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Leveraging Hardy–Weinberg disequilibrium for association testing in case-control studies","authors":"Lin Zhang, Lisa J. Strug, Lei Sun","doi":"10.1214/22-aoas1695","DOIUrl":"https://doi.org/10.1214/22-aoas1695","url":null,"abstract":"Modern genome-wide association studies (GWAS) remove single nucleotide polymorphisms (SNPs) that are in Hardy–Weinberg disequilibrium (HWD), despite limited rigor for this practice. In a case-control GWAS, although HWD in the control sample is evidence of genotyping error, a truly associated SNP may be in HWD in the case and/or control populations. We, therefore, develop a new case-control association test that: (i) leverages HWD attributed to true association to increase power, (ii) is robust to HWD caused by genotyping error, and (iii) is easy to implement at the genome-wide level. The proposed robust allele-based joint test incorporates the difference in HWD between the case and control samples into the traditional association measure to gain power. We provide the asymptotic distribution of the proposed test statistic under the null hypothesis. We evaluate its type 1 error control at the genome-wide significance level of 5×10⁻⁸ in the presence of HWD attributed to factors unrelated to phenotype-genotype association, such as genotyping error. 
Finally, we demonstrate that the power of the proposed allele-based joint test is higher than the standard association test for a variety of genetic models, through derivations of the noncentrality parameters of the tests, as well as simulation and application studies.","PeriodicalId":188068,"journal":{"name":"The Annals of Applied Statistics","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136178785","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A rotation-based feature and Bayesian hierarchical model for the forensic evaluation of handwriting evidence in a closed set","authors":"Amy M. Crawford, Danica M. Ommen, A. Carriquiry","doi":"10.1214/22-aoas1662","DOIUrl":"https://doi.org/10.1214/22-aoas1662","url":null,"abstract":"","PeriodicalId":188068,"journal":{"name":"The Annals of Applied Statistics","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114502557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Bayesian panel vector autoregression to analyze the impact of climate shocks on high-income economies","authors":"Florian Huber, Tamás Krisztin, Michael Pfarrhofer","doi":"10.1214/22-aoas1681","DOIUrl":"https://doi.org/10.1214/22-aoas1681","url":null,"abstract":"In this paper we assess the impact of climate shocks on futures markets for agricultural commodities and a set of macroeconomic quantities for multiple high-income economies. To capture relations among countries, markets, and climate shocks, this paper proposes parsimonious methods to estimate high-dimensional panel vector autoregressions. We assume that coefficients associated with domestic lagged endogenous variables arise from a Gaussian mixture model while further parsimony is achieved using suitable global-local shrinkage priors on several regions of the parameter space. Our results point toward pronounced global reactions of key macroeconomic quantities to climate shocks. Moreover, the empirical findings highlight substantial linkages between regionally located shocks and global commodity markets.","PeriodicalId":188068,"journal":{"name":"The Annals of Applied Statistics","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134104605","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The risk of maternal complications after cesarean delivery: Near-far matching for instrumental variables study designs with large observational datasets","authors":"Ruoqi Yu, R. Kelz, S. Lorch, Luke J. Keele","doi":"10.1214/22-aoas1691","DOIUrl":"https://doi.org/10.1214/22-aoas1691","url":null,"abstract":"Cesarean delivery is used when there are problems with the placenta or umbilical cord, for twin pregnancies, and breech births. However, research has found that Cesarean delivery increases the risk of maternal complications like blood transfusions and admission to the intensive care unit. Here, we study whether Cesarean delivery increases the risk of maternal complications using an instrumental variables study design to reduce bias from unobserved confounders. We use a variant of matching – near-far matching – to render our study design more plausible. In a near-far match, the investigator seeks to strengthen the effect of the instrument on the exposure while balancing observable characteristics between groups of subjects with low and high values of the instrument. Extant near-far matching methods are computationally intensive for large data sets, and computing time can be very lengthy. To reduce the computational complexity of near-far matching in large observational studies, we apply an iterative form of Glover’s algorithm for a doubly convex bipartite graph to determine an optimal reverse caliper for the instrument, which reduces the number of candidate matches and allows for an optimal match in a large but much sparser graph. We also incorporate a variety of balance constraints, including exact matching, fine and near-fine balance, and covariate balance prioritization. We illustrate this new matching method using medical claims data from Pennsylvania, New York, and Florida. In our application, we match on physician’s preferences for delivery via Cesarean section, which is the instrument in our study. 
We compare the computing time from our match to extant methods, and we find that we can reduce the computational time required for the match by more than 11 hours. If our matched sample came from a paired randomized experiment, we could conclude that Cesarean delivery elevates the risk of maternal complications and increases the time spent in the hospital. Sensitivity analysis shows that the estimates for complications could be the result of a minor amount of confounding due to an unobserved covariate. The effects on the length of stay outcome, however, are more insensitive to hidden confounders.","PeriodicalId":188068,"journal":{"name":"The Annals of Applied Statistics","volume":"128 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121439700","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimation and inference for exposure effects with latency in the Cox proportional hazards model in the presence of exposure measurement error","authors":"S. Peskoe, Ning Zhang, D. Spiegelman, Molin Wang","doi":"10.1214/22-aoas1682","DOIUrl":"https://doi.org/10.1214/22-aoas1682","url":null,"abstract":"","PeriodicalId":188068,"journal":{"name":"The Annals of Applied Statistics","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117245404","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xinyuan Chen, Yiwei Li, Xiangnan Feng, Joseph T. Chang
{"title":"Variational Bayesian analysis of nonhomogeneous hidden Markov models with long and ultralong sequences","authors":"Xinyuan Chen, Yiwei Li, Xiangnan Feng, Joseph T. Chang","doi":"10.1214/22-aoas1685","DOIUrl":"https://doi.org/10.1214/22-aoas1685","url":null,"abstract":"","PeriodicalId":188068,"journal":{"name":"The Annals of Applied Statistics","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128971064","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}