Biometrics. Pub Date: 2025-01-07. DOI: 10.1093/biomtc/ujae169
Pan Liu, Yaguang Li, Jialiang Li
{"title":"Change surface regression for nonlinear subgroup identification with application to warfarin pharmacogenomics data.","authors":"Pan Liu, Yaguang Li, Jialiang Li","doi":"10.1093/biomtc/ujae169","DOIUrl":"https://doi.org/10.1093/biomtc/ujae169","url":null,"abstract":"<p><p>Pharmacogenomics stands as a pivotal driver toward personalized medicine, aiming to optimize drug efficacy while minimizing adverse effects by uncovering the impact of genetic variations on inter-individual outcome variability. Despite its promise, the intricate landscape of drug metabolism introduces complexity, where the correlation between drug response and genes can be shaped by numerous nongenetic factors, often exhibiting heterogeneity across diverse subpopulations. This challenge is particularly pronounced in datasets such as the International Warfarin Pharmacogenetic Consortium (IWPC), which encompasses diverse patient information from multiple nations. To capture the between-patient heterogeneity in dosing requirement, we formulate a novel change surface model as a model-based approach for multiple subgroup identification in complex datasets. A key feature of our approach is its ability to accommodate nonlinear subgroup divisions, providing a clearer understanding of dynamic drug-gene associations. Furthermore, our model effectively handles high-dimensional data through a doubly penalized approach, ensuring both interpretability and adaptability. We propose an iterative 2-stage method that combines a change point detection technique in the first stage with a smoothed local adaptive majorize-minimization algorithm for surface regression in the second stage. Performance of the proposed methods is evaluated through extensive numerical studies. 
Application of our method to the IWPC dataset leads to significant new findings, where 3 subgroups subject to different pharmacogenomic relationships are identified, contributing valuable insights into the complex dynamics of drug-gene associations in patients.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142999226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics. Pub Date: 2025-01-07. DOI: 10.1093/biomtc/ujae163
Yi Cao, Pedro L Gozalo, Roee Gutman
{"title":"Causal inference with cross-temporal design.","authors":"Yi Cao, Pedro L Gozalo, Roee Gutman","doi":"10.1093/biomtc/ujae163","DOIUrl":"10.1093/biomtc/ujae163","url":null,"abstract":"<p><p>When many participants in a randomized trial do not comply with their assigned intervention, the randomized encouragement design is a possible solution. In this design, the causal effects of the intervention can be estimated among participants who would have experienced the intervention if encouraged. For many policy interventions, encouragements cannot be randomized and investigators need to rely on observational data. To address this, we propose a cross-temporal design, which uses time to mimic a randomized encouragement experiment. However, time may be confounded with temporal trends that influence the outcomes. To disentangle these trends from the intervention effects, we replace the commonly used exclusion restrictions with temporal assumptions. We develop Bayesian procedures to estimate the causal effects and compare them with instrumental variables and matching approaches in simulations. The Bayesian approach outperforms the other 2 approaches in terms of estimation accuracy, and it is relatively robust to various violations of the common trends assumption. 
Taking advantage of the expansion of the Medicare Advantage (MA) program between 2011 and 2017, we implement the proposed method to estimate the effects of MA enrollment on the risk of skilled nursing facility residents being re-hospitalized within 30 days after discharge from the hospital.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11725568/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142969461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics. Pub Date: 2025-01-07. DOI: 10.1093/biomtc/ujae164
Xin Chen, Hua Liu, Jiaqi Men, Jinhong You
{"title":"High-dimensional partially linear functional Cox models.","authors":"Xin Chen, Hua Liu, Jiaqi Men, Jinhong You","doi":"10.1093/biomtc/ujae164","DOIUrl":"https://doi.org/10.1093/biomtc/ujae164","url":null,"abstract":"<p><p>As a commonly employed method for analyzing time-to-event data involving functional predictors, the functional Cox model assumes a linear relationship between the functional principal component (FPC) scores of the functional predictors and the hazard rates. However, in practical scenarios, such as our study on the survival time of kidney transplant recipients, this assumption often fails to hold. To address this limitation, we introduce a class of high-dimensional partially linear functional Cox models, which accommodates the non-linear effects of functional predictors on the response and allows for diverging numbers of scalar predictors and FPCs as the sample size increases. We employ the group smoothly clipped absolute deviation method to select relevant scalar predictors and FPCs, and use B-splines to obtain a smoothed estimate of the non-linear effect. The finite sample performance of the estimates is evaluated through simulation studies. 
The model is also applied to a kidney transplant dataset, allowing us to make inferences about the non-linear effects of functional predictors on patients' hazard rates, as well as to identify significant scalar predictors for long-term survival time.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142977394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics. Pub Date: 2025-01-07. DOI: 10.1093/biomtc/ujae165
Ajmery Jaman, Guanbo Wang, Ashkan Ertefaie, Michèle Bally, Renée Lévesque, Robert W Platt, Mireille E Schnitzer
{"title":"Penalized G-estimation for effect modifier selection in a structural nested mean model for repeated outcomes.","authors":"Ajmery Jaman, Guanbo Wang, Ashkan Ertefaie, Michèle Bally, Renée Lévesque, Robert W Platt, Mireille E Schnitzer","doi":"10.1093/biomtc/ujae165","DOIUrl":"https://doi.org/10.1093/biomtc/ujae165","url":null,"abstract":"<p><p>Effect modification occurs when the impact of the treatment on an outcome varies based on the levels of other covariates known as effect modifiers. Modeling these effect differences is important for etiological goals and for purposes of optimizing treatment. Structural nested mean models (SNMMs) are useful causal models for estimating the potentially heterogeneous effect of a time-varying exposure on the mean of an outcome in the presence of time-varying confounding. A data-adaptive selection approach is necessary if the effect modifiers are unknown a priori and need to be identified. Although variable selection techniques are available for estimating the conditional average treatment effects using marginal structural models or for developing optimal dynamic treatment regimens, all of these methods consider a single end-of-follow-up outcome. In the context of an SNMM for repeated outcomes, we propose a doubly robust penalized G-estimator for the causal effect of a time-varying exposure with a simultaneous selection of effect modifiers and prove the oracle property of our estimator. We conduct a simulation study for the evaluation of its performance in finite samples and verification of its double-robustness property. Our work is motivated by the study of hemodiafiltration for treating patients with end-stage renal disease at the Centre Hospitalier de l'Université de Montréal. 
We apply the proposed method to investigate the effect heterogeneity of dialysis facility on the repeated session-specific hemodiafiltration outcomes.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142999234","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics. Pub Date: 2025-01-07. DOI: 10.1093/biomtc/ujae161
Jian Sun, Bo Fu, Li Su
{"title":"Weighted Q-learning for optimal dynamic treatment regimes with nonignorable missing covariates.","authors":"Jian Sun, Bo Fu, Li Su","doi":"10.1093/biomtc/ujae161","DOIUrl":"https://doi.org/10.1093/biomtc/ujae161","url":null,"abstract":"<p><p>Dynamic treatment regimes (DTRs) formalize medical decision-making as a sequence of rules for different stages, mapping patient-level information to recommended treatments. In practice, estimating an optimal DTR using observational data from electronic medical record (EMR) databases can be complicated by nonignorable missing covariates resulting from informative monitoring of patients. Since complete case analysis can provide consistent estimation of outcome model parameters under the assumption of outcome-independent missingness, Q-learning is a natural approach to accommodating nonignorable missing covariates. However, the backward induction algorithm used in Q-learning can introduce challenges, as nonignorable missing covariates at later stages can result in nonignorable missing pseudo-outcomes at earlier stages, leading to suboptimal DTRs, even if the longitudinal outcome variables are fully observed. To address this unique missing data problem in DTR settings, we propose 2 weighted Q-learning approaches where inverse probability weights for missingness of the pseudo-outcomes are obtained through estimating equations with valid nonresponse instrumental variables or sensitivity analysis. The asymptotic properties of the weighted Q-learning estimators are derived, and the finite-sample performance of the proposed methods is evaluated and compared with alternative methods through extensive simulation studies. 
Using EMR data from the Medical Information Mart for Intensive Care database, we apply the proposed methods to investigate the optimal fluid strategy for sepsis patients in intensive care units.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142943773","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics. Pub Date: 2025-01-07. DOI: 10.1093/biomtc/ujae160
Tsung-Hung Yao, Yang Ni, Anindya Bhadra, Jian Kang, Veerabhadran Baladandayuthapani
{"title":"Robust Bayesian graphical regression models for assessing tumor heterogeneity in proteomic networks.","authors":"Tsung-Hung Yao, Yang Ni, Anindya Bhadra, Jian Kang, Veerabhadran Baladandayuthapani","doi":"10.1093/biomtc/ujae160","DOIUrl":"https://doi.org/10.1093/biomtc/ujae160","url":null,"abstract":"<p><p>Graphical models are powerful tools to investigate complex dependency structures in high-throughput datasets. However, most existing graphical models make one of two canonical assumptions: (i) a homogeneous graph with a common network for all subjects or (ii) an assumption of normality, especially in the context of Gaussian graphical models. Both assumptions are restrictive and can fail to hold in certain applications such as proteomic networks in cancer. To this end, we propose an approach termed robust Bayesian graphical regression (rBGR) to estimate heterogeneous graphs for non-normally distributed data. rBGR is a flexible framework that accommodates non-normality through random marginal transformations and constructs covariate-dependent graphs to accommodate heterogeneity through graphical regression techniques. We formulate a new characterization of edge dependencies in such models called conditional sign independence with covariates, along with an efficient posterior sampling algorithm. In simulation studies, we demonstrate that rBGR outperforms existing graphical regression models for data generated under various levels of non-normality in both edge and covariate selection. We use rBGR to assess proteomic networks in lung and ovarian cancers to systematically investigate the effects of immunogenic heterogeneity within tumors. 
Our analyses reveal several important protein-protein interactions that are differentially associated with the immune cell abundance; some corroborate existing biological knowledge, whereas others are novel findings.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142969463","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics. Pub Date: 2025-01-07. DOI: 10.1093/biomtc/ujae156
Wancen Mu, Jiawen Chen, Eric S Davis, Kathleen Reed, Douglas Phanstiel, Michael I Love, Didong Li
{"title":"Gaussian processes for time series with lead-lag effects with applications to biology data.","authors":"Wancen Mu, Jiawen Chen, Eric S Davis, Kathleen Reed, Douglas Phanstiel, Michael I Love, Didong Li","doi":"10.1093/biomtc/ujae156","DOIUrl":"10.1093/biomtc/ujae156","url":null,"abstract":"<p><p>Investigating the relationship, particularly the lead-lag effect, between time series is a common question across various disciplines, especially when uncovering biological processes. However, analyzing time series presents several challenges. Firstly, due to technical reasons, the time points at which observations are made are not at uniform intervals. Secondly, some lead-lag effects are transient, necessitating time-lag estimation based on a limited number of time points. Thirdly, external factors also impact these time series, requiring a similarity metric to assess the lead-lag relationship. To counter these issues, we introduce a model grounded in the Gaussian process, affording the flexibility to estimate lead-lag effects for irregular time series. In addition, our method outputs dissimilarity scores, thereby broadening its applications to include tasks such as ranking or clustering multiple pairwise time series when considering their strength of lead-lag effects with external factors. Crucially, we offer a series of theoretical proofs to substantiate the validity of our proposed kernels and the identifiability of kernel parameters. 
Our model demonstrates advances in various simulations and real-world applications, particularly in the study of dynamic chromatin interactions, compared to other leading methods.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11704948/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142943771","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics. Pub Date: 2025-01-07. DOI: 10.1093/biomtc/ujae168
Daxuan Deng, Lijun Zhang, Hao Feng, Vernon M Chinchilli, Chixiang Chen, Ming Wang
{"title":"Improving estimation efficiency for survival data analysis by integrating a coarsened time-to-event outcome from an external study.","authors":"Daxuan Deng, Lijun Zhang, Hao Feng, Vernon M Chinchilli, Chixiang Chen, Ming Wang","doi":"10.1093/biomtc/ujae168","DOIUrl":"10.1093/biomtc/ujae168","url":null,"abstract":"<p><p>In the era of big data, increasing availability of data makes combining different data sources to obtain more accurate estimations a popular topic. However, the development of data integration is often hindered by the heterogeneity in data forms across studies. In this paper, we focus on a case in survival analysis where we have primary study data with a continuous time-to-event outcome and complete covariate measurements, while the data from an external study contain an outcome observed at regular intervals, and only a subset of covariates is measured. To incorporate external information while accounting for the different data forms, we posit working models and obtain informative weights by empirical likelihood, which will be used to construct a weighted estimator in the main analysis. We have established the theory demonstrating that the new estimator has higher estimation efficiency compared to the conventional ones, and this advantage is robust to working model misspecification, as confirmed in our simulation studies. 
To assess its utility, we apply our method to accommodate data from the National Alzheimer's Coordinating Center to improve the analysis of the Alzheimer's Disease Neuroimaging Initiative Phase 1 study.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11747882/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142999230","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics. Pub Date: 2025-01-07. DOI: 10.1093/biomtc/ujae166
Jennifer F Bobb, Stephen J Mooney, Maricela Cruz, Anne Vernez Moudon, Adam Drewnowski, David Arterburn, Andrea J Cook
{"title":"Distributed lag models for retrospective cohort data with application to a study of built environment and body weight.","authors":"Jennifer F Bobb, Stephen J Mooney, Maricela Cruz, Anne Vernez Moudon, Adam Drewnowski, David Arterburn, Andrea J Cook","doi":"10.1093/biomtc/ujae166","DOIUrl":"10.1093/biomtc/ujae166","url":null,"abstract":"<p><p>Distributed lag models (DLMs) estimate the health effects of exposure over multiple time lags prior to the outcome and are widely used in time series studies. Applying DLMs to retrospective cohort studies is challenging due to inconsistent lengths of exposure history across participants, which is common when using electronic health record databases. A standard approach is to define subcohorts of individuals with some minimum exposure history, but this limits power and may amplify selection bias. We propose alternative full-cohort methods that use all available data while simultaneously enabling examination of the longest time lag estimable in the cohort. Through simulation studies, we find that restricting to a subcohort can lead to biased estimates of exposure effects due to confounding by correlated exposures at more distant lags. By contrast, full-cohort methods that incorporate multiple imputation of complete exposure histories can avoid this bias to efficiently estimate lagged and cumulative effects. Applying full-cohort DLMs to a study examining the association between residential density (a proxy for walkability) over 12 years and body weight, we find evidence of an immediate effect in the prior 1-2 years. We also observed an association at the maximal lag considered (12 years prior), which we posit reflects an earlier (≥12 years) or incrementally increasing prior effect over time. 
DLMs can be efficiently incorporated within retrospective cohort studies to identify critical windows of exposure.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11760659/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143031922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics. Pub Date: 2025-01-07. DOI: 10.1093/biomtc/ujae159
Emily C Hector, Brian J Reich, Ani Eloyan
{"title":"Distributed model building and recursive integration for big spatial data modeling.","authors":"Emily C Hector, Brian J Reich, Ani Eloyan","doi":"10.1093/biomtc/ujae159","DOIUrl":"https://doi.org/10.1093/biomtc/ujae159","url":null,"abstract":"<p><p>Motivated by the need for computationally tractable spatial methods in neuroimaging studies, we develop a distributed and integrated framework for estimation and inference of Gaussian process model parameters with ultra-high-dimensional likelihoods. We propose a shift in viewpoint from whole to local data perspectives that is rooted in distributed model building and integrated estimation and inference. The framework's backbone is a computationally and statistically efficient integration procedure that simultaneously incorporates dependence within and between spatial resolutions in a recursively partitioned spatial domain. Statistical and computational properties of our distributed approach are investigated theoretically and in simulations. The proposed approach is used to extract new insights into autism spectrum disorder from the autism brain imaging data exchange.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142969462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}