Biometrics | Pub Date: 2024-10-03 | DOI: 10.1093/biomtc/ujae116
Andrew S Whiteman, Timothy D Johnson, Jian Kang
{"title":"Bayesian inference for group-level cortical surface image-on-scalar regression with Gaussian process priors.","authors":"Andrew S Whiteman, Timothy D Johnson, Jian Kang","doi":"10.1093/biomtc/ujae116","DOIUrl":"10.1093/biomtc/ujae116","url":null,"abstract":"<p><p>In regression-based analyses of group-level neuroimage data, researchers typically fit a series of marginal general linear models to image outcomes at each spatially referenced pixel. Spatial regularization of effects of interest is usually induced indirectly by applying spatial smoothing to the data during preprocessing. While this procedure often works well, the resulting inference can be poorly calibrated. Spatial modeling of effects of interest leads to more powerful analyses; however, the number of locations in a typical neuroimage can preclude standard computing methods in this setting. Here, we contribute a Bayesian spatial regression model for group-level neuroimaging analyses. We induce regularization of spatially varying regression coefficient functions through Gaussian process priors. When combined with a simple non-stationary model for the error process, our prior hierarchy can lead to more data-adaptive smoothing than standard methods. We achieve computational tractability through a Vecchia-type approximation of our prior that retains full spatial rank and can be constructed for a wide class of spatial correlation functions. We outline several ways to work with our model in practice and compare performance against standard vertex-wise analyses and several alternatives. 
Finally, we illustrate our methods in an analysis of cortical surface functional magnetic resonance imaging task contrast data from a large cohort of children enrolled in the adolescent brain cognitive development study.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"80 4","pages":""},"PeriodicalIF":1.4,"publicationDate":"2024-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11518852/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142520911","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics | Pub Date: 2024-10-03 | DOI: 10.1093/biomtc/ujae112
Zhaotong Lin, Isaac Pan, Wei Pan
{"title":"On network deconvolution for undirected graphs.","authors":"Zhaotong Lin, Isaac Pan, Wei Pan","doi":"10.1093/biomtc/ujae112","DOIUrl":"10.1093/biomtc/ujae112","url":null,"abstract":"<p><p>Network deconvolution (ND) is a method to reconstruct a direct-effect network describing direct (or conditional) effects (or associations) between any two nodes from a given network depicting total (or marginal) effects (or associations). Its key idea is that, in a directed graph, a total effect can be decomposed into the sum of a direct and an indirect effect, with the latter further decomposed as the sum of various products of direct effects. This yields a simple closed-form solution for the direct-effect network, facilitating its use in distinguishing direct and indirect effects. Although ND has been applied to undirected graphs, it is not well understood why the method works there, which has left it open to skepticism. We first clarify the implicit linear model assumption underlying ND, then derive a surprisingly simple result on the equivalence between ND and the use of precision matrices, offering insightful justification and interpretation for the application of ND to undirected graphs. We also establish a formal result to characterize the effect of scaling a total-effect graph. Finally, leveraging large-scale genome-wide association study data, we show a novel application of ND to contrast marginal versus conditional genetic correlations between body height and risk of coronary artery disease; the results align with an inferred causal directed graph using ND. 
We conclude that ND is a promising approach with its easy and wide applicability to both directed and undirected graphs.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"80 4","pages":""},"PeriodicalIF":1.4,"publicationDate":"2024-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11459367/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142387636","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
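The closed-form solution mentioned in the abstract above can be illustrated with a minimal sketch. It assumes only the series decomposition the abstract describes (total effect = direct effect plus products of direct effects, i.e. G_total = G_dir + G_dir^2 + ...), which, when the series converges, gives G_dir = G_total (I + G_total)^{-1}. The function names and the toy matrix are hypothetical, and any scaling or eigen-decomposition steps used in the paper are not reproduced here.

```python
import numpy as np


def total_from_direct(g_dir: np.ndarray) -> np.ndarray:
    """Sum the series G_dir + G_dir^2 + ... = G_dir (I - G_dir)^{-1}.

    Assumes the spectral radius of g_dir is below 1 so the series converges.
    """
    identity = np.eye(g_dir.shape[0])
    return g_dir @ np.linalg.inv(identity - g_dir)


def network_deconvolution(g_total: np.ndarray) -> np.ndarray:
    """Invert the series: G_dir = G_total (I + G_total)^{-1}."""
    identity = np.eye(g_total.shape[0])
    # Solve X (I + G_total) = G_total for X by transposing the system.
    return np.linalg.solve((identity + g_total).T, g_total.T).T


# Toy undirected chain 0 - 1 - 2: nodes 0 and 2 have no direct edge.
g_dir = np.array([
    [0.0, 0.3, 0.0],
    [0.3, 0.0, 0.2],
    [0.0, 0.2, 0.0],
])
g_total = total_from_direct(g_dir)
g_recovered = network_deconvolution(g_total)

print(np.round(g_total, 4))      # entry (0, 2) is nonzero: an indirect effect
print(np.round(g_recovered, 4))  # entry (0, 2) returns to zero
```

In this toy chain the total-effect matrix links nodes 0 and 2 only through node 1, and deconvolution removes that indirect link while leaving the two direct edges unchanged.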
Biometrics | Pub Date: 2024-10-03 | DOI: 10.1093/biomtc/ujae105
Shuqi Wang, Peter F Thall, Kentaro Takeda, Ying Yuan
{"title":"ROMI: a randomized two-stage basket trial design to optimize doses for multiple indications.","authors":"Shuqi Wang, Peter F Thall, Kentaro Takeda, Ying Yuan","doi":"10.1093/biomtc/ujae105","DOIUrl":"10.1093/biomtc/ujae105","url":null,"abstract":"<p><p>Optimizing doses for multiple indications is challenging. The pooled approach of finding a single optimal biological dose (OBD) for all indications ignores that dose-response or dose-toxicity curves may differ between indications, resulting in varying OBDs. Conversely, indication-specific dose optimization often requires a large sample size. To address this challenge, we propose a Randomized two-stage basket trial design that Optimizes doses in Multiple Indications (ROMI). In stage 1, for each indication, response and toxicity are evaluated for a high dose, which may be a previously obtained maximum tolerated dose, with a rule that stops accrual to indications where the high dose is unsafe or ineffective. Indications not terminated proceed to stage 2, where patients are randomized between the high dose and a specified lower dose. A latent-cluster Bayesian hierarchical model is employed to borrow information between indications, while considering the potential heterogeneity of OBD across indications. Indication-specific utilities are used to quantify response-toxicity trade-offs. At the end of stage 2, for each indication with at least one acceptable dose, the dose with highest posterior mean utility is selected as optimal. Two versions of ROMI are presented, one using only stage 2 data for dose optimization and the other optimizing doses using data from both stages. 
Simulations show that both versions have desirable operating characteristics compared to designs that either ignore indications or optimize dose independently for each indication.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"80 4","pages":""},"PeriodicalIF":1.4,"publicationDate":"2024-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11447723/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142364261","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unlocking the power of multi-institutional data: Integrating and harmonizing genomic data across institutions.","authors":"Yuan Chen, Ronglai Shen, Xiwen Feng, Katherine Panageas","doi":"10.1093/biomtc/ujae146","DOIUrl":"10.1093/biomtc/ujae146","url":null,"abstract":"<p><p>Cancer is a complex disease driven by genomic alterations, and tumor sequencing is becoming a mainstay of clinical care for cancer patients. The emergence of multi-institution sequencing data presents a powerful resource for learning real-world evidence to enhance precision oncology. GENIE BPC, led by the American Association for Cancer Research, establishes a unique database linking genomic data with clinical information for patients treated at multiple cancer centers. However, leveraging sequencing data from multiple institutions presents significant challenges. Variability in gene panels can lead to loss of information when analyses focus on genes common across panels. Additionally, differences in sequencing techniques and patient heterogeneity across institutions add complexity. High data dimensionality, sparse gene mutation patterns, and weak signals at the individual gene level further complicate matters. Motivated by these real-world challenges, we introduce the Bridge model. It uses a quantile-matched latent variable approach to derive integrated features to preserve information beyond common genes and maximize the utilization of all available data, while leveraging information sharing to enhance both learning efficiency and the model's capacity to generalize. By extracting harmonized and noise-reduced lower-dimensional latent variables, the true mutation pattern unique to each individual is captured. We assess the model's performance and parameter estimation through extensive simulation studies. 
The extracted latent features from the Bridge model consistently excel in predicting patient survival across six cancer types in GENIE BPC data.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"80 4","pages":""},"PeriodicalIF":1.4,"publicationDate":"2024-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11647914/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142827091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics | Pub Date: 2024-10-03 | DOI: 10.1093/biomtc/ujae139
Anastasios A Tsiatis, Marie Davidian
{"title":"A generalized logrank-type test for comparison of treatment regimes in sequential multiple assignment randomized trials.","authors":"Anastasios A Tsiatis, Marie Davidian","doi":"10.1093/biomtc/ujae139","DOIUrl":"10.1093/biomtc/ujae139","url":null,"abstract":"<p><p>The sequential multiple assignment randomized trial (SMART) is the ideal study design for the evaluation of multistage treatment regimes, which comprise sequential decision rules that recommend treatments for a patient at each of a series of decision points based on their evolving characteristics. A common goal is to compare the set of so-called embedded regimes represented in the design on the basis of a primary outcome of interest. In the study of chronic diseases and disorders, this outcome is often a time to an event, and a goal is to compare the distributions of the time-to-event outcome associated with each regime in the set. We present a general statistical framework in which we develop a logrank-type test for comparison of the survival distributions associated with regimes within a specified set based on the data from a SMART with an arbitrary number of stages that allows incorporation of covariate information to enhance efficiency and can also be used with data from an observational study. The framework provides clarification of the assumptions required to yield a principled test procedure, and the proposed test subsumes or offers an improved alternative to existing methods. We demonstrate performance of the methods in a suite of simulation studies. 
The methods are applied to a SMART in patients with acute promyelocytic leukemia.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"80 4","pages":""},"PeriodicalIF":1.4,"publicationDate":"2024-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11636965/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142817045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics | Pub Date: 2024-10-03 | DOI: 10.1093/biomtc/ujae147
Wenlong Yang, Danping Liu, Le Bao, Runze Li
{"title":"A likelihood approach to incorporating self-report data in HIV recency classification.","authors":"Wenlong Yang, Danping Liu, Le Bao, Runze Li","doi":"10.1093/biomtc/ujae147","DOIUrl":"10.1093/biomtc/ujae147","url":null,"abstract":"<p><p>Estimating new HIV infections is significant yet challenging due to the difficulty in distinguishing between recent and long-term infections. We demonstrate that HIV recency status (recent versus long-term) could be determined from self-report testing history and biomarkers, which are increasingly available in bio-behavioral surveys. HIV recency status is partially observed, given the self-report testing history. For example, people who tested positive for HIV over 1 year ago should have a long-term infection. Based on the nationally representative samples collected by the Population-based HIV Impact Assessment (PHIA) Project, we propose a likelihood-based probabilistic model for HIV recency classification. The model incorporates individuals with known recency status based on testing histories and individuals whose recency status could not be determined and integrates the mechanism of how HIV recency status depends on biomarkers and the mechanism of how HIV recency status, together with the self-report time of the most recent HIV test, impacts the test results. We compare our method to logistic regression and the binary classification tree (current practice) on Malawi PHIA data, as well as on simulated data. 
Our model obtains more efficient and less biased parameter estimates and is relatively robust to potential reporting error and model misspecification.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"80 4","pages":""},"PeriodicalIF":1.4,"publicationDate":"2024-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11647912/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142827257","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics | Pub Date: 2024-10-03 | DOI: 10.1093/biomtc/ujae137
Ioannis Oikonomidis, Samis Trevezas
{"title":"Cumulative link mixed-effects models in the service of remote sensing crop progress monitoring.","authors":"Ioannis Oikonomidis, Samis Trevezas","doi":"10.1093/biomtc/ujae137","DOIUrl":"https://doi.org/10.1093/biomtc/ujae137","url":null,"abstract":"<p><p>This study introduces an innovative cumulative link modeling (CLM) approach to monitor crop progress over large areas using remote sensing data. Two distinct models are developed, a fixed-effects CLM and a mixed-effects one that incorporates annual random effects to capture the inherent inter-seasonal variability. Inference is based on partial-likelihood with two law variations, the standard CLM based on the multinomial distribution and a novel one based on the product binomial distribution. Model performance is evaluated on eight crops, namely corn, oats, sorghum, soybeans, winter wheat, alfalfa, dry beans, and millet, using in-situ data from Nebraska, USA, spanning 20 years. The models utilize the predictive attributes of calendar time, thermal time, and the normalized difference vegetation index. The results demonstrate the wide applicability of this approach to different crops, providing large-scale predictions of crop progress and allowing the estimation of important agronomic parameters. To facilitate reproducibility, an ecosystem of R packages has been developed and made publicly accessible under the name Ages of Man. 
The packages can be utilized to implement the presented methodology in any area with this type of data, including the USA.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"80 4","pages":""},"PeriodicalIF":1.4,"publicationDate":"2024-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142827274","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics | Pub Date: 2024-10-03 | DOI: 10.1093/biomtc/ujae150
R Dey, D E Schaubel, J A Hanley, P Saha-Chaudhuri
{"title":"Time-dependent prognostic accuracy measures for recurrent event data.","authors":"R Dey, D E Schaubel, J A Hanley, P Saha-Chaudhuri","doi":"10.1093/biomtc/ujae150","DOIUrl":"10.1093/biomtc/ujae150","url":null,"abstract":"<p><p>In many clinical contexts, the event of interest could occur multiple times for the same patient. Considerable progress has been made in developing recurrent event models that are based on or use biomarker information. However, less attention has been given to evaluating the prognostic accuracy of a biomarker or a composite score obtained from a fitted recurrent event-rate model. In this manuscript, we propose novel measures to characterize the prognostic accuracy of a marker measured at baseline in the presence of recurrent events. The proposed estimators are based on a semiparametric frailty model that accounts for the informativeness of a marker and unobserved heterogeneity among patients with respect to the rate of event occurrence. We investigate the asymptotic properties of the proposed accuracy estimators and demonstrate these estimators' finite sample performance through simulation studies. The proposed estimators have minimal bias and appropriate coverage. 
The estimators are applied to evaluate the performance of a baseline forced expiratory volume, a measure of lung capacity, for repeated episodes of pulmonary exacerbations in patients with cystic fibrosis.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"80 4","pages":""},"PeriodicalIF":1.4,"publicationDate":"2024-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11669850/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142891801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics | Pub Date: 2024-10-03 | DOI: 10.1093/biomtc/ujae158
Yeheng Ge, Tao Li, Xingdong Feng, Mengyun Wu, Hailong Liu
{"title":"Structured feature ranking for genomic marker identification accommodating multiple types of networks.","authors":"Yeheng Ge, Tao Li, Xingdong Feng, Mengyun Wu, Hailong Liu","doi":"10.1093/biomtc/ujae158","DOIUrl":"https://doi.org/10.1093/biomtc/ujae158","url":null,"abstract":"<p><p>Numerous statistical methods have been developed to search for genomic markers associated with the development, progression, and response to treatment of complex diseases. Among them, feature ranking plays a vital role due to its intuitive formulation and computational efficiency. However, most of the existing methods are based on the marginal importance of molecular predictors and share the limitation that the dependence (network) structures among predictors are not well accommodated, even though a disease phenotype usually reflects various biological processes that interact in a complex network. In this paper, we propose a structured feature ranking method for identifying genomic markers, where such network structures are effectively accommodated using Laplacian regularization. The proposed method investigates multiple network scenarios, where the networks can be either known a priori or estimated from the data. In addition, we rigorously explore the noise and uncertainty in the networks and control their impacts with proper selection of tuning parameters. These characteristics give the proposed method especially broad applicability. Theoretical results for our proposal are rigorously established. Compared to the original marginal measure, the proposed network structured measure can achieve sure screening properties with a faster convergence rate under mild conditions. 
Extensive simulations and analysis of The Cancer Genome Atlas melanoma data demonstrate the improvement of finite sample performance and practical usefulness of the proposed method.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"80 4","pages":""},"PeriodicalIF":1.4,"publicationDate":"2024-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142920686","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics | Pub Date: 2024-10-03 | DOI: 10.1093/biomtc/ujae120
Ziqi Chen, Yu Shen, Jing Qin, Jing Ning
{"title":"Likelihood adaptively incorporated external aggregate information with uncertainty for survival data.","authors":"Ziqi Chen, Yu Shen, Jing Qin, Jing Ning","doi":"10.1093/biomtc/ujae120","DOIUrl":"10.1093/biomtc/ujae120","url":null,"abstract":"<p><p>Population-based cancer registry databases are critical resources to bridge the information gap that results from a lack of sufficient statistical power from primary cohort data with small to moderate sample size. Although comprehensive data associated with tumor biomarkers often remain either unavailable or inconsistently measured in these registry databases, aggregate survival information sourced from these repositories has been well documented and publicly accessible. An appealing option is to integrate the aggregate survival information from the registry data with the primary cohort to enhance the evaluation of treatment impacts or prediction of survival outcomes across distinct tumor subtypes. Nevertheless, for rare types of cancer, even the sample sizes of cancer registries remain modest. The variability linked to the aggregated statistics could be non-negligible compared with the sample variation of the primary cohort. In response, we propose an externally informed likelihood approach, which facilitates the linkage between the primary cohort and external aggregate data, with consideration of the variation from aggregate information. We establish the asymptotic properties of the estimators and evaluate the finite sample performance via simulation studies. 
Through the application of our proposed method, we integrate data from the cohort of inflammatory breast cancer (IBC) patients at the University of Texas MD Anderson Cancer Center with aggregate survival data from the National Cancer Data Base, enabling us to appraise the effect of tri-modality treatment on survival across various tumor subtypes of IBC.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"80 4","pages":""},"PeriodicalIF":1.4,"publicationDate":"2024-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11518850/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142520913","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}