Biostatistics最新文献

Incorporating prior information in gene expression network-based cancer heterogeneity analysis. 在基于基因表达网络的癌症异质性分析中纳入先验信息。

IF 1.8 3区数学

Biostatistics Pub Date : 2024-12-31 DOI: 10.1093/biostatistics/kxae028

Rong Li, Shaodong Xu, Yang Li, Zuojian Tang, Di Feng, James Cai, Shuangge Ma

{"title":"Incorporating prior information in gene expression network-based cancer heterogeneity analysis.","authors":"Rong Li, Shaodong Xu, Yang Li, Zuojian Tang, Di Feng, James Cai, Shuangge Ma","doi":"10.1093/biostatistics/kxae028","DOIUrl":"10.1093/biostatistics/kxae028","url":null,"abstract":"Cancer is molecularly heterogeneous, with seemingly similar patients having different molecular landscapes and accordingly different clinical behaviors. In recent studies, gene expression networks have been shown as more effective/informative for cancer heterogeneity analysis than some simpler measures. Gene interconnections can be classified as \"direct\" and \"indirect,\" where the latter can be caused by shared genomic regulators (such as transcription factors, microRNAs, and other regulatory molecules) and other mechanisms. It has been suggested that incorporating the regulators of gene expressions in network analysis and focusing on the direct interconnections can lead to a deeper understanding of the more essential gene interconnections. Such analysis can be seriously challenged by the large number of parameters (jointly caused by network analysis, incorporation of regulators, and heterogeneity) and often weak signals. To effectively tackle this problem, we propose incorporating prior information contained in the published literature. A key challenge is that such prior information can be partial or even wrong. We develop a two-step procedure that can flexibly accommodate different levels of prior information quality. Simulation demonstrates the effectiveness of the proposed approach and its superiority over relevant competitors. In the analysis of a breast cancer dataset, findings different from the alternatives are made, and the identified sample subgroups have important clinical differences.","PeriodicalId":55357,"journal":{"name":"Biostatistics","volume":" ","pages":""},"PeriodicalIF":1.8,"publicationDate":"2024-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141794124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Recoverability of causal effects under presence of missing data: a longitudinal case study. 数据缺失情况下因果效应的可恢复性：纵向案例研究。

IF 1.8 3区数学

Biostatistics Pub Date : 2024-12-31 DOI: 10.1093/biostatistics/kxae044

Anastasiia Holovchak, Helen McIlleron, Paolo Denti, Michael Schomaker

{"title":"Recoverability of causal effects under presence of missing data: a longitudinal case study.","authors":"Anastasiia Holovchak, Helen McIlleron, Paolo Denti, Michael Schomaker","doi":"10.1093/biostatistics/kxae044","DOIUrl":"10.1093/biostatistics/kxae044","url":null,"abstract":"Missing data in multiple variables is a common issue. We investigate the applicability of the framework of graphical models for handling missing data to a complex longitudinal pharmacological study of children with HIV treated with an efavirenz-based regimen as part of the CHAPAS-3 trial. Specifically, we examine whether the causal effects of interest, defined through static interventions on multiple continuous variables, can be recovered (estimated consistently) from the available data only. So far, no general algorithms are available to decide on recoverability, and decisions have to be made on a case-by-case basis. We emphasize the sensitivity of recoverability to even the smallest changes in the graph structure, and present recoverability results for three plausible missingness-directed acyclic graphs (m-DAGs) in the CHAPAS-3 study, informed by clinical knowledge. Furthermore, we propose the concept of a \"closed missingness mechanism\": if missing data are generated based on this mechanism, an available case analysis is admissible for consistent estimation of any statistical or causal estimand, even if data are missing not at random. Both simulations and theoretical considerations demonstrate how, in the assumed MNAR setting of our study, a complete or available case analysis can be superior to multiple imputation, and estimation results vary depending on the assumed missingness DAG. Our analyses demonstrate an innovative application of missingness DAGs to complex longitudinal real-world data, while highlighting the sensitivity of the results with respect to the assumed causal model.","PeriodicalId":55357,"journal":{"name":"Biostatistics","volume":" ","pages":""},"PeriodicalIF":1.8,"publicationDate":"2024-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7617375/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142649856","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Random forest for dynamic risk prediction of recurrent events: a pseudo-observation approach. 随机森林用于周期性事件的动态风险预测：一种伪观测方法。

IF 1.8 3区数学

Biostatistics Pub Date : 2024-12-31 DOI: 10.1093/biostatistics/kxaf007

Abigail Loe, Susan Murray, Zhenke Wu

{"title":"Random forest for dynamic risk prediction of recurrent events: a pseudo-observation approach.","authors":"Abigail Loe, Susan Murray, Zhenke Wu","doi":"10.1093/biostatistics/kxaf007","DOIUrl":"10.1093/biostatistics/kxaf007","url":null,"abstract":"Recurrent events are common in clinical, healthcare, social, and behavioral studies, yet methods for dynamic risk prediction of these events are limited. To overcome some long-standing challenges in analyzing censored recurrent event data, a recent regression analysis framework constructs a censored longitudinal dataset consisting of times to the first recurrent event in multiple pre-specified follow-up windows of length $ tau $(XMT models). Traditional regression models struggle with nonlinear and multiway interactions, with success depending on the skill of the statistical programmer. With a staggering number of potential predictors being generated from genetic, -omic, and electronic health records sources, machine learning approaches such as the random forest regression are growing in popularity, as they can nonparametrically incorporate information from many predictors with nonlinear and multiway interactions involved in prediction. In this article, we (i) develop a random forest approach for dynamically predicting probabilities of remaining event-free during a subsequent $ tau $-duration follow-up period from a reconstructed censored longitudinal data set, (ii) modify the XMT regression approach to predict these same probabilities, subject to the limitations that traditional regression models typically have, and (iii) demonstrate how to incorporate patient-specific history of recurrent events for prediction in settings where this information may be partially missing. We show the increased ability of our random forest algorithm for predicting the probability of remaining event-free over a $ tau $-duration follow-up window when compared to our modified XMT method for prediction in settings where association between predictors and recurrent event outcomes is complex in nature. We also show the importance of incorporating past recurrent event history in prediction algorithms when event times are correlated within a subject. The proposed random forest algorithm is demonstrated using recurrent exacerbation data from the trial of Azithromycin for the Prevention of Exacerbations of Chronic Obstructive Pulmonary Disease.","PeriodicalId":55357,"journal":{"name":"Biostatistics","volume":"26 1","pages":""},"PeriodicalIF":1.8,"publicationDate":"2024-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143626883","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

A marginal structural model for normal tissue complication probability. 正常组织并发症概率的边际结构模型。

IF 1.8 3区数学

Biostatistics Pub Date : 2024-12-31 DOI: 10.1093/biostatistics/kxae019

Thai-Son Tang, Zhihui Liu, Ali Hosni, John Kim, Olli Saarela

引用次数: 0

Shared parameter modeling of longitudinal data allowing for possibly informative visiting process and terminal event. 纵向数据的共享参数建模，允许可能有信息的访问过程和终端事件。

IF 1.8 3区数学

Biostatistics Pub Date : 2024-12-31 DOI: 10.1093/biostatistics/kxae041

Christos Thomadakis, Loukia Meligkotsidou, Nikos Pantazis, Giota Touloumi

{"title":"Shared parameter modeling of longitudinal data allowing for possibly informative visiting process and terminal event.","authors":"Christos Thomadakis, Loukia Meligkotsidou, Nikos Pantazis, Giota Touloumi","doi":"10.1093/biostatistics/kxae041","DOIUrl":"10.1093/biostatistics/kxae041","url":null,"abstract":"Joint modeling of longitudinal and time-to-event data, particularly through shared parameter models (SPMs), is a common approach for handling longitudinal marker data with an informative terminal event. A critical but often neglected assumption in this context is that the visiting/observation process is noninformative, depending solely on past marker values and visit times. When this assumption fails, the visiting process becomes informative, resulting potentially to biased SPM estimates. Existing methods generally rely on a conditional independence assumption, positing that the marker model, visiting process, and time-to-event model are independent given shared or correlated random effects. Moreover, they are typically built on an intensity-based visiting process using calendar time. This study introduces a unified approach for jointly modeling a normally distributed marker, the visiting process, and time-to-event data in the form of competing risks. Our model conditions on the history of observed marker values, prior visit times, the marker's random effects, and possibly a frailty term independent of the random effects. While our approach aligns with the shared-parameter framework, it does not presume conditional independence between the processes. Additionally, the visiting process can be defined on either a gap time scale, via proportional hazard models, or a calendar time scale, via proportional intensity models. Through extensive simulation studies, we assess the performance of our proposed methodology. We demonstrate that disregarding an informative visiting process can yield significantly biased marker estimates. However, misspecification of the visiting process can also lead to biased estimates. The gap time formulation exhibits greater robustness compared to the intensity-based model when the visiting process is misspecified. In general, enriching the visiting process with prior visit history enhances performance. We further apply our methodology to real longitudinal data from HIV, where visit frequency varies substantially among individuals.","PeriodicalId":55357,"journal":{"name":"Biostatistics","volume":" ","pages":""},"PeriodicalIF":1.8,"publicationDate":"2024-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11911807/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142513409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Unveiling Schizophrenia: a study with generalized functional linear mixed model via the investigation of functional random effects. 揭示精神分裂症：基于功能随机效应的广义泛函线性混合模型研究。

IF 1.8 3区数学

Biostatistics Pub Date : 2024-12-31 DOI: 10.1093/biostatistics/kxae049

Rongxiang Rui, Wei Xiong, Jianxin Pan, Maozai Tian

{"title":"Unveiling Schizophrenia: a study with generalized functional linear mixed model via the investigation of functional random effects.","authors":"Rongxiang Rui, Wei Xiong, Jianxin Pan, Maozai Tian","doi":"10.1093/biostatistics/kxae049","DOIUrl":"https://doi.org/10.1093/biostatistics/kxae049","url":null,"abstract":"Previous studies have identified attenuated pre-speech activity and speech sound suppression in individuals with Schizophrenia, with similar patterns observed in basic tasks entailing button-pressing to perceive a tone. However, it remains unclear whether these patterns are uniform across individuals or vary from person to person. Motivated by electroencephalographic (EEG) data from a Schizophrenia study, we develop a generalized functional linear mixed model (GFLMM) for repeated measurements by incorporating subject-specific functional random effects associated with multiple functional predictors. To assess the significance of these functional effects, we employ two different multivariate functional principal component analysis methods, which transform the GFLMM into a conventional generalized linear mixed model, thereby facilitating its implementation with standard software. Furthermore, we introduce a cutting-edge testing approach utilizing working responses to detect both subject-specific and predictor-specific functional random effects. Monte Carlo simulation studies demonstrate the effectiveness of our proposed testing method. Application of the proposed methods to the Schizophrenia data reveals significant subject-specific effects of human brain activity in the frontal zone (Fz) and the central zone (Cz), providing valuable insights into the potential variations among individuals, from healthy controls to those diagnosed with Schizophrenia.","PeriodicalId":55357,"journal":{"name":"Biostatistics","volume":"26 1","pages":""},"PeriodicalIF":1.8,"publicationDate":"2024-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142933522","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Sensitivity analysis for the probability of benefit in randomized controlled trials with a binary treatment and a binary outcome. 采用二元治疗和二元结局的随机对照试验中获益概率的敏感性分析。

IF 1.8 3区数学

Biostatistics Pub Date : 2024-12-31 DOI: 10.1093/biostatistics/kxaf011

Iuliana Ciocănea-Teodorescu, Erin E Gabriel, Arvid Sjölander

{"title":"Sensitivity analysis for the probability of benefit in randomized controlled trials with a binary treatment and a binary outcome.","authors":"Iuliana Ciocănea-Teodorescu, Erin E Gabriel, Arvid Sjölander","doi":"10.1093/biostatistics/kxaf011","DOIUrl":"10.1093/biostatistics/kxaf011","url":null,"abstract":"For a comprehensive understanding of the effect of a given treatment on an outcome of interest, quantification of individual treatment heterogeneity is essential, alongside estimation of the average causal effect. However, even in randomized controlled trials, quantities such as the probability of benefit or the probability of harm are not identifiable, since multiple potential outcomes cannot be observed simultaneously for the same individual. We propose a sensitivity analysis for the probability of benefit in randomized controlled trial settings with a binary treatment and a binary outcome, by quantifying the deviation from conditional independence of the two potential outcomes, given a set of measured prognostic baseline covariates. We do this using a marginal sensitivity analysis parameter that does not depend on the number or complexity of the measured covariates. We provide a guide to estimation and interpretation, and illustrate our method in simulations, as well as using a real data example from a randomized controlled trial studying the effect of umbilical vein oxytocin administration on the need for manual removal of the placenta during birth.","PeriodicalId":55357,"journal":{"name":"Biostatistics","volume":"26 1","pages":""},"PeriodicalIF":1.8,"publicationDate":"2024-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12129078/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144210358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Bipartite interference and air pollution transport: estimating health effects of power plant interventions. 三方干扰与空气污染运输：电厂干预对健康影响的估计。

IF 1.8 3区数学

Biostatistics Pub Date : 2024-12-31 DOI: 10.1093/biostatistics/kxae051

Corwin Zigler, Vera Liu, Fabrizia Mealli, Laura Forastiere

{"title":"Bipartite interference and air pollution transport: estimating health effects of power plant interventions.","authors":"Corwin Zigler, Vera Liu, Fabrizia Mealli, Laura Forastiere","doi":"10.1093/biostatistics/kxae051","DOIUrl":"10.1093/biostatistics/kxae051","url":null,"abstract":"Evaluating air quality interventions is confronted with the challenge of interference since interventions at a particular pollution source likely impact air quality and health at distant locations, and air quality and health at any given location are likely impacted by interventions at many sources. The structure of interference in this context is dictated by complex atmospheric processes governing how pollution emitted from a particular source is transformed and transported across space and can be cast with a bipartite structure reflecting the two distinct types of units: (i) interventional units on which treatments are applied or withheld to change pollution emissions; and (ii) outcome units on which outcomes of primary interest are measured. We propose new estimands for bipartite causal inference with interference that construe two components of treatment: a \"key-associated\" (or \"individual\") treatment and an \"upwind\" (or \"neighborhood\") treatment. Estimation is carried out using a covariate adjustment approach based on a joint propensity score. A reduced-complexity atmospheric model characterizes the structure of the interference network by modeling the movement of air parcels through time and space. The new methods are deployed to evaluate the effectiveness of installing flue-gas desulfurization scrubbers on 472 coal-burning power plants (the interventional units) in reducing Medicare hospitalizations among 21,577,552 Medicare beneficiaries residing across 25,553 ZIP codes in the United States (the outcome units).","PeriodicalId":55357,"journal":{"name":"Biostatistics","volume":"26 1","pages":""},"PeriodicalIF":1.8,"publicationDate":"2024-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11823286/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143048850","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Estimation and inference for causal spillover effects in egocentric-network randomized trials in the presence of network membership misclassification. 存在网络成员错误分类的自我中心网络随机试验中因果溢出效应的估计与推断。

IF 1.8 3区数学

Biostatistics Pub Date : 2024-12-31 DOI: 10.1093/biostatistics/kxaf009

Ariel Chao, Donna Spiegelman, Ashley Buchanan, Laura Forastiere

{"title":"Estimation and inference for causal spillover effects in egocentric-network randomized trials in the presence of network membership misclassification.","authors":"Ariel Chao, Donna Spiegelman, Ashley Buchanan, Laura Forastiere","doi":"10.1093/biostatistics/kxaf009","DOIUrl":"10.1093/biostatistics/kxaf009","url":null,"abstract":"To leverage peer influence and increase population behavioral changes, behavioral interventions often rely on peer-based strategies. A common study design that assesses such strategies is the egocentric-network randomized trial (ENRT), where index participants receive a behavioral training and are encouraged to disseminate information to their peers. Under this design, a crucial estimand of interest is the Average Spillover Effect (ASpE), which measures the impact of the intervention on participants who do not receive it, but whose outcomes may be affected by others who do. The assessment of the ASpE relies on assumptions about, and correct measurement of, interference sets within which individuals may influence one another's outcomes. It can be challenging to properly specify interference sets, such as networks in ENRTs, and when mismeasured, intervention effects estimated by existing methods will be biased. In studies where social networks play an important role in disease transmission or behavior change, correcting ASpE estimates for bias due to network misclassification is critical for accurately evaluating the full impact of interventions. We combined measurement error and causal inference methods to bias-correct the ASpE estimate for network misclassification in ENRTs, when surrogate networks are recorded in place of true ones, and validation data that relate the misclassified to the true networks are available. We investigated finite sample properties of our methods in an extensive simulation study and illustrated our methods in the HIV Prevention Trials Network (HPTN) 037 study.","PeriodicalId":55357,"journal":{"name":"Biostatistics","volume":"26 1","pages":""},"PeriodicalIF":1.8,"publicationDate":"2024-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11955068/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143755648","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Understanding the opioid syndemic in North Carolina: A novel approach to modeling and identifying factors. 了解北卡罗莱纳州的阿片类药物综合征：一种建模和识别因素的新方法。

IF 1.8 3区数学

Biostatistics Pub Date : 2024-12-31 DOI: 10.1093/biostatistics/kxae052

Eva Murphy, David Kline, Kathleen L Egan, Kathryn E Lancaster, William C Miller, Lance A Waller, Staci A Hepler

{"title":"Understanding the opioid syndemic in North Carolina: A novel approach to modeling and identifying factors.","authors":"Eva Murphy, David Kline, Kathleen L Egan, Kathryn E Lancaster, William C Miller, Lance A Waller, Staci A Hepler","doi":"10.1093/biostatistics/kxae052","DOIUrl":"10.1093/biostatistics/kxae052","url":null,"abstract":"The opioid epidemic is a significant public health challenge in North Carolina, but limited data restrict our understanding of its complexity. Examining trends and relationships among different outcomes believed to reflect opioid misuse provides an alternative perspective to understand the opioid epidemic. We use a Bayesian dynamic spatial factor model to capture the interrelated dynamics within six different county-level outcomes, such as illicit opioid overdose deaths, emergency department visits related to drug overdose, treatment counts for opioid use disorder, patients receiving prescriptions for buprenorphine, and newly diagnosed cases of acute and chronic hepatitis C virus and human immunodeficiency virus. We design the factor model to yield meaningful interactions among predefined subsets of these outcomes, causing a departure from the conventional lower triangular structure in the loadings matrix and leading to familiar identifiability issues. To address this challenge, we propose a novel approach that involves decomposing the loadings matrix within a Markov chain Monte Carlo algorithm, allowing us to estimate the loadings and factors uniquely. As a result, we gain a better understanding of the spatio-temporal dynamics of the opioid epidemic in North Carolina.","PeriodicalId":55357,"journal":{"name":"Biostatistics","volume":"26 1","pages":""},"PeriodicalIF":1.8,"publicationDate":"2024-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11823283/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143048855","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0