Hisashi Noma, Shonosuke Sugasawa, Toshi A Furukawa
{"title":"Robust inference methods for meta-analysis involving influential outlying studies.","authors":"Hisashi Noma, Shonosuke Sugasawa, Toshi A Furukawa","doi":"10.1002/sim.10157","DOIUrl":"10.1002/sim.10157","url":null,"abstract":"<p><p>Meta-analysis is an essential tool to comprehensively synthesize and quantitatively evaluate results of multiple clinical studies in evidence-based medicine. In many meta-analyses, the characteristics of some studies might markedly differ from those of the others, and these outlying studies can generate biases and potentially yield misleading results. In this article, we provide effective robust statistical inference methods using generalized likelihoods based on the density power divergence. The robust inference methods are designed to adjust the influences of outliers through the use of modified estimating equations based on a robust criterion, even when multiple and serious influential outliers are present. We provide the robust estimators, statistical tests, and confidence intervals via the generalized likelihoods for the fixed-effect and random-effects models of meta-analysis. We also assess the contribution rates of individual studies to the robust overall estimators that indicate how the influences of outlying studies are adjusted. Through simulations and applications to two recently published systematic reviews, we demonstrate that the overall conclusions and interpretations of meta-analyses can be markedly changed if the robust inference methods are applied and that relying only on the conventional inference methods might produce misleading evidence. We recommend using these methods at least as a sensitivity analysis in the practice of meta-analysis. 
We have also developed an R package, robustmeta, that implements the robust inference methods.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141427665","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Qiong Wu, Yuan Zhang, Xiaoqi Huang, Tianzhou Ma, L Elliot Hong, Peter Kochunov, Shuo Chen
{"title":"A multivariate to multivariate approach for voxel-wise genome-wide association analysis.","authors":"Qiong Wu, Yuan Zhang, Xiaoqi Huang, Tianzhou Ma, L Elliot Hong, Peter Kochunov, Shuo Chen","doi":"10.1002/sim.10101","DOIUrl":"10.1002/sim.10101","url":null,"abstract":"<p><p>The joint analysis of imaging-genetics data facilitates the systematic investigation of genetic effects on brain structures and functions with spatial specificity. We focus on voxel-wise genome-wide association analysis, which may involve trillions of single nucleotide polymorphism (SNP)-voxel pairs. We attempt to identify underlying organized association patterns of SNP-voxel pairs and understand the polygenic and pleiotropic networks on brain imaging traits. We propose a bi-clique graph structure (ie, a set of SNPs highly correlated with a cluster of voxels) for the systematic association pattern. Next, we develop computational strategies to detect latent SNP-voxel bi-cliques and an inference model for statistical testing. We further provide theoretical results to guarantee the accuracy of our computational algorithms and statistical inference. We validate our method by extensive simulation studies, and then apply it to the whole genome genetic and voxel-level white matter integrity data collected from 1052 participants of the human connectome project. 
The results demonstrate multiple genetic loci influencing white matter integrity measures on splenium and genu of the corpus callosum.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141459447","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jacky C Kuo, Wenyaw Chan, Luis Leon-Novelo, David R Lairson, Armand Brown, Kayo Fujimoto
{"title":"Latent classification model for censored longitudinal binary outcome.","authors":"Jacky C Kuo, Wenyaw Chan, Luis Leon-Novelo, David R Lairson, Armand Brown, Kayo Fujimoto","doi":"10.1002/sim.10156","DOIUrl":"10.1002/sim.10156","url":null,"abstract":"<p><p>Latent classification models are a class of statistical methods for identifying unobserved class membership among the study samples using some observed data. In this study, we proposed a latent classification model that takes a censored longitudinal binary outcome variable and uses its changing pattern over time to predict individuals' latent class membership. Assuming the time-dependent outcome variables follow a continuous-time Markov chain, the proposed method has two primary goals: (1) estimate the distribution of the latent classes and predict individuals' class membership, and (2) estimate the class-specific transition rates and rate ratios. To assess the model's performance, we conducted a simulation study and verified that our algorithm produces accurate model estimates (ie, small bias) with reasonable confidence intervals (ie, achieving approximately 95% coverage probability). Furthermore, we compared our model to four other existing latent class models and demonstrated that our approach yields higher prediction accuracies for latent classes. We applied our proposed method to analyze the COVID-19 data in Houston, Texas, US collected between January 1, 2021 and December 31, 2021. Early reports on the COVID-19 pandemic showed that the severity of a SARS-CoV-2 infection tends to vary greatly across cases. 
We found that while demographic characteristics explain some of the differences in individuals' experience with COVID-19, some unaccounted-for latent variables were associated with the disease.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141477461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Review of weighted exponential random graph models frameworks applied to neuroimaging.","authors":"Yefeng Fan, Simon R White","doi":"10.1002/sim.10162","DOIUrl":"10.1002/sim.10162","url":null,"abstract":"<p><p>Neuro-imaging data can often be represented as statistical networks, especially for functional magnetic resonance imaging (fMRI) data, where brain regions are defined as nodes and the functional interactions between those regions are taken as edges. Such networks are commonly divided into classes depending on the type of edges, namely binary or weighted. In a binary network, edges are either present or absent, whereas the edges of a weighted network carry weight values; fMRI networks are weighted networks. Statistical methods are often adopted to analyse such networks, among which the exponential random graph model (ERGM) is an important network analysis approach. Typically ERGMs are applied to binary networks, and weighted networks often need to be binarised by arbitrarily selecting a threshold value to define the presence of the edges, which can lead to non-robustness and loss of valuable edge weight information representing the strength of fMRI interaction in fMRI networks. While it is therefore important to gain deeper insight into adopting ERGM on weighted networks, there exist only a few ERGM frameworks for weighted networks; some of these are not directly implementable on fMRI networks based on their original proposal. We systematically review, implement, analyse and compare five such frameworks via a simulation study, provide guidelines on each modelling framework, and assess their suitability for fMRI networks based on a range of criteria. 
We concluded that Multi-Layered ERGM is currently the most suitable framework.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141459453","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Propensity score weighted multi-source exchangeability models for incorporating external control data in randomized clinical trials.","authors":"Wei Wei, Yunxuan Zhang, Satrajit Roychoudhury","doi":"10.1002/sim.10158","DOIUrl":"10.1002/sim.10158","url":null,"abstract":"<p><p>Among clinical trialists, there has been a growing interest in using external data to improve decision-making and accelerate drug development in randomized clinical trials (RCTs). Here we propose a novel approach that combines the propensity score weighting (PW) and the multi-source exchangeability modelling (MEM) approaches to augment the control arm of an RCT in the rare disease setting. First, propensity score weighting is used to construct weighted external controls whose observed pre-treatment characteristics are similar to those of the current trial population. Next, the MEM approach evaluates the similarity in outcome distributions between the weighted external controls and the concurrent control arm. The amount of external data we borrow is determined by the similarities in pre-treatment characteristics and outcome distributions. The proposed approach can be applied to binary, continuous and count data. We evaluate the performance of the proposed PW-MEM method and several competing approaches based on simulation and re-sampling studies. 
Our results show that the PW-MEM approach improves the precision of treatment effect estimates while reducing the biases associated with borrowing data from external sources.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141459452","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Kaiqiong Zhao, Karim Oualkacha, Yixiao Zeng, Cathy Shen, Kathleen Klein, Lajmi Lakhal-Chaieb, Aurélie Labbe, Tomi Pastinen, Marie Hudson, Inés Colmegna, Sasha Bernatsky, Celia M T Greenwood
{"title":"Addressing dispersion in mis-measured multivariate binomial outcomes: A novel statistical approach for detecting differentially methylated regions in bisulfite sequencing data.","authors":"Kaiqiong Zhao, Karim Oualkacha, Yixiao Zeng, Cathy Shen, Kathleen Klein, Lajmi Lakhal-Chaieb, Aurélie Labbe, Tomi Pastinen, Marie Hudson, Inés Colmegna, Sasha Bernatsky, Celia M T Greenwood","doi":"10.1002/sim.10149","DOIUrl":"10.1002/sim.10149","url":null,"abstract":"<p><p>Motivated by a DNA methylation application, this article addresses the problem of fitting and inferring a multivariate binomial regression model for outcomes that are contaminated by errors and exhibit extra-parametric variations, also known as dispersion. While dispersion in univariate binomial regression has been extensively studied, addressing dispersion in the context of multivariate outcomes remains a complex and relatively unexplored task. The complexity arises from a noteworthy data characteristic observed in our motivating dataset: non-constant yet correlated dispersion across outcomes. To address this challenge and account for possible measurement error, we propose a novel hierarchical quasi-binomial varying coefficient mixed model, which enables flexible dispersion patterns through a combination of additive and multiplicative dispersion components. To maximize the Laplace-approximated quasi-likelihood of our model, we further develop a specialized two-stage expectation-maximization (EM) algorithm, where a plug-in estimate for the multiplicative scale parameter enhances the speed and stability of the EM iterations. Simulations demonstrated that our approach yields accurate inference for smooth covariate effects and exhibits excellent power in detecting non-zero effects. 
Additionally, we applied our proposed method to investigate the association between DNA methylation, measured across the genome through targeted custom capture sequencing of whole blood, and levels of anti-citrullinated protein antibodies (ACPA), a preclinical marker for rheumatoid arthritis (RA) risk. Our analysis revealed 23 significant genes that potentially contribute to ACPA-related differential methylation, highlighting the relevance of cell signaling and collagen metabolism in RA. We implemented our method in the R Bioconductor package called \"SOMNiBUS.\"</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141459448","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jiadong Ji, Zhendong Hou, Yong He, Lei Liu, Fuzhong Xue, Hao Chen, Zhongshang Yuan
{"title":"Differential network knockoff filter with application to brain connectivity analysis.","authors":"Jiadong Ji, Zhendong Hou, Yong He, Lei Liu, Fuzhong Xue, Hao Chen, Zhongshang Yuan","doi":"10.1002/sim.10155","DOIUrl":"10.1002/sim.10155","url":null,"abstract":"<p><p>The brain functional connectivity can typically be represented as a brain functional network, where nodes represent regions of interest (ROIs) and edges symbolize their connections. Studying group differences in brain functional connectivity can help identify brain regions and recover the brain functional network linked to neurodegenerative diseases. This process, known as differential network analysis, focuses on the differences between estimated precision matrices for two groups. Current methods struggle with individual heterogeneity in measuring the brain connectivity, false discovery rate (FDR) control, and accounting for confounding factors, resulting in biased estimates and diminished power. To address these issues, we present a two-stage FDR-controlled feature selection method for differential network analysis using functional magnetic resonance imaging (fMRI) data. First, we create individual brain connectivity measures using a high-dimensional precision matrix estimation technique. Next, we devise a penalized logistic regression model that employs individual brain connectivity data and integrates a new knockoff filter for FDR control when detecting significant differential edges. Through extensive simulations, we showcase the superiority of our approach compared to other methods. Additionally, we apply our technique to fMRI data to identify differential edges between Alzheimer's disease and control groups. 
Our results are consistent with prior experimental studies, emphasizing the practical applicability of our method.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141459450","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zayd Omar, David A Stephens, Alexandra M Schmidt, David L Buckeridge
{"title":"A Bayesian non-stationary heteroskedastic time series model for multivariate critical care data.","authors":"Zayd Omar, David A Stephens, Alexandra M Schmidt, David L Buckeridge","doi":"10.1002/sim.10154","DOIUrl":"10.1002/sim.10154","url":null,"abstract":"<p><p>We propose a multivariate GARCH model for non-stationary health time series by modifying the observation-level variance of the standard state space model. The proposed model provides an intuitive and novel way of dealing with heteroskedastic data using the conditional nature of state-space models. We follow the Bayesian paradigm to perform the inference procedure. In particular, we use Markov chain Monte Carlo methods to obtain samples from the resultant posterior distribution. We use the forward filtering backward sampling algorithm to efficiently obtain samples from the posterior distribution of the latent state. The proposed model also handles missing data in a fully Bayesian fashion. We validate our model on synthetic data and analyze a data set obtained from an intensive care unit in a Montreal hospital and the MIMIC dataset. We further show that our proposed models offer better performance, in terms of WAIC, than standard state space models. The proposed model provides a new way to model multivariate heteroskedastic non-stationary time series data. Model comparison can then be easily performed using the WAIC.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141493446","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mediation Analysis with Multiple Exposures and Multiple Mediators","authors":"Yi Zhao","doi":"10.1002/sim.10215","DOIUrl":"https://doi.org/10.1002/sim.10215","url":null,"abstract":"A mediation analysis approach is proposed for multiple exposures, multiple mediators, and a continuous scalar outcome under the linear structural equation modeling framework. It assumes that there exist orthogonal components that demonstrate parallel mediation mechanisms on the outcome, and thus is named principal component mediation analysis (PCMA). Likelihood‐based estimators are introduced for simultaneous estimation of the component projections and effect parameters. The asymptotic distribution of the estimators is derived for low‐dimensional data. A bootstrap procedure is introduced for inference. Simulation studies illustrate the superior performance of the proposed approach. Applied to a proteomics‐imaging dataset from the Alzheimer's disease neuroimaging initiative (ADNI), the proposed framework identifies protein deposition – brain atrophy – memory deficit mechanisms consistent with existing knowledge and suggests potential AD pathology by integrating data collected from different modalities.","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":null,"pages":null},"PeriodicalIF":2.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142225356","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Corentin Ségalas, Catherine Helmer, Robin Genuer, Cécile Proust-Lima
{"title":"Functional Principal Component Analysis as an Alternative to Mixed-Effect Models for Describing Sparse Repeated Measures in Presence of Missing Data.","authors":"Corentin Ségalas, Catherine Helmer, Robin Genuer, Cécile Proust-Lima","doi":"10.1002/sim.10214","DOIUrl":"https://doi.org/10.1002/sim.10214","url":null,"abstract":"<p><p>Analyzing longitudinal data in health studies is challenging due to sparse and error-prone measurements, strong within-individual correlation, missing data and various trajectory shapes. While mixed-effect models (MM) effectively address these challenges, they remain parametric models and may incur computational costs. In contrast, functional principal component analysis (FPCA) is a non-parametric approach developed for regular and dense functional data that flexibly describes temporal trajectories at a potentially lower computational cost. This article presents an empirical simulation study evaluating the behavior of FPCA with sparse and error-prone repeated measures and its robustness under different missing data schemes in comparison with MM. The results show that FPCA is well-suited in the presence of missing-at-random data caused by dropout, except in scenarios involving the most frequent and systematic dropout. Like MM, FPCA fails under a missing-not-at-random mechanism. The FPCA was applied to describe the trajectories of four cognitive functions before clinical dementia and contrast them with those of matched controls in a case-control study nested in a population-based aging cohort. 
The average cognitive declines of future dementia cases showed a sudden divergence from those of their matched controls with a sharp acceleration 5 to 2.5 years prior to diagnosis.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142154969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}