Biometrics, Pub Date: 2025-01-07, DOI: 10.1093/biomtc/ujaf009
Abdollah Jalilian, Francisco Cuevas-Pacheco, Ganggang Xu, Rasmus Waagepetersen
{"title":"Composite likelihood inference for space-time point processes.","authors":"Abdollah Jalilian, Francisco Cuevas-Pacheco, Ganggang Xu, Rasmus Waagepetersen","doi":"10.1093/biomtc/ujaf009","DOIUrl":"10.1093/biomtc/ujaf009","url":null,"abstract":"<p><p>The dynamics of a rain forest is extremely complex involving births, deaths, and growth of trees with complex interactions between trees, animals, climate, and environment. We consider the patterns of recruits (new trees) and dead trees between rain forest censuses. For a current census, we specify regression models for the conditional intensity of recruits and the conditional probabilities of death given the current trees and spatial covariates. We estimate regression parameters using conditional composite likelihood functions that only involve the conditional first order properties of the data. When constructing assumption lean estimators of covariance matrices of parameter estimates, we only need mild assumptions of decaying conditional correlations in space, while assumptions regarding correlations over time are avoided by exploiting conditional centering of composite likelihood score functions. Time series of point patterns from rain forest censuses are quite short, while each point pattern covers a fairly big spatial region. To obtain asymptotic results, we therefore use a central limit theorem for the fixed timespan-increasing spatial domain asymptotic setting. This also allows us to handle the challenge of using stochastic covariates constructed from past point patterns. Conveniently, it suffices to impose weak dependence assumptions on the innovations of the space-time process. 
We investigate the proposed methodology by simulation studies and an application to rain forest data.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143405365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics, Pub Date: 2025-01-07, DOI: 10.1093/biomtc/ujaf026
Florian Stijven, Trung Dung Tran, Ellen Driessen, Ariel Alonso Abad, Geert Molenberghs, Geert Verbeke, Iven Van Mechelen
{"title":"Optimal treatment regime estimation in practice: challenges and choices in a randomized clinical trial for depression.","authors":"Florian Stijven, Trung Dung Tran, Ellen Driessen, Ariel Alonso Abad, Geert Molenberghs, Geert Verbeke, Iven Van Mechelen","doi":"10.1093/biomtc/ujaf026","DOIUrl":"10.1093/biomtc/ujaf026","url":null,"abstract":"<p><p>An important aspect of precision medicine is the tailoring of treatments to specific patient types. Nowadays, various methods are available to estimate for this purpose so-called optimal treatment regimes, that is, decision rules for treatment assignment that map patterns of pretreatment characteristics to treatment alternatives and that maximize the expected patient benefit. However, the application of these methods to real-life data has been limited and comes with nonstandard statistical issues. In search of best practices, we reanalyzed data from a randomized clinical trial for the treatment of dysthymic disorder. While the original objective of this trial was to detect a marginally best treatment alternative, we wanted to estimate an optimal treatment regime using 2 prominent estimation methods: Q-learning and value search estimation. An important obstacle in the dataset under study was the occurrence of missing values. This was handled with multiple imputation, a thoughtful implementation of which, however, implied several challenges. Other challenges were implied by the concrete implementation of value search estimation. In this paper, all the choices we have made in the analysis to handle the aforementioned issues are detailed together with a motivation and a description of possible alternatives. 
Accordingly, this paper may serve as a guide to apply optimal treatment regime estimation in data-analytic practice.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143661937","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics, Pub Date: 2025-01-07, DOI: 10.1093/biomtc/ujaf014
Tianyu Zhan, Jian Kang
{"title":"A general, flexible, and harmonious framework to construct interpretable functions in regression analysis.","authors":"Tianyu Zhan, Jian Kang","doi":"10.1093/biomtc/ujaf014","DOIUrl":"10.1093/biomtc/ujaf014","url":null,"abstract":"<p><p>An interpretable model or method has several appealing features, such as reliability to adversarial examples, transparency of decision-making, and communication facilitator. However, interpretability is a subjective concept, and even its definition can be diverse. The same model may be deemed as interpretable by a study team, but regarded as a black-box algorithm by another squad. Simplicity, accuracy and generalizability are some additional important aspects of evaluating interpretability. In this work, we present a general, flexible and harmonious framework to construct interpretable functions in regression analysis with a focus on continuous outcomes. We formulate a functional skeleton in light of users' expectations of interpretability. A new measure based on Mallows's $C_p$-statistic is proposed for model selection to balance approximation, generalizability, and interpretability. We apply this approach to derive a sample size formula in adaptive clinical trial designs to demonstrate the general workflow, and to explain operating characteristics in a Bayesian Go/No-Go paradigm to show the potential advantages of using meaningful intermediate variables. Generalization to categorical outcomes is illustrated in an example of hypothesis testing based on Fisher's exact test. A real data analysis of NHANES (National Health and Nutrition Examination Survey) is conducted to investigate relationships between some important laboratory measurements. 
We also discuss some extensions of this method.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143555802","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics, Pub Date: 2025-01-07, DOI: 10.1093/biomtc/ujae156
Wancen Mu, Jiawen Chen, Eric S Davis, Kathleen Reed, Douglas Phanstiel, Michael I Love, Didong Li
{"title":"Gaussian processes for time series with lead-lag effects with applications to biology data.","authors":"Wancen Mu, Jiawen Chen, Eric S Davis, Kathleen Reed, Douglas Phanstiel, Michael I Love, Didong Li","doi":"10.1093/biomtc/ujae156","DOIUrl":"10.1093/biomtc/ujae156","url":null,"abstract":"<p><p>Investigating the relationship, particularly the lead-lag effect, between time series is a common question across various disciplines, especially when uncovering biological processes. However, analyzing time series presents several challenges. Firstly, due to technical reasons, the time points at which observations are made are not at uniform intervals. Secondly, some lead-lag effects are transient, necessitating time-lag estimation based on a limited number of time points. Thirdly, external factors also impact these time series, requiring a similarity metric to assess the lead-lag relationship. To counter these issues, we introduce a model grounded in the Gaussian process, affording the flexibility to estimate lead-lag effects for irregular time series. In addition, our method outputs dissimilarity scores, thereby broadening its applications to include tasks such as ranking or clustering multiple pairwise time series when considering their strength of lead-lag effects with external factors. Crucially, we offer a series of theoretical proofs to substantiate the validity of our proposed kernels and the identifiability of kernel parameters. 
Our model demonstrates advances in various simulations and real-world applications, particularly in the study of dynamic chromatin interactions, compared to other leading methods.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11704948/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142943771","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluating the effects of high-throughput structural neuroimaging predictors on whole-brain functional connectome outcomes via network-based matrix-on-vector regression.","authors":"Tong Lu, Yuan Zhang, Vince Lyzinski, Chuan Bi, Peter Kochunov, Elliot Hong, Shuo Chen","doi":"10.1093/biomtc/ujaf027","DOIUrl":"10.1093/biomtc/ujaf027","url":null,"abstract":"<p><p>The joint analysis of multimodal neuroimaging data is vital in brain research, revealing complex interactions between brain structures and functions. Our study is motivated by the analysis of a vast dataset of brain functional connectivity (FC) and multimodal structural imaging (SI) features from the UK Biobank. Specifically, we aim to investigate the effects of SI features, such as white matter microstructure integrity (WMMI) and cortical thickness, on the whole-brain functional connectome network. This analysis is inherently challenging due to the extensive structural-functional associations and the intricate network patterns present in multimodal high-dimensional neuroimaging data. To bridge methodological gaps, we developed a novel multi-level sub-graph extraction method (dense bipartite with nested unipartite graph) within a matrix(network)-on-vector regression model. This method identifies subsets of spatially specific SI features that intensely and systematically influence FC sub-networks, while effectively suppressing false positives in large-scale datasets. Applying our method to a multimodal neuroimaging dataset of 4242 participants from the UK Biobank, we evaluated the effects of whole-brain WMMI and cortical thickness on resting-state FC. 
Our findings indicate that the WMMI in corticospinal tracts and inferior cerebellar peduncle significantly affect functional connections of sensorimotor, salience, and executive sub-networks, with an average correlation of 0.81 ($p < 0.001$).</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11926586/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143673278","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics, Pub Date: 2025-01-07, DOI: 10.1093/biomtc/ujaf007
Bing Tian, Jian Kang, Wei Zhong
{"title":"Feature screening for metric space-valued responses based on Fréchet regression with its applications.","authors":"Bing Tian, Jian Kang, Wei Zhong","doi":"10.1093/biomtc/ujaf007","DOIUrl":"10.1093/biomtc/ujaf007","url":null,"abstract":"<p><p>In various applications, we need to handle more general types of responses, such as distributional data and matrix-valued data, rather than a scalar variable. When the dimension of predictors is ultrahigh, it is necessarily important to identify the relevant predictors for such complex types of responses. For example, in our Alzheimer's disease neuroimaging study, we need to select the relevant single nucleotide polymorphisms out of 582 591 candidates for the distribution of voxel-level intensities in each of 42 brain regions. To this end, we propose a new sure independence screening (SIS) procedure for general metric space-valued responses based on global Fréchet regression, termed as Fréchet-SIS. The marginal general residual sum of squares is utilized to serve as a marginal utility for evaluating the importance of predictors, where only a distance between data objects is needed. We theoretically show that the proposed Fréchet-SIS procedure enjoys the sure screening property under mild regularity conditions. Monte Carlo simulations are conducted to demonstrate its excellent finite-sample performance. In Alzheimer's disease neuroimaging study, we identify important genes that correlate with brain activity across different stages of the disease and brain regions. 
In addition, we also include an economic case study to illustrate our proposal.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143397821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics, Pub Date: 2025-01-07, DOI: 10.1093/biomtc/ujaf010
Yichen Lou, Yuqing Ma, Jianguo Sun, Peijie Wang, Zhisheng Ye
{"title":"Instrumental variable estimation of complier causal treatment effects with interval-censored competing risks data.","authors":"Yichen Lou, Yuqing Ma, Jianguo Sun, Peijie Wang, Zhisheng Ye","doi":"10.1093/biomtc/ujaf010","DOIUrl":"10.1093/biomtc/ujaf010","url":null,"abstract":"<p><p>This paper discusses the assessment of causal treatment effects on a time-to-event outcome, a crucial part of many scientific investigations. Although some methods have been developed for the problem, they are not applicable to situations where there exist both interval censoring and competing risks. We fill in this critical gap under a class of transformation models for cumulative incidence functions by developing an instrumental variable (IV) estimation approach. The IV is a valuable tool commonly used to mitigate the impact of endogenous treatment selection and to determine causal treatment effects in an unbiased manner. The proposed method is flexible as the model includes many commonly used models such as the sub-distributional proportional odds and hazards models (ie, the Fine-Gray model) as special cases. The resulting estimator for the regression parameter is shown to be consistent and asymptotically normal. A simulation study is conducted to evaluate finite sample performance of the proposed approach and suggests that it works well in practice. It is applied to a breast cancer screening study.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143432303","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics, Pub Date: 2025-01-07, DOI: 10.1093/biomtc/ujaf028
Christopher S McMahan, Chase N Joyner, Joshua M Tebbs, Christopher R Bilder
{"title":"A mixed-effects Bayesian regression model for multivariate group testing data.","authors":"Christopher S McMahan, Chase N Joyner, Joshua M Tebbs, Christopher R Bilder","doi":"10.1093/biomtc/ujaf028","DOIUrl":"10.1093/biomtc/ujaf028","url":null,"abstract":"<p><p>Laboratories use group (pooled) testing with multiplex assays to reduce the time and cost associated with screening large populations for infectious diseases. Multiplex assays test for multiple diseases simultaneously, and combining their use with group testing can lead to highly efficient screening protocols. However, these benefits come at the expense of a more complex data structure which can hinder surveillance efforts. To overcome this challenge, we develop a general Bayesian framework to estimate a mixed multivariate probit model with data arising from any group testing protocol that uses multiplex assays. In the formulation of this model, we account for the correlation between true disease statuses and heterogeneity across population subgroups, and we provide for automated variable selection through the adoption of spike and slab priors. To perform model fitting, we develop an attractive posterior sampling algorithm which is straightforward to implement. 
We illustrate our methodology through numerical studies and analyze chlamydia and gonorrhea group testing data collected by the State Hygienic Laboratory at the University of Iowa.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11926587/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143673245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics, Pub Date: 2025-01-07, DOI: 10.1093/biomtc/ujae168
Daxuan Deng, Lijun Zhang, Hao Feng, Vernon M Chinchilli, Chixiang Chen, Ming Wang
{"title":"Improving estimation efficiency for survival data analysis by integrating a coarsened time-to-event outcome from an external study.","authors":"Daxuan Deng, Lijun Zhang, Hao Feng, Vernon M Chinchilli, Chixiang Chen, Ming Wang","doi":"10.1093/biomtc/ujae168","DOIUrl":"10.1093/biomtc/ujae168","url":null,"abstract":"<p><p>In the era of big data, increasing availability of data makes combining different data sources to obtain more accurate estimations a popular topic. However, the development of data integration is often hindered by the heterogeneity in data forms across studies. In this paper, we focus on a case in survival analysis where we have primary study data with a continuous time-to-event outcome and complete covariate measurements, while the data from an external study contain an outcome observed at regular intervals, and only a subset of covariates is measured. To incorporate external information while accounting for the different data forms, we posit working models and obtain informative weights by empirical likelihood, which will be used to construct a weighted estimator in the main analysis. We have established the theory demonstrating that the new estimator has higher estimation efficiency compared to the conventional ones, and this advantage is robust to working model misspecification, as confirmed in our simulation studies. 
To assess its utility, we apply our method to accommodate data from the National Alzheimer's Coordinating Center to improve the analysis of the Alzheimer's Disease Neuroimaging Initiative Phase 1 study.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11747882/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142999230","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics, Pub Date: 2025-01-07, DOI: 10.1093/biomtc/ujaf006
Yael Travis-Lumer, Micha Mandel, Rebecca A Betensky
{"title":"Pseudo-observations for bivariate survival data.","authors":"Yael Travis-Lumer, Micha Mandel, Rebecca A Betensky","doi":"10.1093/biomtc/ujaf006","DOIUrl":"10.1093/biomtc/ujaf006","url":null,"abstract":"<p><p>The pseudo-observations approach has been gaining popularity as a method to estimate covariate effects on censored survival data. It is used regularly to estimate covariate effects on quantities such as survival probabilities, restricted mean life, cumulative incidence, and others. In this work, we propose to generalize the pseudo-observations approach to situations where a bivariate failure-time variable is observed, subject to right censoring. The idea is to first estimate the joint survival function of both failure times and then use it to define the relevant pseudo-observations. Once the pseudo-observations are calculated, they are used as the response in a generalized linear model. We consider 2 common nonparametric estimators of the joint survival function: the estimator of Lin and Ying (1993) and the Dabrowska estimator (Dabrowska, 1988). For both estimators, we show that our bivariate pseudo-observations approach produces regression estimates that are consistent and asymptotically normal. Our proposed method enables estimation of covariate effects on quantities such as the joint survival probability at a fixed bivariate time point or simultaneously at several time points and, consequentially, can estimate covariate-adjusted conditional survival probabilities. 
We demonstrate the method using simulations and an analysis of 2 real-world datasets.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143188046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}