Biometrics Pub Date: 2025-10-08 DOI: 10.1093/biomtc/ujaf162
Zijun Gao, Trevor Hastie
{"title":"Estimating heterogeneous treatment effects for general responses.","authors":"Zijun Gao, Trevor Hastie","doi":"10.1093/biomtc/ujaf162","DOIUrl":"10.1093/biomtc/ujaf162","url":null,"abstract":"<p><p>Heterogeneous treatment effect models allow us to compare treatments at subgroup levels and are becoming increasingly popular in applications such as personalized medicine, advertising, and education. Regardless of the type of responses (continuous, binary, count, survival), most causal estimands focus on the differences between the treatment and control conditional means. In this paper, we propose an alternative estimand, DINA-the DIfference in NAtural parameters-to quantify heterogeneous treatment effects motivated by exponential families and the Cox model. Despite the type of responses, DINA is both convenient and more practical for modeling the influence of covariates on the treatment effect. Additionally, we introduce a meta-algorithm for DINA, enabling practitioners to utilize powerful off-the-shelf machine learning tools for the estimation of nuisance functions. This meta-algorithm is also statistically robust to errors in the nuisance function estimation. We demonstrate the efficacy of our method in combination with various machine learning base-learners on both simulated and real datasets.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 4","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12728347/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145817568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics Pub Date: 2025-10-08 DOI: 10.1093/biomtc/ujaf155
Haolin Li, Haibo Zhou, David Couper, Jianwen Cai
{"title":"Super learner for survival prediction in case-cohort and generalized case-cohort studies.","authors":"Haolin Li, Haibo Zhou, David Couper, Jianwen Cai","doi":"10.1093/biomtc/ujaf155","DOIUrl":"10.1093/biomtc/ujaf155","url":null,"abstract":"<p><p>The case-cohort study design is often used in modern epidemiological studies of rare diseases, as it can achieve similar efficiency as a much larger cohort study with a fraction of the cost. Previous work focused on parameter estimation for case-cohort studies based on a particular statistical model, but few discussed the survival prediction problem under such type of design. In this article, we propose a super learner algorithm for survival prediction in case-cohort studies. We further extend our proposed algorithm to generalized case-cohort studies. The proposed super learner algorithm is shown to have asymptotic model selection consistency as well as uniform consistency. We also demonstrate our algorithm has satisfactory finite sample performances. Simulation studies suggest that the proposed super learners trained by data from case-cohort and generalized case-cohort studies have better prediction accuracy than the ones trained by data from the simple random sampling design with the same sample sizes. 
Finally, we apply the proposed method to analyze a generalized case-cohort study conducted as part of the Atherosclerosis Risk in Communities Study.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 4","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12665972/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145647246","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics Pub Date: 2025-10-08 DOI: 10.1093/biomtc/ujaf157
Weidong Ma, Jordana B Cohen, Jinbo Chen
{"title":"A semiparametric method for addressing underdiagnosis using electronic health record data.","authors":"Weidong Ma, Jordana B Cohen, Jinbo Chen","doi":"10.1093/biomtc/ujaf157","DOIUrl":"10.1093/biomtc/ujaf157","url":null,"abstract":"<p><p>Effective treatment of medical conditions begins with an accurate diagnosis. However, many conditions are often underdiagnosed, either being overlooked or diagnosed after significant delays. Electronic health records (EHRs) contain extensive patient health information, offering an opportunity to probabilistically identify underdiagnosed individuals. The rationale is that both diagnosed and underdiagnosed patients may display similar health profiles in EHR data, distinguishing them from condition-free patients. Thus, EHR data can be leveraged to develop models that assess an individual's risk of having a condition. To date, this opportunity has largely remained unexploited, partly due to the lack of suitable statistical methods. The key challenge is the positive-unlabeled EHR data structure, which consists of data for diagnosed (\"positive\") patients and the remaining (\"unlabeled\") that include underdiagnosed patients and many condition-free patients. Therefore, data for patients who are unambiguously condition-free, essential for developing risk assessment models, are unavailable. To overcome this challenge, we propose ascertaining condition statuses for a small subset of unlabeled patients. We develop a novel statistical method for building accurate models using this supplemented EHR data to estimate the probability that a patient has the condition of interest. We study the asymptotic properties of our method and assess its finite-sample performance through simulation studies. 
Finally, we apply our method to develop a preliminary model for identifying potentially underdiagnosed non-alcoholic steatohepatitis patients using data from Penn Medicine EHRs.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 4","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12665971/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145647261","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics Pub Date: 2025-10-08 DOI: 10.1093/biomtc/ujaf163
Tiphaine Saulnier, Wassilios G Meissner, Margherita Fabbri, Alexandra Foubert-Samier, Cécile Proust-Lima
{"title":"Structuring, sequencing, staging, selecting: the 4S method for the longitudinal analysis of multidimensional questionnaires in chronic diseases.","authors":"Tiphaine Saulnier, Wassilios G Meissner, Margherita Fabbri, Alexandra Foubert-Samier, Cécile Proust-Lima","doi":"10.1093/biomtc/ujaf163","DOIUrl":"https://doi.org/10.1093/biomtc/ujaf163","url":null,"abstract":"<p><p>In clinical studies, questionnaires are often used to report disease-related manifestations from clinician and/or patient perspectives. Their analysis can help identify relevant manifestations throughout the disease course, enhancing knowledge of disease progression and guiding clinicians in appropriate care provision. However, the analysis of questionnaires in health studies is not straightforward as made of repeated, ordinal, and potentially multidimensional item data. Sum-score summaries may considerably reduce information and hamper interpretation; items' changes over time occur along clinical progression; and as many other longitudinal processes, observations may be truncated by events. This work establishes a comprehensive strategy in four consecutive steps to leverage repeated ordinal data from multidimensional questionnaires. The 4S method successively (1) identifies the questionnaire structure into dimensions satisfying three calibration assumptions (unidimensionality, conditional independence, increasing monotonicity), (2) describes each dimension progression using a joint latent process model which includes a continuous-time item response theory model for the longitudinal subpart, (3) aligns each dimension progression with disease stages through a projection approach, and (4) identifies the most informative items across disease stages using the Fisher information. The method is applied to multiple system atrophy (MSA), a rare neurodegenerative disease, with the analysis of daily activity and motor impairments over disease progression. 
The 4S method provides an effective and complete analytical strategy for questionnaires repeatedly collected in health studies.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 4","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145832991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics Pub Date: 2025-10-08 DOI: 10.1093/biomtc/ujaf139
{"title":"Correction to: Covariate-Adjusted Response-Adaptive Randomization for Multi-Arm Clinical Trials Using a Modified Forward Looking Gittins Index Rule.","authors":"","doi":"10.1093/biomtc/ujaf139","DOIUrl":"https://doi.org/10.1093/biomtc/ujaf139","url":null,"abstract":"","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 4","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145372147","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics Pub Date: 2025-10-08 DOI: 10.1093/biomtc/ujaf160
Byran J Smucker, Stephen E Wright, Isaac Williams, Richard C Page, Andor J Kiss, Surendra Bikram Silwal, Maria Weese, David J Edwards
{"title":"Large row-constrained supersaturated designs for high-throughput screening.","authors":"Byran J Smucker, Stephen E Wright, Isaac Williams, Richard C Page, Andor J Kiss, Surendra Bikram Silwal, Maria Weese, David J Edwards","doi":"10.1093/biomtc/ujaf160","DOIUrl":"10.1093/biomtc/ujaf160","url":null,"abstract":"<p><p>High-throughput screening, in which large numbers of compounds are traditionally studied one-at-a-time in multiwell plates against specific targets, is widely used across many areas of the biological sciences, including drug discovery. To improve the effectiveness of these screens, we propose a new class of supersaturated designs that guide the construction of pools of compounds in each well. Because the size of the pools is typically limited by the particular application, the new designs accommodate this constraint and are part of a larger procedure that we call Constrained Row Screening or CRowS. We develop an efficient computational procedure to construct the CRowS designs, provide some initial lower bounds on the average squared off-diagonal values of their main-effects information matrix, and study the impact of the constraint on design quality. 
We also show via simulation that CRowS is statistically superior to the traditional one-compound-one-well approach as well as an existing pooling method, and demonstrate the use of the new methodology on a Verona Integron-encoded Metallo-β-lactamase-2 assay.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 4","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12696866/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145720530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics Pub Date: 2025-10-08 DOI: 10.1093/biomtc/ujaf165
Xuran Meng, Jingfei Zhang, Yi Li
{"title":"Statistical inference on high-dimensional covariate-dependent Gaussian graphical regressions.","authors":"Xuran Meng, Jingfei Zhang, Yi Li","doi":"10.1093/biomtc/ujaf165","DOIUrl":"10.1093/biomtc/ujaf165","url":null,"abstract":"<p><p>In many genomic studies, gene co-expression graphs are influenced by subject-level covariates like single nucleotide polymorphisms. Traditional Gaussian graphical models ignore these covariates and estimate only population-level networks, potentially masking important heterogeneity. Covariate-dependent Gaussian graphical regressions address this limitation by regressing the precision matrix on covariates, thereby modeling how graph structures vary with high-dimensional subject-specific covariates. To fit the model, we adopt a multi-task learning approach that achieves lower error rates than node-wise regressions. Yet, the important problem of statistical inference in this setting remains largely unexplored. We propose a class of debiased estimators based on multi-task learners, which can be computed quickly and separately. In a key step, we introduce a novel projection technique for estimating the inverse covariance matrix, reducing optimization costs to scale with the sample size n. Our debiased estimators achieve fast convergence and asymptotic normality, enabling valid inference. 
Simulations demonstrate the utility of the method, and an application to a brain cancer gene-expression dataset reveals meaningful biological relationships.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 4","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12720500/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145802935","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics Pub Date: 2025-10-08 DOI: 10.1093/biomtc/ujaf148
Guangyu Yang, Min Zhang
{"title":"Rejoinder to Letter to the Editors \"Comments on 'Statistical inference on change points in generalized semiparametric segmented models' by Yang et al. (2025)\" by Vito M.R. Muggeo.","authors":"Guangyu Yang, Min Zhang","doi":"10.1093/biomtc/ujaf148","DOIUrl":"10.1093/biomtc/ujaf148","url":null,"abstract":"","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":" ","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145602083","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics Pub Date: 2025-10-08 DOI: 10.1093/biomtc/ujaf129
Jieru Shi, Walter Dempsey
{"title":"A meta-learning method for estimation of causal excursion effects to assess time-varying moderation.","authors":"Jieru Shi, Walter Dempsey","doi":"10.1093/biomtc/ujaf129","DOIUrl":"https://doi.org/10.1093/biomtc/ujaf129","url":null,"abstract":"<p><p>Advances in wearable technologies and health interventions delivered by smartphones have greatly increased the accessibility of mobile health (mHealth) interventions. Micro-randomized trials (MRTs) are designed to assess the effectiveness of the mHealth intervention and introduce a novel class of causal estimands called \"causal excursion effects.\" These estimands enable the evaluation of how intervention effects change over time and are influenced by individual characteristics or context. Existing methods for analyzing causal excursion effects assume known randomization probabilities, complete observations, and a linear nuisance function with prespecified features of the high-dimensional observed history. However, in complex mobile systems, these assumptions often fall short: randomization probabilities can be uncertain, observations may be incomplete, and the granularity of mHealth data makes linear modeling difficult. To address this issue, we propose a flexible and doubly robust inferential procedure, called \"DR-WCLS,\" for estimating causal excursion effects from a meta-learner perspective. We present the bidirectional asymptotic properties of the proposed estimators and compare them with existing methods both theoretically and through extensive simulations. The results show a consistent and more efficient estimate, even with missing observations or uncertain treatment randomization probabilities. 
Finally, the practical utility of the proposed methods is demonstrated by analyzing data from a multi-institution cohort of first-year medical residents in the United States.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 4","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145249539","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrics Pub Date: 2025-10-08 DOI: 10.1093/biomtc/ujaf164
Yen Chang, Anastasia Ivanova, Demetrius Albanes, Jason P Fine, Yei Eun Shin
{"title":"Prediction of transition probabilities in multi-state models with nested case-control data.","authors":"Yen Chang, Anastasia Ivanova, Demetrius Albanes, Jason P Fine, Yei Eun Shin","doi":"10.1093/biomtc/ujaf164","DOIUrl":"https://doi.org/10.1093/biomtc/ujaf164","url":null,"abstract":"<p><p>Multi-state models are widely used to study complex interrelated life events. In resource-limited settings, nested case-control (NCC) sampling may be employed to extract subsamples from a cohort for an event of interest, followed by a conditional likelihood analysis. However, conditioning restricts the reuse of NCC data for studying additional events. An alternative approach constructs pseudolikelihoods using inverse probability weighting (IPW) for inference with NCC data. Existing IPW-based pseudolikelihood methods focus primarily on estimating relative risks for multiple outcomes or secondary endpoints. In this work, we extend these methods to predict transition probabilities under general multi-state models and evaluate their efficiency. As the standard IPW methods for the prediction of transition probabilities may suffer from inefficiency, we propose two novel approaches for more efficient prediction and derive explicit variance estimates for these methods. The first approach calibrates the design weights using cohort-level information, while the second jointly models transitions originating from the same state. A simulation study demonstrates that either approach substantially improves efficiency and that their combined application yields further gains. 
We illustrate these methods with real data from the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 4","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145802938","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}