Kenta Takatsu, Alexander W Levis, Edward Kennedy, Rachel Kelz, Luke Keele
{"title":"Doubly robust machine learning-based estimation methods for instrumental variables with an application to surgical care for cholecystitis.","authors":"Kenta Takatsu, Alexander W Levis, Edward Kennedy, Rachel Kelz, Luke Keele","doi":"10.1093/jrsssa/qnae089","DOIUrl":"10.1093/jrsssa/qnae089","url":null,"abstract":"<p><p>Comparative effectiveness research frequently employs the instrumental variable design since randomized trials can be infeasible for many reasons. In this study, we investigate treatments for emergency <i>cholecystitis</i>, an inflammation of the gallbladder. A standard treatment for cholecystitis is surgical removal of the gallbladder, while alternative non-surgical treatments include managed care and pharmaceutical options. As randomized trials are judged to violate the principle of equipoise, we consider an instrument for operative care: the surgeon's tendency to operate. Standard instrumental variable estimation methods, however, often rely on parametric models that are prone to bias from model misspecification. Thus, we outline instrumental variable methods based on the doubly robust machine learning framework. These methods enable us to employ various machine learning techniques, delivering consistent estimates, and permitting valid inference on various estimands. We use these methods to estimate the primary target estimand in an instrumental variable design. Additionally, we expand these methods to develop new estimators for heterogeneous causal effects, profiling principal strata, and sensitivity analyses for a key instrumental variable assumption. We conduct a simulation study to demonstrate scenarios where more flexible estimation methods outperform standard methods. 
Our findings indicate that operative care is generally more effective for cholecystitis patients, although the benefits of surgery can be less pronounced for key patient subgroups.</p>","PeriodicalId":49983,"journal":{"name":"Journal of the Royal Statistical Society Series A-Statistics in Society","volume":" ","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12223449/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144692227","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Francesca Gasperoni, Christopher H Jackson, Angela M Wood, Michael J Sweeting, Paul J Newcombe, David Stevens, Jessica K Barrett
{"title":"Optimal risk-assessment scheduling for primary prevention of cardiovascular disease.","authors":"Francesca Gasperoni, Christopher H Jackson, Angela M Wood, Michael J Sweeting, Paul J Newcombe, David Stevens, Jessica K Barrett","doi":"10.1093/jrsssa/qnae086","DOIUrl":"10.1093/jrsssa/qnae086","url":null,"abstract":"<p><p>In this work, we introduce a personalized and age-specific net benefit function, composed of benefits and costs, to recommend optimal timing of risk assessments for cardiovascular disease (CVD) prevention. We extend the 2-stage landmarking model to estimate patient-specific CVD risk profiles, adjusting for time-varying covariates. We apply our model to data from the Clinical Practice Research Datalink, comprising primary care electronic health records from the UK. We find that people at lower risk could be recommended an optimal risk-assessment interval of 5 years or more. Time-varying risk factors are required to discriminate between more frequent schedules for high-risk people.</p>","PeriodicalId":49983,"journal":{"name":"Journal of the Royal Statistical Society Series A-Statistics in Society","volume":"188 3","pages":"920-934"},"PeriodicalIF":1.5,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12256122/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144638527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Joshua L Warren, Ottavia Prunas, A David Paltiel, Thomas Thornhill, Gregg S Gonsalves
{"title":"Integrating testing volume into bandit algorithms for infectious disease surveillance.","authors":"Joshua L Warren, Ottavia Prunas, A David Paltiel, Thomas Thornhill, Gregg S Gonsalves","doi":"10.1093/jrsssa/qnae090","DOIUrl":"10.1093/jrsssa/qnae090","url":null,"abstract":"<p><p>Mobile testing services provide opportunities for active surveillance of infectious diseases for hard-to-reach and/or high-risk individuals who do not know their disease status. Identifying as many infected individuals as possible is important for mitigating disease transmission. Recently, multi-armed bandit sampling approaches have been adapted and applied in this setting to maximize the cumulative number of positive tests collected over time. However, these algorithms have not considered the possibility of variability in the number of tests administered across testing sites. What impact this variability has on the ability of these approaches to maximize yield is currently unknown. Therefore, we investigate this question by extending existing sampling frameworks to directly account for variability in testing volume while also maintaining the computational tractability of the previous methods. Through a simulation study based on human immunodeficiency virus infection characteristics in the Republic of the Congo (Congo-Brazzaville) as well as an application to COVID-19 testing data in Connecticut, we find improved long- and short-term performances of the new methods compared to several existing approaches. 
Based on these findings and the ease of computation, we recommend use of the newly developed methods for active surveillance of infectious diseases when variability in testing volume may be present.</p>","PeriodicalId":49983,"journal":{"name":"Journal of the Royal Statistical Society Series A-Statistics in Society","volume":"188 4","pages":"1029-1043"},"PeriodicalIF":1.6,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12503114/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145253487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Paul N Zivich, Jessie K Edwards, Bonnie E Shook-Sa, Eric T Lofgren, Justin Lessler, Stephen R Cole
{"title":"Synthesis estimators for transportability with positivity violations by a continuous covariate.","authors":"Paul N Zivich, Jessie K Edwards, Bonnie E Shook-Sa, Eric T Lofgren, Justin Lessler, Stephen R Cole","doi":"10.1093/jrsssa/qnae084","DOIUrl":"10.1093/jrsssa/qnae084","url":null,"abstract":"<p><p>Studies intended to estimate the effect of a treatment, like randomized trials, may not be sampled from the desired target population. To correct for this discrepancy, estimates can be transported to the target population. Methods for transporting between populations are often premised on a positivity assumption, such that all relevant covariate patterns in one population are also present in the other. However, eligibility criteria, particularly in the case of trials, can result in violations of positivity when transporting to external populations. To address nonpositivity, a synthesis of statistical and mathematical models can be considered. This approach integrates multiple data sources (e.g. trials, observational, pharmacokinetic studies) to estimate treatment effects, leveraging mathematical models to handle positivity violations. This approach was previously demonstrated for positivity violations by a single binary covariate. Here, we extend the synthesis approach for positivity violations with a continuous covariate. For estimation, two novel augmented inverse probability weighting estimators are proposed. Both estimators are contrasted with other common approaches for addressing nonpositivity. Empirical performance is compared via Monte Carlo simulation. Finally, the competing approaches are illustrated with an example in the context of two-drug vs. 
one-drug antiretroviral therapy on CD4 T cell counts among women with HIV.</p>","PeriodicalId":49983,"journal":{"name":"Journal of the Royal Statistical Society Series A-Statistics in Society","volume":"188 1","pages":"158-180"},"PeriodicalIF":1.6,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11728055/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142985305","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yuzi Zhang, Howard H Chang, Angela D Iuliano, Carrie Reed
{"title":"A Bayesian spatial-temporal varying coefficients model for estimating excess deaths associated with respiratory infections.","authors":"Yuzi Zhang, Howard H Chang, Angela D Iuliano, Carrie Reed","doi":"10.1093/jrsssa/qnae079","DOIUrl":"10.1093/jrsssa/qnae079","url":null,"abstract":"<p><p>Disease surveillance data are used for monitoring and understanding disease burden, which provides valuable information in allocating health programme resources. Statistical methods play an important role in estimating disease burden since disease surveillance systems are prone to undercounting. This paper is motivated by the challenge of estimating mortality associated with respiratory infections (e.g. influenza and COVID-19) that are not ascertained from death certificates. We propose a Bayesian spatial-temporal model incorporating measures of infection activity to estimate excess deaths. Particularly, the inclusion of time-varying coefficients allows us to better characterize associations between infection activity and mortality counts time series. Software to implement this method is available in the R package NBRegAD. Applying our modelling framework to weekly state-wide COVID-19 data in the US from 8 March 2020 to 3 July 2022, we identified temporal and spatial differences in excess deaths between different age groups. We estimated the total number of COVID-19 deaths in the US to be 1,168,481 (95% CI: 1,148,953, 1,187,187) compared to the 1,022,147 from using only death certificate information. 
The analysis also suggests that the most severe undercounting was in the 18-49 years age group with an estimated underascertainment rate of 0.21 (95% CI: 0.16, 0.25).</p>","PeriodicalId":49983,"journal":{"name":"Journal of the Royal Statistical Society Series A-Statistics in Society","volume":"188 3","pages":"843-858"},"PeriodicalIF":1.5,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12256124/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144638526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lingxiao Wang, Yan Li, Barry I Graubard, Hormuzd A Katki
{"title":"Data-integration with pseudoweights and survey-calibration: application to developing US-representative lung cancer risk models for use in screening.","authors":"Lingxiao Wang, Yan Li, Barry I Graubard, Hormuzd A Katki","doi":"10.1093/jrsssa/qnae059","DOIUrl":"10.1093/jrsssa/qnae059","url":null,"abstract":"<p><p>Accurate cancer risk estimation is crucial to clinical decision-making, such as identifying high-risk people for screening. However, most existing cancer risk models incorporate data from epidemiologic studies, which usually cannot represent the target population. While population-based health surveys are ideal for making inference to the target population, they typically do not collect time-to-cancer incidence data. Instead, time-to-cancer specific mortality is often readily available on surveys via linkage to vital statistics. We develop calibrated pseudoweighting methods that integrate individual-level data from a cohort and a survey, and summary statistics of cancer incidence from national cancer registries. By leveraging individual-level cancer mortality data in the survey, the proposed methods impute time-to-cancer incidence for survey sample individuals and use survey calibration with auxiliary variables of influence functions generated from Cox regression to improve robustness and efficiency of the inverse-propensity pseudoweighting method in estimating pure risks. 
We develop a lung cancer incidence pure risk model from the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial using our proposed methods by integrating data from the National Health Interview Survey and cancer registries.</p>","PeriodicalId":49983,"journal":{"name":"Journal of the Royal Statistical Society Series A-Statistics in Society","volume":"188 1","pages":"119-139"},"PeriodicalIF":1.5,"publicationDate":"2024-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11728053/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142985289","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ritoban Kundu, Xu Shi, Jean Morrison, Jessica Barrett, Bhramar Mukherjee
{"title":"A framework for understanding selection bias in real-world healthcare data.","authors":"Ritoban Kundu, Xu Shi, Jean Morrison, Jessica Barrett, Bhramar Mukherjee","doi":"10.1093/jrsssa/qnae039","DOIUrl":"10.1093/jrsssa/qnae039","url":null,"abstract":"<p><p>Using administrative patient-care data such as Electronic Health Records (EHR) and medical/pharmaceutical claims for population-based scientific research has become increasingly common. With vast sample sizes leading to very small standard errors, researchers need to pay more attention to potential biases in the estimates of association parameters of interest, specifically to biases that do not diminish with increasing sample size. Of these multiple sources of biases, in this paper, we focus on understanding selection bias. We present an analytic framework using directed acyclic graphs for guiding applied researchers to dissect how different sources of selection bias may affect estimates of the association between a binary outcome and an exposure (continuous or categorical) of interest. We consider four easy-to-implement weighting approaches to reduce selection bias with accompanying variance formulae. We demonstrate through a simulation study when they can rescue us in practice with analysis of real-world data. We compare these methods using a data example where our goal is to estimate the well-known association of cancer and biological sex, using EHR from a longitudinal biorepository at the University of Michigan Healthcare system. 
We provide annotated R code to implement these weighted methods with associated inference.</p>","PeriodicalId":49983,"journal":{"name":"Journal of the Royal Statistical Society Series A-Statistics in Society","volume":"187 3","pages":"606-635"},"PeriodicalIF":1.5,"publicationDate":"2024-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11393555/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142299713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dr Arun Chind’s contribution to the Discussion of “A system of population estimates compiled from administrative data only” by Dunne and Zhang","authors":"A. Chind","doi":"10.1093/jrsssa/qnad119","DOIUrl":"https://doi.org/10.1093/jrsssa/qnad119","url":null,"abstract":"","PeriodicalId":49983,"journal":{"name":"Journal of the Royal Statistical Society Series A-Statistics in Society","volume":"9 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2023-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79677215","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Psychometrics of Standard Setting","authors":"Andrew Mcculloch","doi":"10.1093/jrsssa/qnad108","DOIUrl":"https://doi.org/10.1093/jrsssa/qnad108","url":null,"abstract":"","PeriodicalId":49983,"journal":{"name":"Journal of the Royal Statistical Society Series A-Statistics in Society","volume":"6 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2023-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78062263","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Measurement Models for Psychological Attributes","authors":"Andrew Mcculloch","doi":"10.1093/jrsssa/qnad107","DOIUrl":"https://doi.org/10.1093/jrsssa/qnad107","url":null,"abstract":"","PeriodicalId":49983,"journal":{"name":"Journal of the Royal Statistical Society Series A-Statistics in Society","volume":"32 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2023-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75777254","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}