Title: unmconf: an R package for Bayesian regression with unmeasured confounders
Authors: Ryan Hebdon, James Stamey, David Kahle, Xiang Zhang
DOI: 10.1186/s12874-024-02322-2 | BMC Medical Research Methodology 24(1): 195 | Published 2024-09-07 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11380322/pdf/

Abstract: The inability to correctly account for unmeasured confounding can lead to bias in parameter estimates, invalid uncertainty assessments, and erroneous conclusions. Sensitivity analysis is an approach to investigate the impact of unmeasured confounding in observational studies, but its adoption has been slow given the lack of accessible software. An extensive review of available R packages that account for unmeasured confounding lists deterministic sensitivity analysis methods, but no packages for probabilistic sensitivity analysis. The R package unmconf is the first available package for probabilistic sensitivity analysis via a Bayesian unmeasured confounding model. The package allows for normal, binary, Poisson, or gamma responses, accounting for one or two unmeasured confounders from the normal or binomial distribution. The goal of unmconf is a user-friendly package that performs Bayesian modeling in the presence of unmeasured confounders, with simple commands on the front end and more intensive computation on the back end. We investigate the applicability of this package through novel simulation studies. The results indicate that credible intervals have near-nominal coverage probability and smaller bias when modeling the unmeasured confounder(s), for varying levels of internal/external validation data and across various combinations of response-unmeasured confounder distributional families.

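The bias mechanism the package targets is easy to reproduce. Below is a minimal Python sketch (the package itself is written in R, so this is not its interface): a hypothetical binary confounder u drives both the exposure x and a normal response y, the naive regression that omits u overstates the exposure effect, and the regression that includes u, which is what validation data on u or the Bayesian unmeasured confounding model effectively supplies, recovers the true value of 1.0. All effect sizes and variable names are illustrative assumptions.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 5_000

# Hypothetical data-generating process: a binary unmeasured confounder u
# raises both the probability of exposure x and a normal response y.
u = rng.binomial(1, 0.4, n)
x = rng.binomial(1, 1 / (1 + np.exp(-(-0.5 + 1.2 * u))))
y = 1.0 * x + 1.5 * u + rng.normal(0, 1, n)        # true exposure effect = 1.0

# Naive analysis that ignores u overstates the exposure effect ...
naive = sm.OLS(y, sm.add_constant(x)).fit()

# ... while the model that sees u recovers it; internal/external validation
# data on u (or a Bayesian unmeasured confounding model) closes this gap.
full = sm.OLS(y, sm.add_constant(np.column_stack([x, u]))).fit()

print("naive:", round(naive.params[1], 2), " adjusted:", round(full.params[1], 2))
```
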
Title: Handling missing data and measurement error for early-onset myopia risk prediction models
Authors: Hongyu Lai, Kaiye Gao, Meiyan Li, Tao Li, Xiaodong Zhou, Xingtao Zhou, Hui Guo, Bo Fu
DOI: 10.1186/s12874-024-02319-x | BMC Medical Research Methodology 24(1): 194 | Published 2024-09-06 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11378546/pdf/

Background: Early identification of children at high risk of developing myopia is essential to prevent myopia progression by introducing timely interventions. However, missing data and measurement error (ME) are common challenges in risk prediction modelling that can introduce bias in myopia prediction.
Methods: We explore four imputation methods to address missing data and ME: single imputation (SI), multiple imputation under missing at random (MI-MAR), multiple imputation with a calibration procedure (MI-ME), and multiple imputation under missing not at random (MI-MNAR). We compare four machine-learning models (decision tree, naive Bayes, random forest, and XGBoost) and three statistical models (logistic regression, stepwise logistic regression, and least absolute shrinkage and selection operator logistic regression) in myopia risk prediction. We apply these models to the Shanghai Jinshan Myopia Cohort Study and also conduct a simulation study to investigate the impact of missing-data mechanisms, the degree of ME, and the importance of predictors on model performance. Model performance is evaluated using the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC).
Results: Our findings indicate that in scenarios with missing data and ME, using MI-ME in combination with logistic regression yields the best prediction results. In scenarios without ME, employing MI-MAR to handle missing data outperforms SI regardless of the missing-data mechanism. When ME has a greater impact on prediction than missing data, the relative advantage of MI-MAR diminishes and MI-ME becomes superior. Furthermore, our results demonstrate that statistical models exhibit better prediction performance than machine-learning models.
Conclusion: MI-ME emerges as a reliable method for handling missing data and ME in important predictors for early-onset myopia risk prediction.

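As a rough illustration of the MI-MAR plus logistic regression pipeline evaluated above, here is a hedged scikit-learn sketch with simulated predictors and roughly 20% of values missing at random. Pooling predicted probabilities across imputed copies and omitting the MI-ME calibration step are simplifications for brevity, not the authors' implementation.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, average_precision_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2_000
X = rng.normal(size=(n, 4))                       # hypothetical predictors
p = 1 / (1 + np.exp(-(X[:, 0] + 0.5 * X[:, 1])))
y = rng.binomial(1, p)
X[rng.random((n, 4)) < 0.2] = np.nan              # ~20% values missing at random

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# One imputed copy per draw; averaging predictions across m copies mimics MI.
m = 5
probs = np.zeros(len(y_te))
for k in range(m):
    imp = IterativeImputer(sample_posterior=True, random_state=k)
    clf = LogisticRegression(max_iter=1000).fit(imp.fit_transform(X_tr), y_tr)
    probs += clf.predict_proba(imp.transform(X_te))[:, 1] / m

print("AUROC:", roc_auc_score(y_te, probs))
print("AUPRC:", average_precision_score(y_te, probs))
```
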
Title: Gaps in the usage and reporting of multiple imputation for incomplete data: findings from a scoping review of observational studies addressing causal questions
Authors: Rheanna M Mainzer, Margarita Moreno-Betancur, Cattram D Nguyen, Julie A Simpson, John B Carlin, Katherine J Lee
DOI: 10.1186/s12874-024-02302-6 | BMC Medical Research Methodology 24(1): 193 | Published 2024-09-04 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11373423/pdf/

Background: Missing data are common in observational studies and often occur in several of the variables required when estimating a causal effect, i.e. the exposure, outcome and/or variables used to control for confounding. Analyses involving multiple incomplete variables are not as straightforward as analyses with a single incomplete variable. For example, in the context of multivariable missingness, the standard missing data assumptions ("missing completely at random", "missing at random" [MAR], "missing not at random") are difficult to interpret and assess. It is not clear how the complexities that arise due to multivariable missingness are being addressed in practice. The aim of this study was to review how missing data are managed and reported in observational studies that use multiple imputation (MI) for causal effect estimation, with a particular focus on missing data summaries, missing data assumptions, primary and sensitivity analyses, and MI implementation.
Methods: We searched five top general epidemiology journals for observational studies that aimed to answer a causal research question and used MI, published between January 2019 and December 2021. Article screening and data extraction were performed systematically.
Results: Of the 130 studies included in this review, 108 (83%) derived an analysis sample by excluding individuals with missing data in specific variables (e.g., the outcome) and 114 (88%) had multivariable missingness within the analysis sample. Forty-four (34%) studies provided a statement about missing data assumptions, 35 of which stated the MAR assumption, but only 11/44 (25%) studies provided a justification for these assumptions. The number of imputations, MI method and MI software were generally well reported (71%, 75% and 88% of studies, respectively), while aspects of the imputation model specification were not clear for more than half of the studies. A secondary analysis that used a different approach to handle the missing data was conducted in 69/130 (53%) studies. Of these 69 studies, 68 (99%) lacked a clear justification for the secondary analysis.
Conclusion: Effort is needed to clarify the rationale for and improve the reporting of MI for estimation of causal effects from observational data. We encourage greater transparency in making and reporting analytical decisions related to missing data.

Title: Patient regional index: a new way to rank clinical specialties based on outpatient clinics big data
Authors: Xiaoling Peng, Moyuan Huang, Xinyang Li, Tianyi Zhou, Guiping Lin, Xiaoguang Wang
DOI: 10.1186/s12874-024-02309-z | BMC Medical Research Methodology 24(1): 192 | Published 2024-08-31 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11365139/pdf/

Background: Many existing healthcare ranking systems are notably intricate. The standards for peer review and evaluation often differ across specialties, leading to contradictory results among various ranking systems. There is a significant need for a comprehensible and consistent mode of specialty assessment.
Methods: This quantitative study aimed to assess the influence of clinical specialties on the regional distribution of patient origins, based on 10,097,795 outpatient records of a large comprehensive hospital in South China. We proposed the patient regional index (PRI), a novel metric to quantify the regional influence of hospital specialties, using the principle of representative points of a statistical distribution. Additionally, a two-dimensional measure was constructed to gauge the significance of hospital specialties by integrating the PRI and outpatient volume.
Results: We calculated the PRI for each of the 16 specialties of interest over eight consecutive years. The longitudinal changes in the PRI accurately captured the impact of the 2017 Chinese healthcare reforms and the 2020 COVID-19 pandemic on hospital specialties. Finally, the two-dimensional assessment model we devised effectively illustrates the distinct characteristics across hospital specialties.
Conclusion: We propose a novel, straightforward, and interpretable index for quantifying the influence of hospital specialties. This index, built on outpatient data, requires only the patients' origin, thereby facilitating its widespread adoption and comparison across specialties of varying backgrounds. This data-driven method offers a patient-centric view of specialty influence, diverging from the traditional reliance on expert opinions. As such, it serves as a valuable augmentation to existing ranking systems.

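The exact PRI construction via representative points is defined in the paper and is not reproduced here. The sketch below only illustrates the kind of input the index requires (one region label per outpatient visit) using a generic concentration summary, the effective number of source regions, as a hypothetical stand-in; the specialty and region labels are made up.

```python
import numpy as np
import pandas as pd

def regional_spread(origins: pd.Series) -> float:
    """Effective number of regions a specialty draws patients from
    (exponential of the Shannon entropy of the origin distribution)."""
    p = origins.value_counts(normalize=True).to_numpy()
    return float(np.exp(-(p * np.log(p)).sum()))

# Hypothetical outpatient records: one row per visit with the patient's region.
visits = pd.DataFrame({
    "specialty": ["Cardiology"] * 6 + ["Dermatology"] * 6,
    "region": ["A", "A", "B", "C", "D", "E", "A", "A", "A", "A", "A", "B"],
})
print(visits.groupby("specialty")["region"].apply(regional_spread))
```

A specialty drawing patients from many regions scores higher than one serving mostly its local area, which is the intuition behind ranking by regional influence rather than by expert opinion.
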
Title: Multi-metric comparison of machine learning imputation methods with application to breast cancer survival
Authors: Imad El Badisy, Nathalie Graffeo, Mohamed Khalis, Roch Giorgi
DOI: 10.1186/s12874-024-02305-3 | BMC Medical Research Methodology 24(1): 191 | Published 2024-08-30 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11363416/pdf/

Abstract: Handling missing data in clinical prognostic studies is an essential yet challenging task. This study aimed to provide a comprehensive assessment of the effectiveness and reliability of different machine learning (ML) imputation methods across various analytical perspectives. Specifically, it focused on three distinct classes of performance metrics used to evaluate ML imputation methods: post-imputation bias of regression estimates, post-imputation predictive accuracy, and substantive model-free metrics. As an illustration, we applied data from a real-world breast cancer survival study. A simulated dataset with 30% missing-at-random (MAR) values was used. A number of single imputation (SI) methods (KNN, missMDA, CART, missForest, missRanger, missCforest) and multiple imputation (MI) methods (miceCART and miceRF) were evaluated. The performance metrics used were Gower's distance, estimation bias, empirical standard error, coverage rate, length of confidence interval, predictive accuracy, proportion of falsely classified (PFC), normalized root mean squared error (NRMSE), AUC, and C-index scores. The analysis revealed that in terms of Gower's distance, CART and missForest were the most accurate overall, missMDA and CART excelled for binary covariates, and missForest and miceCART were superior for continuous covariates. When assessing bias and accuracy in regression estimates, miceCART and miceRF exhibited the least bias. Overall, the various imputation methods demonstrated greater efficiency than complete-case analysis (CCA), with MICE methods providing optimal confidence interval coverage. In terms of predictive accuracy for Cox models, missMDA and missForest had superior AUC and C-index scores. Despite offering better predictive accuracy, SI methods introduced more bias into the regression coefficients than MI methods. This study underlines the importance of selecting appropriate imputation methods based on study goals and data types in time-to-event research. The varying effectiveness of methods across the different performance metrics studied highlights the value of using advanced machine learning algorithms within a multiple imputation framework to enhance research integrity and the robustness of findings.

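Two of the substantive-model-free metrics named above, NRMSE for continuous covariates and PFC for categorical ones, can be written down in a few lines. The sketch below assumes the true values of the masked entries are known, as in a simulation design, and uses trivial mean/constant imputers purely as placeholders for the ML imputers compared in the study.

```python
import numpy as np

def nrmse(true, imputed, mask):
    """Normalized RMSE over the originally missing continuous entries."""
    err = imputed[mask] - true[mask]
    return float(np.sqrt(np.mean(err ** 2)) / np.std(true[mask]))

def pfc(true, imputed, mask):
    """Proportion of falsely classified entries for a categorical variable."""
    return float(np.mean(imputed[mask] != true[mask]))

rng = np.random.default_rng(3)
mask = rng.random(500) < 0.3                  # 30% of entries masked as missing

x_true = rng.normal(size=500)
x_imputed = np.where(mask, x_true[~mask].mean(), x_true)   # placeholder mean imputer
print("NRMSE:", nrmse(x_true, x_imputed, mask))

g_true = rng.choice(["A", "B", "C"], 500)
g_imputed = np.where(mask, "A", g_true)       # placeholder constant imputer
print("PFC:", pfc(g_true, g_imputed, mask))
```
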
Title: Distributed non-disclosive validation of predictive models by a modified ROC-GLM
Authors: Daniel Schalk, Raphael Rehms, Verena S Hoffmann, Bernd Bischl, Ulrich Mansmann
DOI: 10.1186/s12874-024-02312-4 | BMC Medical Research Methodology 24(1): 190 | Published 2024-08-29 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11363434/pdf/

Background: Distributed statistical analyses provide a promising approach for privacy protection when analyzing data distributed over several databases. Instead of directly operating on the data, the analyst receives anonymous summary statistics, which are combined into an aggregated result. Further, in the development of discrimination models (prognosis, diagnosis, etc.), it is key to evaluate a trained model with respect to its prognostic or predictive performance on new independent data. For binary classification, discrimination is quantified using the receiver operating characteristic (ROC) curve and the area under it (AUC) as an aggregate measure. We aim to calculate both, as well as basic indicators of calibration-in-the-large, for a binary classification task using a distributed and privacy-preserving approach.
Methods: We employ DataSHIELD as the technology to carry out distributed analyses, and we use a newly developed algorithm to validate the prediction score by conducting distributed and privacy-preserving ROC analysis. Calibration curves are constructed from mean values over sites. The determination of the ROC curve and its AUC is based on a generalized linear model (GLM) approximation of the true ROC curve, the ROC-GLM, as well as on ideas from differential privacy (DP). DP adds noise, quantified by the ℓ2 sensitivity Δ2(f̂), to the data and enables a global handling of placement numbers. The impact of the DP parameters was studied by simulations.
Results: In our simulation scenario, the true and distributed AUC measures differ by ΔAUC < 0.01, with the difference depending heavily on the choice of the differential privacy parameters. It is recommended to check the accuracy of the distributed AUC estimator in specific simulation scenarios along with a reasonable choice of DP parameters. Here, the accuracy of the distributed AUC estimator may be impaired by too much artificial noise added through DP.
Conclusions: The applicability of our algorithms depends on the ℓ2 sensitivity Δ2(f̂) of the underlying statistical/predictive model. The simulations carried out have shown that the approximation error is acceptable for the majority of simulated cases. For models with high Δ2(f̂), the privacy parameters must be set accordingly higher to ensure sufficient privacy protection, which affects the approximation error. This work shows that complex measures, such as the AUC, …

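A compressed sketch of the two ingredients, assuming a binormal ROC(t) = Φ(a + b·Φ⁻¹(t)) whose parameters are estimated from per-group score summaries, and a Gaussian mechanism calibrated to an assumed ℓ2 sensitivity. The actual ROC-GLM estimation from placement values inside DataSHIELD is more involved than this; the score distributions, sensitivity, and privacy budget below are made-up values.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(7)

# Hypothetical prediction scores at one site: controls ~ N(0,1), cases ~ N(1,1).
controls = rng.normal(0.0, 1.0, 400)
cases = rng.normal(1.0, 1.0, 200)

# Binormal ROC(t) = Phi(a + b * Phi^{-1}(t)); closed-form AUC = Phi(a / sqrt(1 + b^2)).
a = (cases.mean() - controls.mean()) / cases.std(ddof=1)
b = controls.std(ddof=1) / cases.std(ddof=1)
auc = norm.cdf(a / np.sqrt(1 + b ** 2))

# Toy differential-privacy step: perturb the shared summaries with Gaussian
# noise scaled by an assumed l2 sensitivity before aggregation across sites.
sensitivity, epsilon, delta = 0.05, 1.0, 1e-5
sigma = sensitivity * np.sqrt(2 * np.log(1.25 / delta)) / epsilon
a_dp, b_dp = a + rng.normal(0, sigma), b + rng.normal(0, sigma)
auc_dp = norm.cdf(a_dp / np.sqrt(1 + b_dp ** 2))

print("AUC:", round(auc, 3), " AUC under DP noise:", round(auc_dp, 3))
```

The gap between the two printed values illustrates the trade-off discussed in the results: larger sensitivity or a tighter privacy budget inflates sigma and degrades the distributed AUC estimate.
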
Title: A non-parametric approach to predict the recruitment for randomized clinical trials: an example in elderly inpatient settings
Authors: Alejandro Villasante-Tezanos, Yong-Fang Kuo, Christopher Kurinec, Yisheng Li, Xiaoying Yu
DOI: 10.1186/s12874-024-02314-2 | BMC Medical Research Methodology 24(1): 189 | Published 2024-08-29 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11363376/pdf/

Background: Accurate prediction of subject recruitment, which is critical to the success of a study, remains an ongoing challenge. Previous prediction models often rely on parametric assumptions that are not always met or may be difficult to implement. We aim to develop a novel method that is less sensitive to model assumptions and relatively easy to implement.
Methods: We create a weighted resampling-based approach to predict enrollment in year two based on recruitment data from year one of the completed GRIPS and PACE clinical trials. Different weight functions accounted for a range of potential enrollment trajectory patterns. Prediction accuracy was measured by the Euclidean distance of the predicted year-two enrollment sequence, total enrollment over time, and total weeks to enroll a fixed number of subjects, against the actual year-two enrollment data. We compare the performance of the proposed method with an existing Bayesian method.
Results: Weighted resampling using GRIPS data resulted in closer prediction, evidenced by better coverage of observed enrollment by the prediction intervals and smaller Euclidean distance from actual enrollment in year two, especially when enrollment gaps were filled prior to the weighted resampling. These scenarios also produced more accurate predictions for total enrollment and the number of weeks to enroll 50 participants, and outperformed an existing Bayesian method on all three accuracy measures. In the PACE data, using a reduced year-one enrollment resulted in closer prediction, evidenced by better coverage of observed enrollment by the prediction intervals and smaller Euclidean distance from actual enrollment in year two, with the weighted resampling scenarios better reflecting the seasonal variation seen in year one. The reduced enrollment scenarios resulted in closer prediction for total enrollment over 6 and 12 months into year two and also outperformed an existing Bayesian method on the relevant accuracy measures.
Conclusion: The results demonstrate the feasibility and flexibility of a resampling-based, non-parametric approach for predicting clinical trial recruitment with limited early enrollment data. Application to a wider setting and long-term prediction accuracy require further investigation.

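The resampling idea can be sketched as follows, with a hypothetical year-one weekly enrollment series and an illustrative linearly increasing weight function (the paper evaluates several weight functions, not this particular one). Prediction intervals come from the percentiles of the bootstrapped cumulative enrollment paths.

```python
import numpy as np

rng = np.random.default_rng(11)
weekly_y1 = rng.poisson(3, 52)            # hypothetical year-one weekly enrollment

# Illustrative weight function: later weeks count more, approximating an
# accelerating enrollment trajectory; the authors' weight functions differ.
weights = np.arange(1, 53, dtype=float)
weights /= weights.sum()

B, horizon = 2_000, 52
sims = np.array([
    rng.choice(weekly_y1, size=horizon, replace=True, p=weights).cumsum()
    for _ in range(B)
])

# Pointwise 95% prediction band for cumulative year-two enrollment.
lower, median, upper = np.percentile(sims, [2.5, 50, 97.5], axis=0)
print("year-two total:", int(median[-1]), "PI:", (int(lower[-1]), int(upper[-1])))
```
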
Title: Identify the most appropriate imputation method for handling missing values in clinical structured datasets: a systematic review
Authors: Marziyeh Afkanpour, Elham Hosseinzadeh, Hamed Tabesh
DOI: 10.1186/s12874-024-02310-6 | BMC Medical Research Methodology 24(1): 188 | Published 2024-08-28 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11351057/pdf/

Background and objectives: Comprehending the research dataset is crucial for obtaining reliable and valid outcomes. Health analysts must have a deep comprehension of the data being analyzed; this comprehension allows them to suggest practical solutions for handling missing data in a clinical data source. Accurate handling of missing values is critical for producing precise estimates and making informed decisions, especially in crucial areas like clinical research. With data's increasing diversity and complexity, numerous scholars have developed a range of imputation techniques. To address this, we conducted a systematic review to introduce various imputation techniques based on tabular dataset characteristics (the mechanism, pattern, and ratio of missingness) and to identify the most appropriate imputation methods in the healthcare field.
Materials and methods: We searched four information databases, namely PubMed, Web of Science, Scopus, and IEEE Xplore, for articles published up to September 20, 2023, that discussed imputation methods for addressing missing values in clinically structured datasets. Our investigation of the selected articles focused on four key aspects: the mechanism, pattern, and ratio of missingness, and the imputation strategies used. By synthesizing insights from these perspectives, we constructed an evidence map to recommend suitable imputation methods for handling missing values in a tabular dataset.
Results: Out of 2955 articles, 58 were included in the analysis. The evidence map, based on the structure of the missing values and the types of imputation methods used in the included studies, revealed that 45% of the studies employed conventional statistical methods, 31% utilized machine learning and deep learning methods, and 24% applied hybrid imputation techniques for handling missing values.
Conclusion: Considering the structure and characteristics of missing values in a clinical dataset is essential for choosing the most appropriate data imputation technique, especially within conventional statistical methods. Accurately estimating missing values to reflect reality enhances the likelihood of obtaining high-quality and reusable data, contributing significantly to precise medical decision-making processes. This review provides a guideline for choosing the most appropriate imputation methods in the data preprocessing stage, before analytical processing of structured clinical datasets.

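Before choosing among the methods mapped in this review, the mechanism, pattern, and ratio of missingness in the dataset need to be inspected. A small pandas/statsmodels sketch on a hypothetical clinical table: the per-column missingness ratio, plus an informal logistic-regression probe of whether an observed covariate predicts missingness, which would argue against an MCAR assumption. Variable names and the missingness mechanism are invented for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(5)
df = pd.DataFrame({"age": rng.normal(60, 10, 300),
                   "lab": rng.normal(5, 1, 300)})

# Hypothetical MAR mechanism: the lab value is more often missing in older patients.
miss = (df["age"] > 65) & (rng.random(len(df)) < 0.5)
df.loc[miss, "lab"] = np.nan

print(df.isna().mean())                       # ratio of missingness per column

# Informal probe of the mechanism: does observed age predict missingness in lab?
r = df["lab"].isna().astype(int)
fit = sm.Logit(r, sm.add_constant(df["age"])).fit(disp=0)
print(fit.params)
```
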
Title: Feasibility of using real-world data to emulate substance use disorder clinical trials: a cross-sectional study
Authors: Guneet S Janda, Molly Moore Jeffery, Reshma Ramachandran, Joseph S Ross, Joshua D Wallach
DOI: 10.1186/s12874-024-02307-1 | BMC Medical Research Methodology 24(1): 187 | Published 2024-08-28 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11351457/pdf/

Introduction: Real-world evidence is receiving considerable attention as a way to evaluate the efficacy and safety of medical products for substance use disorders (SUDs). However, the feasibility of using real-world data (RWD) to emulate clinical trials evaluating treatments for SUDs is uncertain. The aim of this study is to identify the number of clinical trials evaluating treatments for SUDs, with reported results, that could be feasibly emulated using observational data from contemporary insurance claims and/or electronic health record (EHR) data.
Methods: In this cross-sectional study, all phase 2-4 trials evaluating treatments for SUDs registered on ClinicalTrials.gov with reported results were identified. Each trial was evaluated to determine whether the indications, interventions, at least 80% of eligibility criteria, comparators, and primary end points could be ascertained using contemporarily available administrative claims and/or structured EHR data.
Results: There were 272 SUD trials on ClinicalTrials.gov with reported results. Of these, when examining feasibility using contemporarily available administrative claims and/or structured EHR data, 262 (96.3%) had ascertainable indications; 194 (71.3%) had ascertainable interventions; 21 (7.7%) had at least 80% of eligibility criteria ascertainable; 17 (6.3%) had ascertainable active comparators; and 61 (22.4%) had ascertainable primary end points. In total, there were no trials for which all five characteristics were ascertainable. When placebo comparators were considered ascertainable, 6 (2.2%) trials had all five key characteristics ascertainable.
Conclusions: No trials evaluating treatments for SUDs could be feasibly emulated using contemporarily available RWD, demonstrating a need for an increase in the resolution of data capture within the public health system to facilitate trial emulation.

Title: Assessing treatment effects with adjusted restricted mean time lost in observational competing risks data
Authors: Haoning Shen, Chengfeng Zhang, Yu Song, Zhiheng Huang, Yanjie Wang, Yawen Hou, Zheng Chen
DOI: 10.1186/s12874-024-02303-5 | BMC Medical Research Methodology 24(1): 186 | Published 2024-08-26 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11346024/pdf/

Background: In long-term follow-up data on malignant tumor patients, assessing treatment effects requires careful consideration of competing risks. The commonly used cause-specific hazard ratio (CHR) and sub-distribution hazard ratio (SHR) are relative measures and may present challenges in terms of the proportional hazards assumption and clinical interpretation. Recently, the restricted mean time lost (RMTL) has been recommended as a supplementary measure with a more direct clinical interpretation. Moreover, for observational data in epidemiological and clinical settings, covariate adjustment is crucial for determining the causal effect of treatment because of confounding factors.
Methods: We construct an RMTL estimator adjusted for covariates via the inverse probability weighting method and derive its variance to construct interval estimates based on large-sample properties. We use simulation studies to evaluate the statistical performance of this estimator in various scenarios. In addition, to capture changes in treatment effects over time, we construct a dynamic RMTL difference curve with corresponding confidence bands.
Results: The simulation results demonstrate that the adjusted RMTL estimator exhibits smaller bias than the unadjusted RMTL estimator and provides robust interval estimates in all scenarios. The method was applied to real-world data on cervical cancer patients, revealing improvements in the prognosis of patients with small cell carcinoma of the cervix. The protective effect of surgery was significant only in the first 20 months, with no obvious long-term effect. Radiotherapy significantly improved patient outcomes during the follow-up period from 17 to 57 months, while radiotherapy combined with chemotherapy significantly improved outcomes throughout the entire period.
Conclusions: We propose an approach that is easy to interpret and implement for assessing treatment effects in observational competing risks data.

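For intuition, the RMTL for cause k up to a horizon tau is the area under the cause-k cumulative incidence function on [0, tau], and covariate adjustment enters through subject-level weights. The sketch below is a bare-bones weighted Aalen-Johansen-type estimate run with equal weights on toy data; the authors' IPW weights from a fitted propensity model, the variance derivation, and the dynamic confidence bands are not reproduced.

```python
import numpy as np

def rmtl(time, event, tau, w=None):
    """Weighted RMTL for cause 1 up to tau via an Aalen-Johansen-type CIF.
    event: 0 = censored, 1 = event of interest, 2 = competing event."""
    w = np.ones_like(time, dtype=float) if w is None else w
    ts = np.unique(time[event > 0])
    ts = ts[ts <= tau]
    surv, cif, last_t, area = 1.0, 0.0, 0.0, 0.0
    for t in ts:
        at_risk = w[time >= t].sum()
        d_all = w[(time == t) & (event > 0)].sum()
        d1 = w[(time == t) & (event == 1)].sum()
        area += cif * (t - last_t)            # integrate the CIF step function
        cif += surv * d1 / at_risk            # cause-1 increment uses S(t-)
        surv *= 1.0 - d_all / at_risk         # all-cause survival update
        last_t = t
    return area + cif * (tau - last_t)

# Toy two-arm comparison; IPW weights from a propensity model would replace w.
rng = np.random.default_rng(2)
t = rng.exponential(10, 300)
ev = rng.choice([0, 1, 2], 300, p=[0.2, 0.5, 0.3])
arm = rng.binomial(1, 0.5, 300)
diff = rmtl(t[arm == 1], ev[arm == 1], tau=12) - rmtl(t[arm == 0], ev[arm == 0], tau=12)
print("RMTL difference at tau = 12:", round(diff, 3))
```
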