A machine learning framework to adjust for learning effects in medical device safety evaluation.

Jejo D Koola, Karthik Ramesh, Jialin Mao, Minyoung Ahn, Sharon E Davis, Usha Govindarajulu, Amy M Perkins, Dax Westerman, Henry Ssemaganda, Theodore Speroff, Lucila Ohno-Machado, Craig R Ramsay, Art Sedrakyan, Frederic S Resnic, Michael E Matheny

Journal of the American Medical Informatics Association (published 2024-10-29). DOI: 10.1093/jamia/ocae273
Abstract
Objectives: Traditional methods for medical device post-market surveillance often fail to accurately account for operator learning effects, leading to biased assessments of device safety. These methods struggle with non-linearity, complex learning curves, and time-varying covariates, such as physician experience. To address these limitations, we sought to develop a machine learning (ML) framework to detect and adjust for operator learning effects.
Materials and methods: A gradient-boosted decision tree ML method was used to analyze synthetic datasets that replicate the complexity of clinical scenarios involving high-risk medical devices. We designed this process to detect learning effects using a risk-adjusted cumulative sum (CUSUM) method, quantify the excess adverse event rate attributable to operator inexperience, and adjust for both alongside patient factors when evaluating device safety signals. To maintain integrity, we employed blinding between the data generation and analysis teams. Synthetic data used underlying distributions and patient feature correlations based on clinical data from the Department of Veterans Affairs between 2005 and 2012. We generated 2494 synthetic datasets with widely varying characteristics, including the numbers of patient features, operators, and institutions, and the functional form of operator learning. Each dataset contained a hypothetical study device, Device B, and a reference device, Device A. We evaluated accuracy in identifying learning effects and in identifying and estimating the strength of the device safety signal. Our approach also evaluated different clinically relevant thresholds for safety signal detection.
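The abstract does not give implementation details for the risk-adjusted CUSUM, so the following is only a minimal illustrative sketch of the general technique (a Steiner-style risk-adjusted CUSUM over per-patient predicted risks); the function names, the odds-ratio alternative `R`, and the threshold `h` are assumptions, not the authors' actual code:

```python
import math

def cusum_weight(y, p, R=2.0):
    # Log-likelihood ratio weight for one case: tests an in-control
    # odds ratio of 1 against an out-of-control odds ratio R,
    # given the risk-adjusted predicted event probability p and
    # observed outcome y (1 = adverse event, 0 = no event).
    denom = 1.0 - p + R * p
    return math.log(R / denom) if y else math.log(1.0 / denom)

def run_cusum(outcomes, risks, R=2.0, h=2.0):
    # Accumulate weights case by case; the statistic resets at 0 and
    # signals the first time it exceeds the control limit h.
    s, path, signal_at = 0.0, [], None
    for t, (y, p) in enumerate(zip(outcomes, risks)):
        s = max(0.0, s + cusum_weight(y, p, R))
        path.append(s)
        if signal_at is None and s > h:
            signal_at = t
    return path, signal_at
```

In a learning-effect setting, an early run of excess events for an inexperienced operator pushes the statistic over `h`, flagging that operator's early cases for adjustment.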
Results: Our framework accurately identified the presence or absence of learning effects in 93.6% of datasets and correctly determined device safety signals in 93.4% of cases. The estimated device odds ratios' 95% confidence intervals covered the specified true odds ratios in 94.7% of datasets. In contrast, a comparative model that excluded operator learning effects significantly underperformed in detecting device signals and in estimation accuracy. Notably, our framework achieved 100% specificity at clinically relevant safety signal thresholds, although sensitivity varied with the threshold applied.
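The 94.7% figure is a confidence interval coverage rate: across simulated datasets, how often the estimated 95% CI contains the odds ratio specified in the data-generating process. As a hedged sketch of that check, the Wald-type interval below is computed from a simple 2×2 table; this is an assumption for illustration only, since the authors' odds ratios come from a gradient-boosted model with learning adjustment, not from a raw contingency table:

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.959964):
    # Wald 95% CI for the odds ratio from a 2x2 table:
    #   a = events on study Device B,     b = non-events on B
    #   c = events on reference Device A, d = non-events on A
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lo = math.exp(log_or - z * se)
    hi = math.exp(log_or + z * se)
    return math.exp(log_or), lo, hi

def covers(ci, true_or):
    # Coverage check used when scoring simulations: does the
    # estimated interval contain the specified true odds ratio?
    lo, hi = ci
    return lo <= true_or <= hi
```

Averaging `covers(...)` over all simulated datasets yields the empirical coverage rate; a well-calibrated 95% interval should land near 95%, as the framework's 94.7% does.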
Discussion: A machine learning framework, tailored for the complexities of post-market device evaluation, may provide superior performance compared to standard parametric techniques when operator learning is present.
Conclusion: Demonstrating the capacity of ML to overcome complex evaluative challenges, our framework addresses the limitations of traditional statistical methods in current post-market surveillance processes. By offering a reliable means to detect and adjust for learning effects, it may significantly improve medical device safety evaluation.
About the journal:
JAMIA is AMIA's premier peer-reviewed journal for biomedical and health informatics. Covering the full spectrum of activities in the field, JAMIA includes informatics articles in the areas of clinical care, clinical research, translational science, implementation science, imaging, education, consumer health, public health, and policy. JAMIA's articles describe innovative informatics research and systems that help to advance biomedical science and to promote health. Case reports, perspectives and reviews also help readers stay connected with the most important informatics developments in implementation, policy and education.