{"title":"Risk-based evaluation of machine learning-based classification methods used for medical devices.","authors":"Martin Haimerl, Christoph Reich","doi":"10.1186/s12911-025-02909-9","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>In the future, more medical devices will be based on machine learning (ML) methods. In general, the consideration of risks is a crucial aspect for evaluating medical devices. Accordingly, risks and their associated costs should be taken into account when assessing the performance of ML-based medical devices. This paper addresses the following three research questions towards a risk-based evaluation with a focus on ML-based classification models.</p><p><strong>Methods: </strong>First, we analyzed how often risk-based metrics are currently utilized in the context of ML-based classification models. This was performed using a literature research based on a sample of recent scientific publications. Second, we introduce an approach for evaluating such models where expected risks and associated costs are integrated into the corresponding performance metrics. Additionally, we analyze the impact of different risk ratios on the resulting overall performance. Third, we elaborate how such risk-based approaches relate to regulatory requirements in the field of medical devices. A set of use case scenarios were utilized to demonstrate necessities and practical implications, in this regard.</p><p><strong>Results: </strong>First, it was shown that currently most scientific publications do not include risk-based approaches for measuring performance. Second, it was demonstrated that risk-based considerations have a substantial impact on the outcome. The relative increase of the resulting overall risks can go up to 196% when the ratio between different types of risks (false negatives vs. false positives) changes by a factor of 10.0. Third, we elaborated that risk-based considerations need to be included into the assessment of ML-based medical devices, according to the relevant EU regulations and standards. In particular, this applies when a substantial impact on the clinical outcome / in terms of the risk-benefit relationship occurs.</p><p><strong>Conclusion: </strong>In summary, we demonstrated the necessity of a risk-based approach for the evaluation of medical devices which include ML-based classification methods. We showed that currently many scientific papers in this area do not include risk considerations. We developed basic steps towards a risk-based assessment of ML-based classifiers and elaborated consequences that could occur, when these steps are neglected. And, we demonstrated the consistency of our approach with current regulatory requirements in the EU.</p>","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":"25 1","pages":"126"},"PeriodicalIF":3.3000,"publicationDate":"2025-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11895222/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"BMC Medical Informatics and Decision Making","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1186/s12911-025-02909-9","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"MEDICAL INFORMATICS","Score":null,"Total":0}
引用次数: 0
Abstract
Background: In the future, more medical devices will be based on machine learning (ML) methods. In general, the consideration of risks is a crucial aspect for evaluating medical devices. Accordingly, risks and their associated costs should be taken into account when assessing the performance of ML-based medical devices. This paper addresses the following three research questions towards a risk-based evaluation with a focus on ML-based classification models.
Methods: First, we analyzed how often risk-based metrics are currently utilized in the context of ML-based classification models. This was performed using a literature research based on a sample of recent scientific publications. Second, we introduce an approach for evaluating such models where expected risks and associated costs are integrated into the corresponding performance metrics. Additionally, we analyze the impact of different risk ratios on the resulting overall performance. Third, we elaborate how such risk-based approaches relate to regulatory requirements in the field of medical devices. A set of use case scenarios were utilized to demonstrate necessities and practical implications, in this regard.
Results: First, it was shown that currently most scientific publications do not include risk-based approaches for measuring performance. Second, it was demonstrated that risk-based considerations have a substantial impact on the outcome. The relative increase of the resulting overall risks can go up to 196% when the ratio between different types of risks (false negatives vs. false positives) changes by a factor of 10.0. Third, we elaborated that risk-based considerations need to be included into the assessment of ML-based medical devices, according to the relevant EU regulations and standards. In particular, this applies when a substantial impact on the clinical outcome / in terms of the risk-benefit relationship occurs.
Conclusion: In summary, we demonstrated the necessity of a risk-based approach for the evaluation of medical devices which include ML-based classification methods. We showed that currently many scientific papers in this area do not include risk considerations. We developed basic steps towards a risk-based assessment of ML-based classifiers and elaborated consequences that could occur, when these steps are neglected. And, we demonstrated the consistency of our approach with current regulatory requirements in the EU.
期刊介绍:
BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles in relation to the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.