Medical-informed machine learning: integrating prior knowledge into medical decision systems
Christel Sirocchi, Alessandro Bogliolo, Sara Montagna
DOI: 10.1186/s12911-024-02582-4 (https://doi.org/10.1186/s12911-024-02582-4)
Published: 2024-06-28
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11212227/pdf/
Abstract
Background: Clinical medicine offers a promising arena for applying Machine Learning (ML) models. However, despite numerous studies employing ML in medical data analysis, only a fraction have impacted clinical care. This article underscores the importance of utilising ML in medical data analysis, recognising that ML alone may not adequately capture the full complexity of clinical data, thereby advocating for the integration of medical domain knowledge in ML.
Methods: The study conducts a comprehensive review of prior efforts in integrating medical knowledge into ML and maps these integration strategies onto the phases of the ML pipeline, encompassing data pre-processing, feature engineering, model training, and output evaluation. The study further explores the significance and impact of such integration through a case study on diabetes prediction. Here, clinical knowledge, encompassing rules, causal networks, intervals, and formulas, is integrated at each stage of the ML pipeline, resulting in a spectrum of integrated models.
Results: The findings highlight the benefits of integration in terms of accuracy, interpretability, data efficiency, and adherence to clinical guidelines. In several cases, integrated models outperformed purely data-driven approaches, underscoring the potential for domain knowledge to enhance ML models through improved generalisation. In other cases, the integration was instrumental in enhancing model interpretability and ensuring conformity with established clinical guidelines. Notably, knowledge integration also proved effective in maintaining performance under limited data scenarios.
Conclusions: By illustrating various integration strategies through a clinical case study, this work provides guidance to inspire and facilitate future integration efforts. Furthermore, the study identifies the need to refine domain knowledge representation and fine-tune its contribution to the ML model as the two main challenges to integration and aims to stimulate further research in this direction.
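As a rough illustration of what integrating knowledge at different pipeline stages can look like, the sketch below encodes a formula (body mass index), an interval (fasting glucose bands), and a rule (a guideline check on the model output). It is not the authors' implementation: the `Patient` fields, the function names, and the 100 and 126 mg/dL cut-offs are assumptions chosen for illustration, since the abstract does not list the features or thresholds used in the case study.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Patient:
    # Hypothetical record layout; the abstract does not list the study's features.
    weight_kg: float
    height_m: float
    fasting_glucose_mg_dl: float


def bmi(weight_kg: float, height_m: float) -> float:
    """Formula knowledge: body mass index = weight / height squared."""
    return weight_kg / height_m ** 2


def glucose_band(fasting_glucose_mg_dl: float) -> int:
    """Interval knowledge: 0 = normal, 1 = impaired, 2 = diabetic range.

    The 100 and 126 mg/dL cut-offs are commonly cited fasting plasma
    glucose thresholds, assumed here rather than taken from the paper.
    """
    if fasting_glucose_mg_dl < 100:
        return 0
    if fasting_glucose_mg_dl < 126:
        return 1
    return 2


def engineer_features(p: Patient) -> List[float]:
    """Feature engineering stage: add knowledge-derived features to the raw value."""
    return [
        p.fasting_glucose_mg_dl,
        bmi(p.weight_kg, p.height_m),
        float(glucose_band(p.fasting_glucose_mg_dl)),
    ]


def violates_guideline(p: Patient, predicted_positive: bool) -> bool:
    """Output evaluation stage: flag a negative prediction that conflicts
    with the (assumed) rule that a diabetic-range glucose implies a positive."""
    return glucose_band(p.fasting_glucose_mg_dl) == 2 and not predicted_positive


if __name__ == "__main__":
    patient = Patient(weight_kg=70.0, height_m=1.75, fasting_glucose_mg_dl=130.0)
    print(engineer_features(patient))
    print(violates_guideline(patient, predicted_positive=False))
```

The engineered feature vector would feed whatever classifier is trained downstream, while the guideline check runs on its predictions. Keeping the two apart mirrors the abstract's distinction between integration aimed at accuracy and integration aimed at adherence to clinical guidelines.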