Towards reliable prediction of intraoperative hypotension: a cross-center evaluation of deep learning-based and MAP-derived methods

IF 2.2 | Zone 3 (Medicine) | JCR Q2 (Anesthesiology)

Journal of Clinical Monitoring and Computing | Pub Date: 2025-09-12 | DOI: 10.1007/s10877-025-01357-0

Nada Chaari, Greg Winski, Magnus Hallbäck, Niclas Lundström, Håkan Björne, Martin Jacobsson


Citations: 0

Abstract


Towards reliable prediction of intraoperative hypotension: a cross-center evaluation of deep learning-based and MAP-derived methods.

Intraoperative hypotension (IOH) is associated with an increased risk of heart and kidney complications. Although AI tools aim to predict IOH, their real-world reliability is often overstated because of biased data selection. This study introduces a framework to enhance reliability by: (1) including borderline blood pressure cases (65-75 mmHg mean arterial pressure (MAP), the "Gray Zone"), (2) comparing an AI model with a simple blood pressure threshold, and (3) validating across diverse surgical cohorts, centers, and demographics. Using datasets from Karolinska University Hospital (Sweden) and VitalDB (Korea), we found that the AI model performs better than the MAP threshold method in more ambiguous cases. In contrast, when hypotensive and non-hypotensive cases had clearly separated MAP values, both methods performed similarly well. Cross-validation revealed asymmetric generalizability: models trained on datasets containing more borderline (Gray Zone) cases generalized better to datasets with clearer class separation, whereas models trained in the reverse direction struggled. To ensure fair model comparison and reduce dataset-specific bias, we standardized the MAP difference between positive (hypotension) and negative (non-hypotension) samples at the time of prediction. This virtually eliminated the class separation and demonstrated that inflated performance in some datasets can be attributed to selection bias rather than true model generalizability. Age also influenced generalization: cross-age validation revealed that models trained on older patients generalized better to younger cohorts, whereas differences in ASA classification had minimal effect. These findings highlight the need for realistic validation to bridge the gap between AI research and clinical practice.
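The selection-bias effect described in the abstract can be illustrated with a small synthetic sketch. It is not the authors' pipeline: the MAP distributions, sample sizes, the helper names (`auroc`, `map_threshold_auroc`, `match_on_map`), and the bin-matching step are all assumptions made for illustration, and the abstract does not say how the MAP difference was actually standardized. The sketch scores samples with MAP alone, shows how the AUROC of that baseline changes once Gray Zone negatives (65-75 mmHg) are included, and then shows that matching positives and negatives on MAP removes the baseline's apparent skill.

```python
import random
from collections import defaultdict


def auroc(pos_scores, neg_scores):
    """Chance that a random positive scores above a random negative (ties = 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(pos_scores) * len(neg_scores))


def map_threshold_auroc(map_pos, map_neg):
    """AUROC of a MAP-only baseline: a lower MAP means a higher predicted risk."""
    return auroc([-m for m in map_pos], [-m for m in map_neg])


def match_on_map(map_pos, map_neg, bin_width=2.0):
    """Pair each positive with an unused negative from the same MAP bin.

    Hypothetical stand-in for 'standardizing the MAP difference'; unmatched
    positives are dropped.
    """
    pool = defaultdict(list)
    for m in map_neg:
        pool[int(m // bin_width)].append(m)
    kept_pos, kept_neg = [], []
    for m in map_pos:
        bucket = pool[int(m // bin_width)]
        if bucket:
            kept_pos.append(m)
            kept_neg.append(bucket.pop())
    return kept_pos, kept_neg


rng = random.Random(0)
# Synthetic MAP values (mmHg) at prediction time; NOT data from the study.
map_pos = [rng.gauss(72, 6) for _ in range(500)]      # followed by hypotension
neg_clear = [rng.gauss(95, 6) for _ in range(500)]    # negatives far above 75 mmHg
neg_gray = [rng.uniform(65, 75) for _ in range(500)]  # negatives inside the Gray Zone

auc_clear = map_threshold_auroc(map_pos, neg_clear)
auc_mixed = map_threshold_auroc(map_pos, neg_clear + neg_gray)
kept_pos, kept_neg = match_on_map(map_pos, neg_clear + neg_gray)
auc_matched = map_threshold_auroc(kept_pos, kept_neg)

print(f"Gray Zone negatives excluded: AUROC = {auc_clear:.3f}")
print(f"Gray Zone negatives included: AUROC = {auc_mixed:.3f}")
print(f"Positives and negatives matched on MAP: AUROC = {auc_matched:.3f}")
```

Under these assumptions the MAP-only baseline looks nearly perfect when borderline negatives are excluded, loses much of that performance when they are included, and falls to roughly chance level after matching. Any skill a model retains on the matched set must come from information beyond the current MAP value.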


Source journal

Journal of Clinical Monitoring and Computing (Anesthesiology)

CiteScore

4.30

Self-citation rate

13.60%

Articles published

144

Review time

6-12 weeks

About the journal: The Journal of Clinical Monitoring and Computing is a clinical journal publishing papers related to technology in the fields of anaesthesia, intensive care medicine, emergency medicine, and peri-operative medicine. The journal has links with numerous specialist societies, including editorial board representatives from the European Society for Computing and Technology in Anaesthesia and Intensive Care (ESCTAIC), the Society for Technology in Anesthesia (STA), the Society for Complex Acute Illness (SCAI) and the NAVAt (NAVigating towards your Anaesthesia Targets) group. The journal publishes original papers, narrative and systematic reviews, technological notes, letters to the editor, editorial or commentary papers, and policy statements or guidelines from national or international societies. The journal encourages debate on published papers and technology, including letters commenting on previous publications or technological concerns. The journal occasionally publishes special issues with technological or clinical themes, or reports and abstracts from scientific meetings. Special issue proposals should be sent to the Editor-in-Chief. Specific details of the types of papers, and of the clinical and technological content considered within scope, can be found in the instructions for authors.