Javier Muñoz MD, PhD, Rocío Ruíz-Cacho MD, Nerio José Fernández-Araujo MD, Alberto Candela MD, Lourdes Carmen Visedo MD, Javier Muñoz-Visedo Math, BsC
Heart & Lung, Volume 75, Pages 144-163. Published 2025-10-01. DOI: 10.1016/j.hrtlng.2025.09.017
https://www.sciencedirect.com/science/article/pii/S0147956325002067
Systematic review and meta-analysis of artificial intelligence models for diagnosing and subphenotyping ARDS in adults
Background
Artificial intelligence (AI) has emerged as a promising tool to improve the diagnosis and characterization of ARDS, including the identification of subphenotypes.
Objectives
To evaluate the diagnostic performance and methodological quality of AI models for identifying ARDS and its subphenotypes in adults.
Methods
We conducted a systematic review and meta-analysis of 63 studies (n = 135,762) published between 2013 and 2024 in PubMed, Embase, and the Cochrane Library. Extracted outcomes included sensitivity, specificity, AUROC, and validation methods. Risk of bias was assessed with PROBAST, and AI-specific metrics (overfitting, generalization, interpretability, discrimination, calibration) were reported.
Results
Pooled sensitivity was 0.89 (95 % CI 0.84–0.93), specificity 0.88 (95 % CI 0.83–0.92), and AUROC 0.90 (95 % CI 0.86–0.94), with high heterogeneity (I² > 85 %). Twenty-two studies (31 %) were rated high quality, with sensitivity 0.86 (95 % CI 0.82–0.89) and specificity 0.82 (95 % CI 0.78–0.85). Deep learning models (n = 14) achieved sensitivity 0.91, while machine learning models (n = 19) showed 0.87. Imaging-based models (n = 15) outperformed non-imaging approaches. COVID-19 studies (n = 9) reported sensitivity 0.90 with comparable AUROC and specificity. Only seven studies (18 %) investigated subphenotyping, identifying hyperinflammatory and hypoinflammatory profiles with potential therapeutic relevance. Calibration reporting was missing in 47 % and external validation in most (29/63).
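For readers unfamiliar with how a pooled sensitivity and an I² value of this kind can be obtained, the following is a minimal sketch, not the authors' method. The abstract does not state which pooling model was used, and diagnostic accuracy meta-analyses often fit a bivariate model for sensitivity and specificity jointly. The sketch instead assumes a simpler univariate DerSimonian-Laird random-effects model on the logit scale. The function name `pool_logit_proportions` and the study counts are hypothetical and chosen only for illustration.

```python
import math


def pool_logit_proportions(events, totals):
    """Pool proportions (e.g. sensitivities) across studies.

    Assumption: univariate DerSimonian-Laird random-effects model on the
    logit scale. Returns (pooled, ci_low, ci_high, i2_percent).
    """
    # Per-study logit and approximate variance, with a 0.5 continuity correction
    y, v = [], []
    for e, n in zip(events, totals):
        a, b = e + 0.5, n - e + 0.5
        y.append(math.log(a / b))
        v.append(1.0 / a + 1.0 / b)

    # Fixed-effect (inverse-variance) weights and Cochran's Q
    w = [1.0 / vi for vi in v]
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1

    # I^2: share of total variation attributed to between-study heterogeneity
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird estimate of the between-study variance tau^2
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights, pooled logit, and 95 % confidence interval
    w_re = [1.0 / (vi + tau2) for vi in v]
    sw_re = sum(w_re)
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sw_re
    se = math.sqrt(1.0 / sw_re)

    def expit(x):
        return 1.0 / (1.0 + math.exp(-x))

    return expit(mu), expit(mu - 1.96 * se), expit(mu + 1.96 * se), i2


# Hypothetical counts: true positives and number of ARDS cases in five studies
true_positives = [45, 88, 30, 140, 61]
ards_cases = [50, 100, 40, 150, 80]

pooled, low, high, i2 = pool_logit_proportions(true_positives, ards_cases)
print(f"pooled sensitivity {pooled:.2f} (95% CI {low:.2f}-{high:.2f}), I2 = {i2:.0f}%")
```

Specificity can be pooled the same way by passing true negatives and the number of non-ARDS patients. An I² above 85 %, as reported here, would indicate that most of the observed variation between studies reflects real differences rather than sampling error.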
Conclusion
AI models for ARDS demonstrate promising diagnostic accuracy but are limited by poor calibration and scarce external validation. Subphenotyping remains exploratory but suggests opportunities for real-time patient stratification. Prospective validation and standardized reporting are essential for clinical adoption.
About the journal:
Heart & Lung: The Journal of Cardiopulmonary and Acute Care, the official publication of The American Association of Heart Failure Nurses, presents original, peer-reviewed articles on techniques, advances, investigations, and observations related to the care of patients with acute and critical illness and patients with chronic cardiac or pulmonary disorders.
The Journal's acute care articles focus on the care of hospitalized patients, including those in the critical and acute care settings. Because most patients who are hospitalized in acute and critical care settings have chronic conditions, we are also interested in the chronically critically ill, the care of patients with chronic cardiopulmonary disorders, their rehabilitation, and disease prevention. The Journal's heart failure articles focus on all aspects of the care of patients with this condition. Manuscripts that are relevant to populations across the human lifespan are welcome.