Interpretable Machine Learning Model for Pulmonary Hypertension Risk Prediction: Retrospective Cohort Study.

IF 3.8 | Zone 3, Medicine | Q2, MEDICAL INFORMATICS
Hongxia Jiang, Han Gao, Dexin Wang, Qingli Zeng, Xiaojun Hao, Zhenshun Cheng
Journal: JMIR Medical Informatics, vol. 13, e74117 (Journal Article)
Published: 2025-09-24
DOI: 10.2196/74117 (https://doi.org/10.2196/74117)
Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12459742/pdf/
Citations: 0

Abstract


Background: Pulmonary hypertension (PH) is a progressive disorder characterized by elevated pulmonary artery pressure and increased pulmonary vascular resistance, ultimately leading to right heart failure. Early detection is critical for improving patient outcomes.

Objective: The diagnosis of PH primarily relies on right heart catheterization, but its invasive nature significantly limits its clinical use. Echocardiography, as the most common noninvasive screening and diagnostic tool for PH, provides valuable patient information. This study aims to identify key PH predictors from echocardiographic parameters, laboratory tests, and demographic data using machine learning, ultimately constructing a predictive model to support early noninvasive diagnosis of PH.

Methods: This study compiled datasets comprising echocardiography measurements, clinical laboratory data, and basic demographic information from patients with PH and matched controls. The final analytical cohort consisted of 895 participants with 85 evaluated variables. Recursive feature elimination was used to select the most relevant echocardiographic variables, which were then integrated into a composite ultrasound index using XGBoost (Extreme Gradient Boosting). LASSO (least absolute shrinkage and selection operator) regression was applied to select potential predictive variables from the laboratory tests. The ultrasound index and the selected laboratory tests were then combined in a logistic regression model for the predictive diagnosis of PH. The model's performance was evaluated using receiver operating characteristic curves, calibration plots, and decision curve analysis to assess its accuracy and clinical relevance. Both internal and external validation were used to assess the performance of the constructed model.
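
The LASSO selection step named in the Methods can be illustrated with a minimal sketch. The abstract does not state the software or settings the authors used, so everything below is an assumption made for illustration: the data are synthetic, the names `lab_a` to `lab_d` are hypothetical, the penalty of 0.2 is arbitrary, and a squared-error loss with no intercept on roughly standardized features stands in for whatever formulation the study applied. The point is only to show how the L1 penalty drives the coefficients of uninformative variables to exactly zero.

```python
import random

def soft_threshold(value, penalty):
    """Shrink value toward zero; anything within [-penalty, penalty] becomes exactly 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0

def lasso_coordinate_descent(X, y, penalty, n_iter=100):
    """Minimise (1/(2n)) * sum((y - Xw)^2) + penalty * sum(|w|) by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # covariance of feature j with the residual that ignores feature j
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                partial = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                norm += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, penalty) / (norm / n)
    return w

# Synthetic data (assumption): only the first two of four hypothetical
# laboratory variables actually drive the outcome.
LAB_NAMES = ["lab_a", "lab_b", "lab_c", "lab_d"]
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in LAB_NAMES] for _ in range(200)]
y = [2.0 * row[0] - 1.5 * row[1] + rng.gauss(0, 0.1) for row in X]

weights = lasso_coordinate_descent(X, y, penalty=0.2)
selected = [name for name, w in zip(LAB_NAMES, weights) if w != 0.0]
print(selected)
```

In this toy setup the variables with nonzero coefficients are the ones that would be carried forward into the final model. With a binary outcome such as PH versus control, an L1-penalized logistic loss would be a natural alternative to the squared-error loss used here.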

Results: A total of 16 echocardiographic parameters (right atrium diameter, pulmonary artery diameter, left atrium diameter, tricuspid valve reflux degree, right ventricular diameter, E/E' [ratio of mitral valve early diastolic inflow velocity (E) to mitral annulus early diastolic velocity (E')], interventricular septal thickness, left ventricular diameter, ascending aortic diameter, left ventricular ejection fraction, left ventricular outflow tract velocity, mitral valve reflux degree, pulmonary valve outflow velocity, mitral valve inflow velocity, aortic valve reflux degree, and left ventricular posterior wall thickness) combined with 2 laboratory biomarkers (prothrombin time activity and cystatin C) were identified as optimal predictors, forming a high-performance PH prediction model. The diagnostic model demonstrated high predictive accuracy, with an area under the receiver operating characteristic curve of 0.997 in the internal validation and 0.974 in the external validation. Both calibration plots and decision curve analysis validated the model's predictive accuracy and clinical applicability, with optimal performance observed at higher risk stratification cutoffs.
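
For readers unfamiliar with the two evaluation quantities reported above, the following is a minimal sketch of how an area under the ROC curve and a decision-curve net benefit can be computed. The labels and scores are toy values invented for illustration, not data from the study, and the function names are hypothetical. Net benefit here follows the standard definition, true positives per patient minus false positives per patient weighted by the odds of the threshold probability.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative (ties count half)."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

def net_benefit(labels, scores, threshold):
    """Net benefit at a risk threshold: TP/n - FP/n * threshold / (1 - threshold)."""
    n = len(labels)
    tp = sum(1 for l, s in zip(labels, scores) if s >= threshold and l == 1)
    fp = sum(1 for l, s in zip(labels, scores) if s >= threshold and l == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)

# Toy example (assumption): 4 cases, 4 controls, with predicted risks.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

auc = roc_auc(labels, scores)
nb_model = net_benefit(labels, scores, threshold=0.5)
# "Treat all" strategy: everyone is classified as positive.
nb_treat_all = net_benefit(labels, [1.0] * len(labels), threshold=0.5)

print(auc)           # 0.9375
print(nb_model)      # 0.25
print(nb_treat_all)  # 0.0
```

A decision curve repeats the net-benefit calculation across a range of thresholds and compares the model against the treat-all and treat-none (net benefit 0) strategies.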

Conclusions: This model enhances early PH diagnosis through a noninvasive approach and demonstrates strong predictive accuracy. It facilitates early intervention and personalized treatment, with potential applications in broader cardiovascular disease management.

Source journal: JMIR Medical Informatics (Medicine: Health Informatics)
CiteScore: 7.90
Self-citation rate: 3.10%
Articles published: 173
Review time: 12 weeks
Journal description: JMIR Medical Informatics (JMI, ISSN 2291-9694) is a top-rated, tier A journal which focuses on clinical informatics, big data in health and health care, decision support for health professionals, electronic health records, ehealth infrastructures and implementation. It has a focus on applied, translational research, with a broad readership including clinicians, CIOs, engineers, industry and health informatics professionals. Published by JMIR Publications, publisher of the Journal of Medical Internet Research (JMIR), the leading eHealth/mHealth journal (Impact Factor 2016: 5.175), JMIR Med Inform has a slightly different scope (placing more emphasis on applications for clinicians and health professionals than on consumers/citizens, which is the focus of JMIR), publishes even faster, and also allows papers which are more technical or more formative than what would be published in the Journal of Medical Internet Research.