An explainable deep learning model to predict partial anomalous pulmonary venous connection for patients with atrial septal defect.

IF 2 | Zone 3 (Medicine) | JCR Q2 (PEDIATRICS)
Gang Luo, Zhixin Li, Zhixian Ji, Sibao Wang, Silin Pan
Journal: BMC Pediatrics
Publication date: 2024-11-08
Publication type: Journal Article
DOI: 10.1186/s12887-024-05193-0 (https://doi.org/10.1186/s12887-024-05193-0)
Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11546076/pdf/
Citations: 0

Abstract

Background: Patients with partial anomalous pulmonary venous connection (PAPVC) are usually asymptomatic and present with complex anatomical variants, so PAPVC is frequently missed in patients with atrial septal defect (ASD). This study aimed to identify variables that predict PAPVC in patients with ASD and to construct an explainable prediction model based on deep learning.

Methods: This retrospective study included 834 inpatients with ASD treated at Women and Children's Hospital, Qingdao University, from January 2018 to January 2023. Patients were separated into two groups according to the presence of PAPVC. Propensity score matching and the synthetic minority over-sampling technique (SMOTE) were used to balance baseline data between the groups. Variables that differed between the two groups were identified by univariate logistic regression. Patients were randomly divided into a training set and a validation set in an 8:2 ratio. Support vector machine (SVM), random forest, decision tree, XGBoost, and LightGBM models were built from the differential variables, and their classification performance was compared. Split importance, gain importance, and SHAP values were used to measure the importance of the differential variables and to improve the interpretability of the model. Moreover, a portion of the patients was included in the validation set to test the performance of the selected models.
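
The balancing and splitting steps named in the Methods can be illustrated with a minimal, standard-library-only sketch. It is not the authors' code: the function names `smote_oversample` and `split_8_2`, the neighbour count `k`, and the seeds are assumptions made for illustration, and the study presumably relied on established library implementations. The first function shows the core idea of SMOTE (a synthetic minority sample is placed on the segment between a minority sample and one of its nearest minority neighbours); the second performs a random 8:2 split.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic samples interpolated between minority points.

    minority: list of equal-length numeric tuples (hypothetical feature rows).
    k: number of nearest minority neighbours considered (assumed value).
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


def split_8_2(rows, seed=0):
    """Shuffle rows and return (training set, validation set) in an 8:2 ratio."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(round(0.8 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]
```

For example, `split_8_2(list(range(410)))` returns lists of 328 and 82 items. Every point returned by `smote_oversample` lies between two existing minority points, so the oversampled class stays inside the range of the observed data.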

Results: Three hundred twenty-eight patients with ASD and 82 patients with PAPVC were included in the training set and the validation set, respectively. Ten differential variables were selected by univariate logistic regression: right atrial diameter (longitudinal axis and transverse axis), right ventricular diameter, left atrial diameter, left ventricular end-diastolic diameter, left ventricular end-systolic diameter, P-wave voltage, P-wave interval, PR interval, and QRS-wave voltage. Among the classification models built on the differential variables, the LightGBM model achieved the highest performance on the validation set (AUC = 0.93). Based on the variable importance analysis, the LightGBM-Clinic model was retrained on P-wave voltage, P-wave interval, PR interval, QRS wave interval, and right ventricular diameter, and performed well (AUC = 0.90). The AUC of the LightGBM-Clinic model was 0.87 in the test set.
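
The AUC values reported above summarise how well a model's scores rank patients with PAPVC above patients without it. As a hedged illustration (the function name `roc_auc` and the toy data are assumptions, not taken from the study), AUC can be computed as the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half:

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    labels: iterable of 0/1 class labels (1 = positive class).
    scores: iterable of model scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 positive/negative pairs are ranked correctly.
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # prints 0.75
```

Under this reading, an AUC of 0.93 means that in about 93% of such pairs the patient with PAPVC receives the higher score. The pairwise loop is quadratic in the number of samples and is meant only to make the definition concrete.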

Conclusion: In this study, the LightGBM model performed well in determining whether ASD is accompanied by PAPVC. ECG parameters such as P-wave voltage were important predictors and enhanced the explainability of the model.

Source journal: BMC Pediatrics
CiteScore: 3.70
Self-citation rate: 4.20%
Articles published: 683
Review time: 3-8 weeks
Journal description: BMC Pediatrics is an open access journal publishing peer-reviewed research articles in all aspects of health care in neonates, children and adolescents, as well as related molecular genetics, pathophysiology, and epidemiology.