Development and validation of an interpretable risk prediction model for the early classification of thalassemia

IF 15.1 | Zone 1, Medicine | Q1, HEALTH CARE SCIENCES & SERVICES
Jin-Xin Lai, Jia-Wei Tang, Shan-Shan Gong, Ming-Xiong Qin, Yu-Lu Zhang, Quan-Fa Liang, Li-Yan Li, Zhen Cai, Liang Wang
Journal: npj Digital Medicine
Publication type: Journal Article
Published: 2025-06-10
DOI: 10.1038/s41746-025-01766-0 (https://doi.org/10.1038/s41746-025-01766-0)

Abstract

Thalassemia is an inherited blood disorder. Current diagnostic methods rely mainly on sophisticated equipment and specially trained technicians. This study aims to identify and genotype thalassemia by applying machine learning (ML) algorithms to routine blood parameters. A derivation cohort of 31,311 individuals was recruited from four independent hospitals, and eight ML models were developed for this purpose. The performance of these models was compared using a set of evaluation metrics. An additional cohort of 2,000 patients was recruited for external validation to assess the generalizability of the models. The categorical boosting (CatBoost) model showed the best discriminative ability in both the training and external validation cohorts. The model was then integrated into an online platform, which has the potential to serve as an auxiliary tool for identifying and genotyping thalassemia through automated analysis of routine blood test parameters.
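
The abstract does not say which evaluation metrics were used. As a rough illustration of how candidate classifiers can be compared on a shared validation set, the sketch below computes a few common binary metrics from predicted labels. The function names, the choice of metrics, and the toy labels are all assumptions made for this example; none of them come from the study.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for binary labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity, specificity, and precision."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)

    def ratio(numerator, denominator):
        # Guard against empty classes; NaN marks an undefined metric.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
    }


# Hypothetical toy data: 1 = carrier, 0 = non-carrier (not from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
model_predictions = {
    "model_a": [1, 1, 0, 0, 0, 1, 0, 1],
    "model_b": [1, 0, 0, 0, 0, 0, 0, 1],
}

# Score every model on the same labels so the metrics are comparable.
results = {name: binary_metrics(y_true, preds)
           for name, preds in model_predictions.items()}

for name, metrics in results.items():
    print(name, metrics)
```

In this toy comparison both models reach the same accuracy while differing in sensitivity and specificity, which is one reason a set of metrics is more informative than a single number.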

Source journal
CiteScore: 25.10
Self-citation rate: 3.30%
Articles published: 170
Review time: 15 weeks
About the journal: npj Digital Medicine is an online open-access journal that focuses on publishing peer-reviewed research in the field of digital medicine. The journal covers various aspects of digital medicine, including the application and implementation of digital and mobile technologies in clinical settings, virtual healthcare, and the use of artificial intelligence and informatics. The primary goal of the journal is to support innovation and the advancement of healthcare through the integration of new digital and mobile technologies. When determining if a manuscript is suitable for publication, the journal considers four important criteria: novelty, clinical relevance, scientific rigor, and digital innovation.