Development and validation of machine learning models for predicting cancer-related fatigue in lymphoma survivors

Journal: International Journal of Medical Informatics
Impact factor: 3.7 · Zone 2 (Medicine) · JCR Q2 (Computer Science, Information Systems)
Published: 2024-09-14 · Journal Article
DOI: 10.1016/j.ijmedinf.2024.105630
URL: https://www.sciencedirect.com/science/article/pii/S1386505624002934
Citations: 0

Abstract

Background

New cases of lymphoma are rising, and symptom burden, including cancer-related fatigue (CRF), severely impairs the quality of life of lymphoma survivors. However, clinical diagnosis and treatment of CRF remain inadequate.

Objective

This study aimed to construct machine learning-based CRF prediction models for lymphoma survivors, to help healthcare professionals accurately identify patients with CRF and better personalize their treatment and care.

Methods

A cross-sectional study in China recruited lymphoma patients from June 2023 to March 2024 and divided them into two datasets for model construction and external validation. Six machine learning algorithms were used: Logistic Regression (LR), Random Forest, Single Hidden Layer Neural Network, Support Vector Machine, eXtreme Gradient Boosting, and Light Gradient Boosting Machine (LightGBM). Models were compared using the area under the receiver operating characteristic curve (AUROC) and calibration curves. Clinical applicability was assessed by decision curve analysis, and SHapley Additive exPlanations (SHAP) was used to explain variable importance.
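The two headline evaluation quantities named above can be illustrated with a minimal sketch. It assumes binary labels (1 = CRF present) and predicted probabilities; the function names `auroc` and `net_benefit` and the toy data are hypothetical and are not taken from the study. AUROC is computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, and net benefit uses the standard decision curve formula NB = TP/n - (FP/n) * pt / (1 - pt) at threshold probability pt.

```python
from typing import Sequence


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUROC as the chance that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as one half.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(y_true: Sequence[int], y_score: Sequence[float],
                threshold: float) -> float:
    """Net benefit at one threshold probability (one point on a decision curve)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # Cases flagged as high risk are those scoring at or above the threshold.
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    return tp / n - fp / n * (threshold / (1.0 - threshold))


# Toy labels and predicted probabilities for illustration only; not study data.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.2, 0.7, 0.4, 0.5, 0.1, 0.8, 0.3]

print(auroc(y_true, y_score))             # 0.9375
print(net_benefit(y_true, y_score, 0.5))  # 0.25
```

A full decision curve would repeat `net_benefit` over a range of thresholds and compare the result against the "treat all" and "treat none" strategies.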

Results

CRF incidence was 40.7% (dataset I) and 44.8% (dataset II). LightGBM showed strong performance in training and internal validation. LR performed best in external validation, with the highest AUROC and the best calibration. Pain, total protein, physical function, and sleep disturbance were important predictors of CRF.
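Calibration, the second criterion on which LR led in external validation, asks whether predicted probabilities match observed event rates. The sketch below is one common way to check this and is not necessarily the procedure the authors used: it groups predictions into equal-width probability bins and compares the mean predicted risk with the observed CRF rate in each bin, alongside the Brier score. The function names and toy data are hypothetical.

```python
from typing import List, Sequence, Tuple


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def calibration_bins(y_true: Sequence[int], y_prob: Sequence[float],
                     n_bins: int = 2) -> List[Tuple[float, float, int]]:
    """Return (mean predicted risk, observed event rate, count) per non-empty bin."""
    bins: List[List[Tuple[int, float]]] = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        # Equal-width bins on [0, 1]; a probability of exactly 1.0 goes in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((y, p))
    rows = []
    for b in bins:
        if b:
            mean_pred = sum(p for _, p in b) / len(b)
            observed = sum(y for y, _ in b) / len(b)
            rows.append((mean_pred, observed, len(b)))
    return rows


# Toy labels and predicted probabilities for illustration only; not study data.
y_true = [0, 0, 1, 0, 1, 1, 1, 0]
y_prob = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]

for mean_pred, observed, count in calibration_bins(y_true, y_prob):
    print(f"predicted {mean_pred:.2f}  observed {observed:.2f}  n={count}")
print(f"Brier score: {brier_score(y_true, y_prob):.3f}")
```

In a well-calibrated model the predicted and observed columns track each other closely across bins; plotting them against each other gives the calibration curve.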

Conclusion

The study presents a machine learning-based CRF prediction model for lymphoma patients, offering dynamic, data-driven assessments that could enhance the development of automated CRF screening tools for personalized management in clinical practice.

Source journal
International Journal of Medical Informatics (Medicine; Computer Science: Information Systems)
CiteScore: 8.90
Self-citation rate: 4.10%
Articles published: 217
Review time: 42 days
About the journal: International Journal of Medical Informatics provides an international medium for dissemination of original results and interpretative reviews concerning the field of medical informatics. The Journal emphasizes the evaluation of systems in healthcare settings. The scope of the journal covers: information systems, including national or international registration systems, hospital information systems, departmental and/or physician's office systems, document handling systems, electronic medical record systems, standardization, systems integration, etc.; computer-aided medical decision support systems using heuristic, algorithmic and/or statistical methods as exemplified in decision theory, protocol development, artificial intelligence, etc.; educational computer-based programs pertaining to medical informatics or medicine in general; and organizational, economic, social, clinical impact, ethical and cost-benefit aspects of IT applications in health care.