Machine learning-based plasma-derived extracellular vesicle signatures for digestive system cancers prediction

IF 4.8 | Zone 2 (Medicine) | JCR Q1: COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
Xiaowei Qin, Zhibin Bi, Wenbin Li, Huipeng Zhang, Ming Han, Kongxi Zhang, Jian Wu, Lei Huang
Journal: Computer Methods and Programs in Biomedicine, volume 272, Article 109064
Published: 2025-09-03
DOI: 10.1016/j.cmpb.2025.109064
Source: https://www.sciencedirect.com/science/article/pii/S016926072500481X
Citations: 0

Abstract

Machine learning-based plasma-derived extracellular vesicle signatures for digestive system cancers prediction

Background

Digestive system cancers (DSCs) are a heterogeneous group of malignancies characterized by poor prognosis and a lack of accurate early diagnostic methods. Traditional serological biomarkers and non-coding RNAs remain commonly used diagnostic markers for these cancers, but their sensitivity and specificity are often limited. RNA in plasma-derived extracellular vesicles (PDEV) has emerged as a promising diagnostic tool for a variety of cancers, but its application to the detection of DSCs has not yet been fully explored.

Methods

PDEV sequencing data from the exoRBase 2.0 database were integrated, giving a total of 444 participants: 326 patients with DSCs and 118 healthy individuals. The dataset was divided into a training set and a test set. The PDEV-diagnostic model was constructed using various machine learning algorithms and underwent 5-fold cross-validation on the training set. The model's performance metrics were then evaluated on the test set. Additionally, the selected features were assessed using bulk RNA-seq and single-cell RNA-seq datasets for different DSCs.
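The split-then-cross-validate procedure described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' pipeline: the abstract does not give the train/test ratio, so the 30% test fraction, the seeds, and the helper names `stratified_split` and `stratified_folds` are hypothetical. Only the cohort sizes (326 patients, 118 controls) and the 5 folds come from the abstract.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction, seed):
    """Return (train_idx, test_idx), keeping class proportions in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label][:]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def stratified_folds(indices, labels, k, seed):
    """Yield (fit_idx, val_idx) pairs for stratified k-fold cross-validation."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx in indices:
        by_class[labels[idx]].append(idx)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        members = by_class[label][:]
        rng.shuffle(members)
        # Deal each class round-robin so every fold gets a similar class mix.
        for position, idx in enumerate(members):
            folds[position % k].append(idx)
    for i in range(k):
        val_idx = sorted(folds[i])
        fit_idx = sorted(idx for j in range(k) if j != i for idx in folds[j])
        yield fit_idx, val_idx


# Cohort sizes from the abstract: 326 DSC patients (1) and 118 healthy controls (0).
labels = [1] * 326 + [0] * 118
# ASSUMPTION: 30% of samples held out for testing; the abstract gives no ratio.
train_idx, test_idx = stratified_split(labels, test_fraction=0.3, seed=0)
folds = list(stratified_folds(train_idx, labels, k=5, seed=0))
print(len(train_idx), len(test_idx), [len(val) for _, val in folds])
```

Stratifying matters here because the classes are imbalanced (roughly 73% patients); an unstratified split could leave a validation fold with very few healthy controls and make per-fold metrics unstable.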

Results

Based on various feature selection methods and a comparison of 10 machine learning algorithms across seven metrics, the XGBoost model was selected as the PDEV-diagnostic model, with AUCs of 0.83 and 0.94 in the training and test sets, respectively. The model uses nine exosome predictors for DSC prediction: BANK1, MALAT1, FGA, UBR4, ILR-7, FGB, PLPP5, PCAT19, and CIITA.
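For readers unfamiliar with the AUC figures reported above, the metric can be computed directly from its rank interpretation: the probability that a randomly chosen patient receives a higher model score than a randomly chosen healthy control. The sketch below is a minimal stdlib illustration; the function name `roc_auc` and the toy labels and scores are hypothetical and are not data from the study.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a correct pair
    return wins / (len(positives) * len(negatives))


# Hypothetical toy example: 1 = patient, 0 = healthy control.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.7, 0.8, 0.6, 0.3, 0.1]
auc = roc_auc(y_true, scores)
print(round(auc, 3))
```

The pairwise loop is quadratic in the number of samples, which is fine for a cohort of a few hundred; library implementations typically sort the scores instead.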

Conclusions

The machine learning-based PDEV diagnostic model identifies patients with DSCs with high accuracy (test-set AUC of 0.94). These nine exosomal mRNAs/lncRNAs therefore show promise as non-invasive biomarkers for DSC diagnosis.
Source journal
Computer Methods and Programs in Biomedicine (Engineering and Technology - Engineering: Biomedical)
CiteScore: 12.30
Self-citation rate: 6.60%
Articles published: 601
Review time: 135 days
About the journal: To encourage the development of formal computing methods, and their application in biomedical research and medical practice, by illustration of fundamental principles in biomedical informatics research; to stimulate basic research into application software design; to report the state of research of biomedical information processing projects; to report new computer methodologies applied in biomedical areas; the eventual distribution of demonstrable software to avoid duplication of effort; to provide a forum for discussion and improvement of existing software; to optimize contact between national organizations and regional user groups by promoting an international exchange of information on formal methods, standards and software in biomedicine. Computer Methods and Programs in Biomedicine covers computing methodology and software systems derived from computing science for implementation in all aspects of biomedical research and medical practice. It is designed to serve: biochemists; biologists; geneticists; immunologists; neuroscientists; pharmacologists; toxicologists; clinicians; epidemiologists; psychiatrists; psychologists; cardiologists; chemists; (radio)physicists; computer scientists; programmers and systems analysts; biomedical, clinical, electrical and other engineers; teachers of medical informatics and users of educational software.