GBMPurity: A machine learning tool for estimating glioblastoma tumor purity from bulk RNA-sequencing data.

IF 13.4 · Zone 1 (Medicine) · Q1 Clinical Neurology
Morgan P H Thomas, Shoaib Ajaib, Georgette Tanner, Andrew J Bulpitt, Lucy F Stead
Journal: Neuro-oncology, pages 1458-1473
Publication date: 2025-07-30
Publication type: Journal Article
DOI: https://doi.org/10.1093/neuonc/noaf026
PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12309721/pdf/
Citations: 0

Abstract

GBMPurity: A machine learning tool for estimating glioblastoma tumor purity from bulk RNA-sequencing data.

Background: Glioblastoma (GBM) presents a significant clinical challenge due to its aggressive nature and extensive heterogeneity. Tumor purity, the proportion of malignant cells within a tumor, is an important covariate for understanding the disease, having direct clinical relevance or obscuring signal of the malignant portion in molecular analyses of bulk samples. However, current methods for estimating tumor purity are nonspecific and technically demanding. Therefore, we aimed to build a reliable and accessible purity estimator for GBM.

Methods: We developed GBMPurity, a deep learning model specifically designed to estimate the purity of IDH-wild type primary GBM from bulk RNA-sequencing (RNA-seq) data. The model was trained using simulated pseudobulk tumors of known purity from labeled single-cell data acquired from the GBmap resource. The performance of GBMPurity was evaluated and compared to several existing tools using independent datasets.
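
The pseudobulk idea in the Methods can be illustrated with a minimal sketch. It assumes purity is the fraction of malignant cells in the mixture and that a pseudobulk profile is the plain sum of sampled single-cell counts. The function name `simulate_pseudobulk`, its parameters, and the toy data are hypothetical; the abstract does not describe the authors' actual sampling scheme, cell numbers, or normalization.

```python
import numpy as np


def simulate_pseudobulk(counts, is_malignant, n_cells, purity, rng):
    """Build one pseudobulk sample with a known malignant-cell fraction.

    counts:       (cells, genes) array of single-cell expression counts
    is_malignant: boolean array, True where a cell is labeled malignant
    n_cells:      number of cells to mix into the pseudobulk sample
    purity:       target fraction of malignant cells (0 to 1)
    rng:          numpy.random.Generator used for sampling
    """
    n_malignant = int(round(purity * n_cells))
    malignant_idx = np.flatnonzero(is_malignant)
    normal_idx = np.flatnonzero(~is_malignant)

    # Draw cells with replacement from each compartment (an assumption).
    chosen = np.concatenate([
        rng.choice(malignant_idx, size=n_malignant, replace=True),
        rng.choice(normal_idx, size=n_cells - n_malignant, replace=True),
    ])

    # Summing the sampled profiles mimics a bulk measurement.
    pseudobulk = counts[chosen].sum(axis=0)
    # Report the purity actually realized after rounding to whole cells.
    return pseudobulk, n_malignant / n_cells


# Toy data: 6 malignant cells expressing only gene 0,
# 4 non-malignant cells expressing only gene 1.
toy_counts = np.array([[1, 0]] * 6 + [[0, 1]] * 4)
toy_labels = np.array([True] * 6 + [False] * 4)

profile, label = simulate_pseudobulk(
    toy_counts, toy_labels, n_cells=100, purity=0.7,
    rng=np.random.default_rng(0),
)
```

Repeating this over many target purities would give (expression profile, purity) pairs on which a regression model could be trained; the real training set may be constructed differently.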

Results: GBMPurity outperformed existing tools, achieving a mean absolute error of 0.15 and a concordance correlation coefficient of 0.88 on validation datasets. We demonstrate the utility of GBMPurity through inference on bulk RNA-seq samples and observe reduced purity of the proneural molecular subtype relative to the classical, attributed to the increased presence of healthy brain cells.
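
For readers unfamiliar with the two reported metrics, the following sketch computes mean absolute error and a concordance correlation coefficient with NumPy. It assumes the standard Lin definition of the concordance correlation coefficient with population (1/n) variances; the abstract does not state which variant was used, and the function names and example values are illustrative only.

```python
import numpy as np


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted purity."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def concordance_correlation(y_true, y_pred):
    """Lin's concordance correlation coefficient (population moments).

    Unlike Pearson correlation, it also penalizes systematic offsets
    between predictions and ground truth.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mean_t, mean_p = y_true.mean(), y_pred.mean()
    covariance = np.mean((y_true - mean_t) * (y_pred - mean_p))
    denominator = y_true.var() + y_pred.var() + (mean_t - mean_p) ** 2
    return float(2.0 * covariance / denominator)


# Made-up purities: predictions are perfectly correlated with the truth
# but shifted upward by a constant 0.1.
true_purity = np.array([0.2, 0.4, 0.6, 0.8])
predicted_purity = true_purity + 0.1

mae = mean_absolute_error(true_purity, predicted_purity)
ccc = concordance_correlation(true_purity, predicted_purity)
```

In this example the Pearson correlation would be 1, while the concordance coefficient falls below 1 because of the constant bias, which is why it suits evaluating absolute purity estimates.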

Conclusions: GBMPurity provides a reliable and accessible tool for estimating tumor purity from bulk RNA-seq data, enhancing the interpretation of bulk RNA-seq data and offering valuable insights into GBM biology. To facilitate the use of this model by the wider research community, GBMPurity is available as a web-based tool at: https://gbmdeconvoluter.leeds.ac.uk/.

Source journal
Neuro-oncology (Medicine: Clinical Neurology)
CiteScore: 27.20
Self-citation rate: 6.30%
Articles published: 1434
Review time: 3-8 weeks
About the journal: Neuro-Oncology, the official journal of the Society for Neuro-Oncology, has been published monthly since January 2010. Affiliated with the Japan Society for Neuro-Oncology and the European Association of Neuro-Oncology, it is a global leader in the field. The journal is committed to swiftly disseminating high-quality information across all areas of neuro-oncology. It features peer-reviewed articles, reviews, symposia on various topics, abstracts from annual meetings, and updates from neuro-oncology societies worldwide.