构建包含路径信息的多发性骨髓瘤预后模型

IF 3.8 2区生物学 Q1 BIOCHEMICAL RESEARCH METHODS

PLoS Computational Biology Pub Date : 2024-09-10 DOI:10.1371/journal.pcbi.1012444

Shuo Wang, ShanJin Wang, Wei Pan, YuYang Yi, Junyan Lu

{"title":"构建包含路径信息的多发性骨髓瘤预后模型","authors":"Shuo Wang, ShanJin Wang, Wei Pan, YuYang Yi, Junyan Lu","doi":"10.1371/journal.pcbi.1012444","DOIUrl":null,"url":null,"abstract":"Multiple myeloma (MM) is a hematological disease exhibiting aberrant clonal expansion of cancerous plasma cells in the bone marrow. The effects of treatments for MM vary between patients, highlighting the importance of developing prognostic models for informed therapeutic decision-making. Most previous models were constructed at the gene level, ignoring the fact that the dysfunction of the pathway is closely associated with disease development and progression. The present study considered two strategies that construct predictive models by taking pathway information into consideration: pathway score method and group lasso using pathway information. The former simply converted gene expression to sample-wise pathway scores for model fitting. We considered three methods for pathway score calculation (ssGSEA, GSVA, and z-scores) and 14 data sources providing pathway information. We implemented these methods in microarray data for MM (GSE136324) and obtained a candidate model with the best prediction performance in interval validation. The candidate model is further compared with the gene-based model and previously published models in two external data. We also investigated the effects of missing values on prediction. The results showed that group lasso incorporating Vax pathway information (Vax(grp)) was more competitive in prediction than the gene model in both internal and external validation. Immune information, including VAX pathways, seemed to be more predictive for MM. Vax(grp) also outperformed the previously published models. Moreover, the new model was more resistant to missing values, and the presence of missing values (<5%) would not evidently deteriorate its prediction accuracy using our missing data imputation method. In a nutshell, pathway-based models (using group lasso) were competitive alternatives to gene-based models for MM. These models were documented in an R package (https://github.com/ShuoStat/MMMs), where a missing data imputation method was also integrated to facilitate future validation.","PeriodicalId":20241,"journal":{"name":"PLoS Computational Biology","volume":null,"pages":null},"PeriodicalIF":3.8000,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Construct prognostic models of multiple myeloma with pathway information incorporated\",\"authors\":\"Shuo Wang, ShanJin Wang, Wei Pan, YuYang Yi, Junyan Lu\",\"doi\":\"10.1371/journal.pcbi.1012444\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Multiple myeloma (MM) is a hematological disease exhibiting aberrant clonal expansion of cancerous plasma cells in the bone marrow. The effects of treatments for MM vary between patients, highlighting the importance of developing prognostic models for informed therapeutic decision-making. Most previous models were constructed at the gene level, ignoring the fact that the dysfunction of the pathway is closely associated with disease development and progression. The present study considered two strategies that construct predictive models by taking pathway information into consideration: pathway score method and group lasso using pathway information. The former simply converted gene expression to sample-wise pathway scores for model fitting. We considered three methods for pathway score calculation (ssGSEA, GSVA, and z-scores) and 14 data sources providing pathway information. We implemented these methods in microarray data for MM (GSE136324) and obtained a candidate model with the best prediction performance in interval validation. The candidate model is further compared with the gene-based model and previously published models in two external data. We also investigated the effects of missing values on prediction. The results showed that group lasso incorporating Vax pathway information (Vax(grp)) was more competitive in prediction than the gene model in both internal and external validation. Immune information, including VAX pathways, seemed to be more predictive for MM. Vax(grp) also outperformed the previously published models. Moreover, the new model was more resistant to missing values, and the presence of missing values (<5%) would not evidently deteriorate its prediction accuracy using our missing data imputation method. In a nutshell, pathway-based models (using group lasso) were competitive alternatives to gene-based models for MM. These models were documented in an R package (https://github.com/ShuoStat/MMMs), where a missing data imputation method was also integrated to facilitate future validation.\",\"PeriodicalId\":20241,\"journal\":{\"name\":\"PLoS Computational Biology\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":3.8000,\"publicationDate\":\"2024-09-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"PLoS Computational Biology\",\"FirstCategoryId\":\"99\",\"ListUrlMain\":\"https://doi.org/10.1371/journal.pcbi.1012444\",\"RegionNum\":2,\"RegionCategory\":\"生物学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"BIOCHEMICAL RESEARCH METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"PLoS Computational Biology","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1371/journal.pcbi.1012444","RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}

引用次数: 0

摘要

多发性骨髓瘤（MM）是一种血液病，表现为骨髓中癌变浆细胞的异常克隆扩增。对 MM 的治疗效果因人而异，这凸显了开发预后模型以做出明智治疗决策的重要性。以前的大多数模型都是在基因水平上构建的，忽略了通路的功能障碍与疾病的发生和发展密切相关这一事实。本研究考虑了两种通过考虑通路信息构建预测模型的策略：通路评分法和利用通路信息的组套索法。前者简单地将基因表达转化为样本意义上的通路得分，用于模型拟合。我们考虑了三种通路得分计算方法（ssGSEA、GSVA 和 z-scores）和 14 个提供通路信息的数据源。我们在 MM 的微阵列数据（GSE136324）中实施了这些方法，并在区间验证中获得了预测性能最佳的候选模型。在两个外部数据中，候选模型进一步与基于基因的模型和以前发表的模型进行了比较。我们还研究了缺失值对预测的影响。结果表明，在内部和外部验证中，包含 Vax 通路信息的组套索（Vax(grp)）在预测方面比基因模型更有竞争力。包括 VAX 通路在内的免疫信息似乎更能预测 MM。Vax(grp)的表现也优于之前发表的模型。此外，新模型对缺失值的抵抗力更强，使用我们的缺失数据估算方法，缺失值（5%）的存在不会明显降低其预测准确性。简而言之，基于通路的模型（使用组套索）是基于基因的 MM 模型的竞争性替代品。这些模型被记录在一个 R 软件包（https://github.com/ShuoStat/MMMs）中，其中还集成了缺失数据估算方法，以方便将来的验证。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Construct prognostic models of multiple myeloma with pathway information incorporated

Multiple myeloma (MM) is a hematological disease exhibiting aberrant clonal expansion of cancerous plasma cells in the bone marrow. The effects of treatments for MM vary between patients, highlighting the importance of developing prognostic models for informed therapeutic decision-making. Most previous models were constructed at the gene level, ignoring the fact that the dysfunction of the pathway is closely associated with disease development and progression. The present study considered two strategies that construct predictive models by taking pathway information into consideration: pathway score method and group lasso using pathway information. The former simply converted gene expression to sample-wise pathway scores for model fitting. We considered three methods for pathway score calculation (ssGSEA, GSVA, and z-scores) and 14 data sources providing pathway information. We implemented these methods in microarray data for MM (GSE136324) and obtained a candidate model with the best prediction performance in interval validation. The candidate model is further compared with the gene-based model and previously published models in two external data. We also investigated the effects of missing values on prediction. The results showed that group lasso incorporating Vax pathway information (Vax(grp)) was more competitive in prediction than the gene model in both internal and external validation. Immune information, including VAX pathways, seemed to be more predictive for MM. Vax(grp) also outperformed the previously published models. Moreover, the new model was more resistant to missing values, and the presence of missing values (<5%) would not evidently deteriorate its prediction accuracy using our missing data imputation method. In a nutshell, pathway-based models (using group lasso) were competitive alternatives to gene-based models for MM. These models were documented in an R package (https://github.com/ShuoStat/MMMs), where a missing data imputation method was also integrated to facilitate future validation.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

PLoS Computational Biology BIOCHEMICAL RESEARCH METHODS-MATHEMATICAL & COMPUTATIONAL BIOLOGY

CiteScore

7.10

自引率

4.70%

发文量

820

审稿时长

2.5 months

期刊介绍： PLOS Computational Biology features works of exceptional significance that further our understanding of living systems at all scales—from molecules and cells, to patient populations and ecosystems—through the application of computational methods. Readers include life and computational scientists, who can take the important findings presented here to the next level of discovery. Research articles must be declared as belonging to a relevant section. More information about the sections can be found in the submission guidelines. Research articles should model aspects of biological systems, demonstrate both methodological and scientific novelty, and provide profound new biological insights. Generally, reliability and significance of biological discovery through computation should be validated and enriched by experimental studies. Inclusion of experimental validation is not required for publication, but should be referenced where possible. Inclusion of experimental validation of a modest biological discovery through computation does not render a manuscript suitable for PLOS Computational Biology. Research articles specifically designated as Methods papers should describe outstanding methods of exceptional importance that have been shown, or have the promise to provide new biological insights. The method must already be widely adopted, or have the promise of wide adoption by a broad community of users. Enhancements to existing published methods will only be considered if those enhancements bring exceptional new capabilities.