Differentiating Gliosarcoma from Glioblastoma: A Novel Approach Using PEACE and XGBoost to Deal with Datasets with Ultra-High Dimensional Confounders

Life. Pub Date: 2024-07-16. DOI: 10.3390/life14070882
A. Saki, U. Faghihi, Ismaila Baldé

Abstract

In this study, we use a recently developed causal methodology, Probabilistic Easy Variational Causal Effect (PEACE), to distinguish gliosarcoma (GSM) from glioblastoma (GBM). Our approach uses a causal metric that combines PEACE with the XGBoost (eXtreme Gradient Boosting) algorithm. Unlike prior research, which often relied on statistical models to reduce the dimensionality of the dataset before causal analysis, our approach applies PEACE and XGBoost to the complete dataset. PEACE provides a comprehensive measure of direct causal effects and applies to both continuous and discrete variables. Our method provides positive and negative versions of PEACE, together with their averages, to calculate the positive and negative causal effects of the radiomic features on the variable representing the tumor type (GSM or GBM). In our model, PEACE and its variations carry a degree d, ranging from 0 to 1, that reflects the importance given to the rarity and frequency of events. Using PEACE with XGBoost, we obtained a detailed understanding of the causal relationships among the dataset features, which facilitates accurate differentiation between GSM and GBM. To assess the XGBoost model, we used cross-validation and obtained a mean accuracy of 83% and an average model MSE of 0.130. This performance is notable given the high number of columns and low number of rows (code on GitHub).
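
The evaluation step mentioned in the abstract (cross-validation reporting a mean accuracy and a mean MSE) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the fold count, the toy data, and the helper names (`k_fold_indices`, `cross_validate`, `fit_centroids`, `proba_class1`) are all hypothetical, and a simple nearest-centroid scorer stands in for the XGBoost model so that the sketch needs only the standard library. It also assumes the MSE is computed on predicted class probabilities, which the abstract does not specify.

```python
import random
from statistics import mean


def k_fold_indices(n_rows, k, seed=0):
    """Shuffle row indices and split them into k nearly equal folds."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(X, y, fit, predict_proba, k=5, seed=0):
    """Return (mean accuracy, mean MSE) across k held-out folds.

    fit(X_train, y_train) -> model
    predict_proba(model, x) -> estimated probability that x is class 1
    """
    accs, mses = [], []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        probs = [predict_proba(model, X[i]) for i in held_out]
        labels = [y[i] for i in held_out]
        # Accuracy: threshold the probability at 0.5 and compare to the label.
        accs.append(mean(int((p >= 0.5) == bool(t)) for p, t in zip(probs, labels)))
        # MSE: squared gap between probability and 0/1 label (an assumption).
        mses.append(mean((p - t) ** 2 for p, t in zip(probs, labels)))
    return mean(accs), mean(mses)


# Stand-in model (NOT XGBoost): one centroid per class.
def fit_centroids(X_train, y_train):
    cents = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        cents[label] = [mean(col) for col in zip(*rows)]
    return cents


def proba_class1(cents, x):
    d0 = sum((a - b) ** 2 for a, b in zip(x, cents[0])) ** 0.5
    d1 = sum((a - b) ** 2 for a, b in zip(x, cents[1])) ** 0.5
    # The farther x is from the class-0 centroid, the higher the class-1 score.
    return d0 / (d0 + d1) if d0 + d1 > 0 else 0.5


# Toy data, purely illustrative: label 0 and label 1 are two well-separated groups.
X = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [0.05, 0.05],
     [1.0, 1.0], [0.9, 1.0], [1.0, 0.9], [0.9, 0.9], [0.95, 0.95]]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

acc, mse = cross_validate(X, y, fit_centroids, proba_class1, k=5)
print(f"mean accuracy = {acc:.3f}, mean MSE = {mse:.3f}")
```

Passing `fit` and `predict_proba` as callables keeps the fold logic independent of the model, so a gradient-boosted classifier could be substituted without changing `cross_validate`. With far more columns than rows, as in the dataset the abstract describes, per-fold scores tend to vary widely, which is one reason to report an average over folds instead of a single split.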