QIGTD: identifying critical genes in the evolution of lung adenocarcinoma with tensor decomposition.

Impact factor: 4.0 · Zone 3 (Biology) · JCR Q1 (Mathematical & Computational Biology)

Biodata Mining · Pub Date: 2024-09-04 · DOI: 10.1186/s13040-024-00386-w

Bolin Chen, Jinlei Zhang, Ci Shao, Jun Bian, Ruiming Kang, Xuequn Shang

Citation: Biodata Mining 17(1): 30 (2024). PMC full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11376055/pdf/

Citations: 0

Abstract

Background: Identifying critical genes is important for understanding the pathogenesis of complex diseases. Traditional studies typically compare changes in biomolecules between normal and disease samples, or detect important vertices in a single static biomolecular network; both approaches often overlook the dynamic changes that occur between disease stages. However, investigating temporal changes in biomolecular networks and identifying critical genes are crucial for understanding the occurrence and development of diseases.

Methods: This study proposes a novel method called Quantifying Importance of Genes with Tensor Decomposition (QIGTD). It first constructs a time-series network by integrating both intra- and inter-temporal network information, preserving connections between networks at adjacent stages according to their local similarities. A tensor is used to describe the connections of this time-series network, and a third-order tensor decomposition method is proposed to capture both the topological information of each network snapshot and the time-series characteristics of the whole network. QIGTD is also learning-free and efficient, and can be applied to datasets with a small number of samples.
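To make the tensor idea concrete, the following is a minimal sketch, not the authors' algorithm. The abstract does not specify QIGTD's decomposition or how inter-temporal links are weighted, so this example makes several assumptions: it stacks only the per-stage (intra-temporal) adjacency matrices into a genes × genes × stages tensor, uses a plain rank-1 CP approximation computed by alternating power updates, and scores genes by the gene-mode factor. The gene names, edges, and the function names `adjacency` and `rank1_cp` are hypothetical.

```python
import numpy as np

# Hypothetical toy data: 5 genes observed at 3 disease stages.
# Gene g0 gains neighbours as the disease progresses.
genes = ["g0", "g1", "g2", "g3", "g4"]
stage_edges = [
    [(0, 1), (1, 2), (0, 2)],
    [(0, 1), (1, 2), (0, 2), (0, 3)],
    [(0, 1), (1, 2), (0, 2), (0, 3), (0, 4)],
]

def adjacency(edges, n):
    """Symmetric 0/1 adjacency matrix of one network snapshot."""
    mat = np.zeros((n, n))
    for i, j in edges:
        mat[i, j] = mat[j, i] = 1.0
    return mat

def rank1_cp(tensor, n_iter=500, tol=1e-12, seed=0):
    """Rank-1 CP approximation T ~ lam * (a outer b outer c).

    Assumed stand-in for the paper's decomposition: each factor is
    updated in turn by contracting the tensor with the other two
    factors and renormalising (higher-order power iteration).
    """
    rng = np.random.default_rng(seed)
    n, m, s = tensor.shape
    # Positive starting vectors keep all factors non-negative
    # when the tensor itself is non-negative.
    a = rng.random(n) + 0.1
    b = rng.random(m) + 0.1
    c = rng.random(s) + 0.1
    a, b, c = (v / np.linalg.norm(v) for v in (a, b, c))
    lam = 0.0
    for _ in range(n_iter):
        a = np.einsum("ijk,j,k->i", tensor, b, c)
        a /= np.linalg.norm(a)
        b = np.einsum("ijk,i,k->j", tensor, a, c)
        b /= np.linalg.norm(b)
        c = np.einsum("ijk,i,j->k", tensor, a, b)
        new_lam = np.linalg.norm(c)
        c /= new_lam
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            break
    return lam, a, b, c

# Build the genes x genes x stages tensor and decompose it.
tensor = np.stack([adjacency(e, len(genes)) for e in stage_edges], axis=2)
lam, a, b, c = rank1_cp(tensor)

# Gene-mode factor as an importance score; stage-mode factor as stage weights.
ranking = [genes[i] for i in np.argsort(-a)]
print("gene ranking:", ranking)
print("stage weights:", np.round(c, 3))
```

In this toy setting the gene-mode factor `a` behaves like an eigenvector centrality computed on a stage-weighted sum of the snapshots, while `c` indicates how strongly each stage contributes. Like the method described above, the sketch involves no training step, but a faithful reimplementation would need the inter-temporal connections and the decomposition defined in the full paper.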

Results: QIGTD was evaluated on lung adenocarcinoma (LUAD) datasets, with three state-of-the-art methods (T-degree, T-closeness, and T-betweenness) serving as benchmarks. Numerical experiments show that QIGTD outperforms these methods in both precision and mAP. Notably, of the top 50 genes, 29 have been verified as highly related to LUAD according to the DisGeNET database, and 36 are significantly enriched in LUAD-related Gene Ontology (GO) terms, including nuclear division, mitotic nuclear division, chromosome segregation, organelle fission, and mitotic sister chromatid segregation.
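The ranking metrics named above can be sketched as follows. This is an illustration under assumptions, not the paper's evaluation code: the abstract does not define how precision or mAP were computed, so the example uses one common definition of precision at k and of average precision (AP) normalised by the number of hits in the top k; mAP would then be the mean of AP over several rankings or cutoffs. The gene identifiers and the `relevant` set are hypothetical. If membership in DisGeNET is taken as the relevance label, the reported 29 verified genes among the top 50 would correspond to a precision at 50 of 29/50 = 0.58.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked genes that are in the relevant set."""
    top = ranked[:k]
    return sum(gene in relevant for gene in top) / k

def average_precision(ranked, relevant, k):
    """Mean of precision@i over the positions i <= k that hold a relevant gene.

    Assumed definition: normalised by the number of hits found in the top k.
    """
    hits = 0
    total = 0.0
    for i, gene in enumerate(ranked[:k], start=1):
        if gene in relevant:
            hits += 1
            total += hits / i
    return total / hits if hits else 0.0

# Hypothetical ranking and relevance labels, for illustration only.
ranked = ["g1", "g2", "g3", "g4", "g5"]
relevant = {"g1", "g3", "g4"}

p5 = precision_at_k(ranked, relevant, 5)      # 3 hits out of 5
ap5 = average_precision(ranked, relevant, 5)  # (1/1 + 2/3 + 3/4) / 3
print(f"precision@5 = {p5:.2f}, AP@5 = {ap5:.4f}")
```

Average precision rewards rankings that place relevant genes near the top, which plain precision at k does not distinguish.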

Conclusion: QIGTD effectively captures the temporal changes in gene networks and identifies critical genes. It provides a valuable tool for studying temporal dynamics in biological networks and can aid in understanding the underlying mechanisms of diseases such as LUAD.


Source journal: Biodata Mining (Mathematical & Computational Biology)

CiteScore: 7.90

Self-citation rate: 0.00%

Articles published: not listed

Review time: 23 weeks

About the journal: BioData Mining is an open access, open peer-reviewed journal encompassing research on all aspects of data mining applied to high-dimensional biological and biomedical data, focusing on computational aspects of knowledge discovery from large-scale genetic, transcriptomic, genomic, proteomic, and metabolomic data. Topical areas include, but are not limited to:
- Development, evaluation, and application of novel data mining and machine learning algorithms.
- Adaptation, evaluation, and application of traditional data mining and machine learning algorithms.
- Open-source software for the application of data mining and machine learning algorithms.
- Design, development, and integration of databases, software, and web services for the storage, management, retrieval, and analysis of data from large-scale studies.
- Pre-processing, post-processing, modeling, and interpretation of data mining and machine learning results for biological interpretation and knowledge discovery.