Authors: Zixin Yang, Fan Yang, Fanlin Li, Ying Zheng
Journal: Translational Cancer Research, 14(5): 2999-3016 (Journal Article; impact factor 1.5; JCR Q4, Oncology)
Published: 2025-05-30 (Epub 2025-05-13)
DOI: 10.21037/tcr-2024-2465 (https://doi.org/10.21037/tcr-2024-2465)
Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12170143/pdf/
Integrated analysis of uterine leiomyosarcoma and leiomyoma utilizing TCGA and GEO data: a WGCNA and machine learning approach.
Background: Uterine sarcoma is a gynecological mesenchymal tumor with an elusive pathogenesis. Uterine leiomyosarcoma (LMS) is its most common subtype: a highly aggressive tumor with a poor prognosis whose genomic landscape remains unclear. In rare cases, LMS arises from leiomyoma (LM). We explored the genomic relationship between LMS and LM using public microarray data from the Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA). Using bioinformatics analysis tools, we aimed to provide molecular insight into the pathogenesis of LMS and to discover novel predictive biomarkers for this disease.
Methods: Differentially expressed genes (DEGs) between LMS and LM were screened by analyzing the GEO datasets GSE764, GSE68312 and GSE64763 together with TCGA data. A protein-protein interaction (PPI) network was constructed, and hub genes were identified with the CytoHubba plug-in for Cytoscape. In addition, weighted gene co-expression network analysis (WGCNA) was performed to identify hub genes. We took the intersection of the hub genes from the PPI network and from WGCNA. Subsequently, random forest (RF) and support vector machine (SVM) algorithms were used to screen for key genes as predictive biomarkers. Finally, we constructed a nomogram with these genes.
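The intersection step in the Methods can be sketched with plain set operations. This is a minimal illustration, not the authors' code: the list memberships are made up, the `GENE_*` symbols are placeholders, and keeping only genes selected by both RF and SVM is an assumption, since the abstract does not say how the two classifiers' results were combined.

```python
def intersect_hubs(*gene_sets):
    """Return the genes present in every input collection, sorted for stable output."""
    if not gene_sets:
        return []
    common = set(gene_sets[0])
    for genes in gene_sets[1:]:
        common &= set(genes)
    return sorted(common)


# Hypothetical hub lists: only the five genes named in the abstract are real
# symbols; the GENE_* entries and all memberships are illustrative.
ppi_hubs = {"CENPA", "KIF2C", "TTK", "MELK", "CDC20", "GENE_A", "GENE_B"}
wgcna_hubs = {"CENPA", "KIF2C", "TTK", "MELK", "CDC20", "GENE_C"}

# Step 1: candidate hubs supported by both the PPI network and WGCNA.
candidates = intersect_hubs(ppi_hubs, wgcna_hubs)

# Step 2 (assumed combination rule): keep candidates that both classifiers select.
rf_selected = {"CENPA", "KIF2C", "TTK", "MELK", "CDC20", "GENE_A"}
svm_selected = {"CENPA", "KIF2C", "TTK", "MELK", "CDC20"}
key_genes = intersect_hubs(candidates, rf_selected, svm_selected)

print(key_genes)
```

Requiring agreement across independent methods trades sensitivity for specificity: a gene survives only if every step supports it, which tends to yield a short candidate list.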
Results: A total of 37 hub genes were selected using WGCNA. A total of 245 DEGs were identified; 63 DEGs were upregulated, and 182 DEGs were downregulated. Functional enrichment analysis revealed that these genes were mainly associated with the cell cycle, extracellular matrix receptor interactions and oocyte meiosis. The final hub genes were CENPA, KIF2C, TTK, MELK and CDC20. Gene set enrichment analysis (GSEA) revealed that these genes were mostly enriched in the cell cycle, mismatch repair and amino sugar and nucleotide sugar metabolism. Tumor-infiltrating immune cell analysis indicated that these genes did not have an obvious correlation with immune cells.
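The split into upregulated and downregulated DEGs can be illustrated with a small filter. This is a sketch under stated assumptions: the abstract does not give its thresholds, so the cutoffs here (absolute log2 fold change of at least 1, adjusted p-value below 0.05) are common defaults rather than the study's settings, and the toy records and `GENE_*` names are invented.

```python
from typing import NamedTuple


class GeneStat(NamedTuple):
    symbol: str
    log2_fc: float  # log2 fold change, LMS relative to LM
    adj_p: float    # multiple-testing adjusted p-value


def classify_degs(stats, lfc_cutoff=1.0, p_cutoff=0.05):
    """Split genes into (upregulated, downregulated) lists using assumed cutoffs."""
    up, down = [], []
    for gene in stats:
        if gene.adj_p >= p_cutoff:
            continue  # not statistically significant
        if gene.log2_fc >= lfc_cutoff:
            up.append(gene.symbol)
        elif gene.log2_fc <= -lfc_cutoff:
            down.append(gene.symbol)
    return up, down


# Toy input, not data from the study.
toy = [
    GeneStat("GENE_A", 2.3, 0.001),   # significant, higher in LMS
    GeneStat("GENE_B", -1.8, 0.010),  # significant, lower in LMS
    GeneStat("GENE_C", 0.4, 0.020),   # significant, but effect too small
    GeneStat("GENE_D", -2.5, 0.300),  # large effect, but not significant
]
up, down = classify_degs(toy)
print(up, down)
```

The reported counts are internally consistent: 63 upregulated plus 182 downregulated genes gives the stated total of 245 DEGs.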
Conclusions: CENPA, KIF2C, TTK, MELK and CDC20 were key genes significantly associated with LMS and LM. Functional enrichment analysis and tumor-infiltrating immune cell analysis indicated that these genes might be correlated with tumor proliferation, which might shed light on the possible pathogenesis and predictive biomarkers of LMS.
About the journal:
Translational Cancer Research (Transl Cancer Res; TCR; Print ISSN: 2218-676X; Online ISSN: 2219-6803; http://tcr.amegroups.com/) is an Open Access, peer-reviewed journal indexed in Science Citation Index Expanded (SCIE). TCR publishes laboratory studies of novel therapeutic interventions, clinical trials that evaluate new treatment paradigms for cancer, and results of novel research investigations that bridge the laboratory and clinical settings, including risk assessment, cellular and molecular characterization, prevention, detection, diagnosis and treatment of human cancers, with the overall goal of improving the clinical care of cancer patients. The focus of TCR is original, peer-reviewed, science-based research that advances clinical medicine toward the goal of improving patients' quality of life. The editors, together with an international advisory group of scientists, clinician-scientists and other experts, hold TCR articles to high-quality standards. The journal accepts Original Articles as well as Review Articles, Editorials and Brief Articles.