Identification of GJC1 as a novel diagnostic marker for papillary thyroid carcinoma using weighted gene co-expression network analysis and machine learning algorithm.
Jingshu Zhang, Ping Sun. Discover. Oncology, 16(1):339. Published 2025-03-17. DOI: 10.1007/s12672-025-02137-7 (https://doi.org/10.1007/s12672-025-02137-7)
Abstract
Background: The incidence of papillary thyroid carcinoma (PTC) is increasing annually, placing both physical and psychological pressure on patients. Early recognition and specific interventions for PTC are therefore crucial. The objective of this study is to explore novel diagnostic markers and precise intervention targets for PTC.
Methods: Relevant datasets were collected from the Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) databases and analyzed using weighted gene co-expression network analysis (WGCNA). Enrichment analysis was performed on differentially expressed genes (DEGs) using Gene Ontology (GO), Disease Ontology (DO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Gene Set Enrichment Analysis (GSEA). Subsequently, three machine learning algorithms, Least Absolute Shrinkage and Selection Operator (LASSO), Support Vector Machine Recursive Feature Elimination (SVM-RFE), and Random Forest (RF), were used to identify the core genes. Finally, receiver operating characteristic (ROC) curves were used to assess the clinical diagnostic value of the core genes.
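The consensus step described in the Methods, keeping only genes chosen by all three selectors, can be sketched in a few lines of Python. This is a minimal illustration and not the authors' pipeline: the selectors themselves are not implemented, the function name `consensus_genes` is hypothetical, and the `PLACEHOLDER_*` entries are invented stand-ins. Only GJC1, KLHL4, and NOL4 are gene names taken from the abstract.

```python
# Hypothetical outputs of the three feature-selection methods.
# Only GJC1, KLHL4 and NOL4 come from the abstract; PLACEHOLDER_* names are invented.
lasso_genes = {"GJC1", "KLHL4", "NOL4", "PLACEHOLDER_A", "PLACEHOLDER_B"}
svm_rfe_genes = {"GJC1", "KLHL4", "NOL4", "PLACEHOLDER_B", "PLACEHOLDER_C"}
rf_genes = {"GJC1", "KLHL4", "NOL4", "PLACEHOLDER_A", "PLACEHOLDER_D"}


def consensus_genes(*gene_sets):
    """Return, in sorted order, the genes present in every supplied set."""
    if not gene_sets:
        return []
    return sorted(set.intersection(*map(set, gene_sets)))


core_genes = consensus_genes(lasso_genes, svm_rfe_genes, rf_genes)
print(core_genes)  # ['GJC1', 'KLHL4', 'NOL4']
```

Requiring agreement across all three methods is a conservative choice: a gene favored by only one algorithm is dropped, which trades sensitivity for a shorter, more robust candidate list.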
Results: In total, 11,194 DEGs were derived from the TCGA and GEO datasets. They were primarily enriched in extracellular matrix (ECM) and inflammation-related pathways, such as ECM-receptor interaction, cell adhesion molecules (CAMs), tumor necrosis factor (TNF) signaling, and nucleotide-binding oligomerization domain (NOD)-like receptor signaling pathways. Further analysis of the core genes identified by the protein-protein interaction network, using the three machine learning algorithms, yielded three intersecting genes: GJC1, KLHL4, and NOL4. Of these, GJC1 showed good clinical diagnostic ability, which was verified using both the GEO (area under the ROC curve (AUC) = 0.982) and TCGA (AUC = 0.840) datasets.
Conclusions: GJC1 is highly expressed in PTC. It is therefore considered a potential biomarker and is expected to become a new target for PTC gene therapy. However, this still needs to be supported and verified by more clinical data.
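For readers unfamiliar with the AUC values quoted in the Results, the sketch below shows one standard way to compute the area under the ROC curve for a single marker: the fraction of (tumor, normal) sample pairs in which the tumor sample has the higher value, with ties counted as one half. The function name `roc_auc` and the expression values are invented for illustration. They are not data from the study, and the abstract does not state which software the authors used.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive sample outscores
    a random negative sample (ties contribute 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (invented): 1 = tumor, 0 = normal; values stand in for expression levels.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
expression = [8.1, 7.4, 6.9, 5.2, 5.6, 4.8, 4.1, 3.9]

auc = roc_auc(labels, expression)
print(auc)  # 0.9375
```

An AUC of 0.5 corresponds to a marker no better than chance and 1.0 to perfect separation, which is the scale on which the reported 0.982 (GEO) and 0.840 (TCGA) should be read.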