Integrative transcriptomics and single-cell transcriptomics analyses reveal potential biomarkers and mechanisms of action in papillary thyroid carcinoma.
Wanchen Cao, Kai Gao, Yi Zhao
Frontiers in Genetics, volume 16, article 1536198. Published 2025-05-30. DOI: 10.3389/fgene.2025.1536198 (https://doi.org/10.3389/fgene.2025.1536198)
Abstract
Objective: Papillary thyroid carcinoma (PTC) has a high recurrence rate and lacks reliable diagnostic biomarkers. This study aims to identify robust transcriptomic biomarkers for PTC diagnosis through integrative bioinformatics approaches and elucidate the cellular mechanisms underlying PTC pathogenesis at single-cell resolution.
Methods: We downloaded PTC-related RNA-seq datasets (GSE3467, GSE3678, GSE33630, GSE65144, and GSE82208) and an scRNA-seq dataset (GSE191288) from the Gene Expression Omnibus (GEO) database. GSE3467 served as the training dataset for differential gene expression analysis, GO and KEGG enrichment analyses, weighted gene co-expression network analysis (WGCNA), machine learning, ROC analysis, nomogram analysis, and GSEA to mine potential biomarkers. The remaining RNA-seq datasets (GSE3678, GSE33630, GSE65144, and GSE82208) served as validation datasets for these candidate biomarkers. Guided by the biomarker results, the scRNA-seq dataset (GSE191288) was then analyzed to identify the key cell types, and their mechanisms, involved in the occurrence and development of PTC.
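The ROC step named in the Methods can be illustrated with a minimal sketch. It assumes a single candidate gene scored by its expression value, with tumor samples labeled 1 and normal samples labeled 0. The function name `roc_auc` and all numbers are hypothetical and are not taken from the study, and the abstract does not state which software the authors used.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve for one marker.

    Computed as the fraction of (tumor, normal) sample pairs in which the
    tumor sample has the higher score; ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical expression values for one gene (illustrative only,
# not data from any GEO dataset): three normal, then three tumor samples.
toy_expression = [2.1, 2.4, 1.9, 5.0, 4.2, 2.3]
toy_labels = [0, 0, 0, 1, 1, 1]

auc = roc_auc(toy_expression, toy_labels)
print(f"AUC = {auc:.3f}")
```

In a training/validation design like the one described, the same function would be applied separately to the training dataset and to each validation dataset, and a candidate would be retained only if its AUC stays high in all of them.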
Results: Three biomarkers (ENTPD1, SERPINA1, and TACSTD2) were identified through these bioinformatics analyses. GSEA suggested that these biomarkers may contribute to the occurrence and development of PTC by jointly regulating the cytokine-cytokine receptor interaction pathways. The scRNA-seq analysis identified tissue stem cells, epithelial cells, and smooth muscle cells as key cell types in PTC. Cell-cell communication analysis indicated that epithelial cells interact with tissue stem cells and smooth muscle cells primarily through two ligand-receptor pairs, COL4A1-CD4 and COL4A2-CD4. The collagen signaling pathway was the most dominant pathway; violin plots showed that the ligands COL4A1 and COL4A2 were highly expressed in epithelial cells, whereas the receptor CD4 showed elevated expression in both tissue stem cells and smooth muscle cells. Pseudotime analysis indicated that these three cell types passed through three distinct differentiation stages, during which the expression of ENTPD1, SERPINA1, and TACSTD2 showed stage-specific trends.
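The idea behind a ligand-receptor interaction score can be sketched as follows. This is a deliberately simplified mean-product score (mean ligand expression in the sender cell type multiplied by mean receptor expression in the receiver cell type), not the method used in the study, which the abstract does not name. The gene and cell-type names come from the Results; the expression values and the `lr_score` helper are hypothetical.

```python
from statistics import mean
from typing import Dict, List

# cell type -> gene -> per-cell expression values.
# All numbers are hypothetical toy values, not data from GSE191288.
Expression = Dict[str, Dict[str, List[float]]]

toy_expr: Expression = {
    "epithelial": {"COL4A1": [2.0, 4.0], "CD4": [0.0, 0.0]},
    "tissue_stem": {"COL4A1": [1.0, 1.0], "CD4": [1.0, 3.0]},
}


def lr_score(expr: Expression, sender: str, receiver: str,
             ligand: str, receptor: str) -> float:
    """Simplified interaction score: mean ligand expression in the sender
    times mean receptor expression in the receiver."""
    return mean(expr[sender][ligand]) * mean(expr[receiver][receptor])


# Epithelial cells as sender, tissue stem cells as receiver, and the reverse.
forward = lr_score(toy_expr, "epithelial", "tissue_stem", "COL4A1", "CD4")
reverse = lr_score(toy_expr, "tissue_stem", "epithelial", "COL4A1", "CD4")
print(forward, reverse)
```

With these toy values the score is high only in the epithelial-to-stem direction, which mirrors the pattern described above: the ligand is abundant in the sender and the receptor is abundant in the receiver. Dedicated tools typically add permutation tests and curated ligand-receptor databases on top of a score of this kind.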
Conclusion: By combining RNA-seq and scRNA-seq analyses, this study identifies ENTPD1, SERPINA1, and TACSTD2 as potential transcriptomic biomarkers for PTC, and tissue stem cells, epithelial cells, and smooth muscle cells as key cell types in PTC. These findings provide a basis for future experimental research and for the clinical diagnosis and treatment of PTC.