Development of machine learning models for diagnostic biomarker identification and immune cell infiltration analysis in PCOS.

IF 3.8 | Zone 3, Medicine | JCR Q1, Reproductive Biology
Wenxiu Chen, Jianliang Miao, Jingfei Chen, Jianlin Chen
Journal of Ovarian Research, vol. 18, no. 1, p. 1. Journal Article. Published 2025-01-03.
DOI: 10.1186/s13048-024-01583-1 (https://doi.org/10.1186/s13048-024-01583-1)
PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11697806/pdf/
Citations: 0

Abstract

Background: Polycystic ovary syndrome (PCOS) is a common endocrine disorder affecting women of reproductive age. It is characterized by hyperandrogenemia, oligo- or anovulation, and polycystic ovaries, and it significantly impairs quality of life. However, the practical application of machine learning (ML) to PCOS diagnosis is hindered by limitations in data size and algorithmic models. To address this gap, we increased the sample size in our study and aimed to use two ML algorithms to identify and validate diagnostic biomarkers, as well as to explore immune cell infiltration patterns in PCOS.

Methods: We performed RNA-seq analysis on granulosa cells, including 13 samples from normal controls and 25 samples from women with PCOS. These data were combined with data from publicly available databases, and batch effects were corrected using the 'sva' package in R. Differential expression analysis was performed to identify genes that differed significantly between the two groups. These differentially expressed genes (DEGs) were then analyzed for Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. LASSO and SVM-RFE were each applied to the DEGs for feature selection, and hub genes were defined as the intersection of the two results. Receiver operating characteristic (ROC) curves were used to assess the accuracy of the SVM and XGBoost models. CIBERSORT was used to estimate the relative abundances of immune cell populations. Gene set enrichment analysis (GSEA) was performed to illustrate the expression patterns of genes within highly enriched functional pathways. RT-qPCR was used to validate the hub genes.
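The intersection step described in the Methods can be sketched as follows. This is only an illustration in Python, not the authors' code: the function name `intersect_selected` and both input lists are hypothetical, and the `PLACEHOLDER_*` entries are invented stand-ins rather than genes reported by the study.

```python
def intersect_selected(lasso_genes, svm_rfe_genes):
    """Return genes chosen by both selectors, keeping the order of the first list."""
    svm_set = set(svm_rfe_genes)
    return [gene for gene in lasso_genes if gene in svm_set]


# Hypothetical selector outputs, made up for illustration only.
# PLACEHOLDER_A and PLACEHOLDER_B are not real gene names.
lasso_genes = ["CNTN2", "PLACEHOLDER_A", "CASR", "CACNB3", "MFAP2"]
svm_rfe_genes = ["MFAP2", "CASR", "PLACEHOLDER_B", "CNTN2", "CACNB3"]

hub_genes = intersect_selected(lasso_genes, svm_rfe_genes)
print(hub_genes)  # ['CNTN2', 'CASR', 'CACNB3', 'MFAP2']
```

Requiring agreement between two independent selectors is a conservative choice: a gene picked by only one method is dropped, which tends to shrink the candidate list.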

Results: A total of 824 DEGs were identified between the normal control and PCOS groups, comprising 376 upregulated and 448 downregulated genes. KEGG enrichment analysis associated these DEGs with endocytosis, Salmonella infection, and focal adhesion. By intersecting the LASSO and SVM-RFE results, we identified four hub genes (CNTN2, CASR, CACNB3, MFAP2) that were significantly associated with the PCOS group. On the validation set, the SVM and XGBoost models yielded AUC values of 0.795 and 0.875, respectively, indicating the potential of these genes as diagnostic biomarkers. Consistent with the data analysis, RT-qPCR on human granulosa cells confirmed the upregulation of CNTN2, CASR, CACNB3, and MFAP2 in PCOS. Furthermore, CIBERSORT analysis revealed a significant reduction in CD4 memory resting T cells in the PCOS group compared to the normal control group (P < 0.05).
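For readers unfamiliar with the AUC metric reported above, a minimal sketch of how it can be computed is given here. It uses the pairwise (Mann-Whitney) definition: the fraction of positive/negative sample pairs in which the positive sample receives the higher score, with ties counted as one half. The function name `roc_auc` and the labels and scores are hypothetical and are not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = PCOS, 0 = control) and classifier scores.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.25, 0.55, 0.3, 0.5, 0.2, 0.6]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")  # AUC = 0.750
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so the reported values of 0.795 and 0.875 sit between the two.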

Conclusions: This study identified CNTN2, CASR, CACNB3, and MFAP2 as potential diagnostic biomarkers for PCOS, which provides strong support for existing research on hub genes. Furthermore, the analysis of immune cell infiltration revealed the significant involvement of CD4 memory resting T cells in the onset and progression of PCOS. These findings shed light on potential mechanisms underlying PCOS pathogenesis and provide valuable insights for future research and therapeutic interventions.

Source journal: Journal of Ovarian Research (Reproductive Biology)
CiteScore: 6.20
Self-citation rate: 2.50%
Articles published: 125
Review time: >12 weeks
About the journal: Journal of Ovarian Research is an open access, peer-reviewed, online journal that aims to provide a forum for high-quality basic and clinical research on ovarian function, abnormalities, and cancer. The journal focuses on research that provides new insights into ovarian functions as well as the prevention and treatment of diseases afflicting the organ. Topical areas include, but are not restricted to: ovary development, hormone secretion and regulation; follicle growth and ovulation; infertility and polycystic ovarian syndrome; regulation of pituitary and other biological functions by ovarian hormones; ovarian cancer, its prevention, diagnosis and treatment; drug development and screening; and the role of stem cells in ovary development and function.