Reclassifying NOBOX variants in primary ovarian insufficiency cases with a corrected gene model and a novel quantitative framework
Authors: Reiner A Veitia, Jamie D Cowles, Sandrine Caburet
Journal: Human Reproduction (impact factor 6.0; JCR Q1, Obstetrics & Gynecology)
Publication date: 2025-04-17
Publication type: Journal Article
DOI: 10.1093/humrep/deaf058 (https://doi.org/10.1093/humrep/deaf058)
Citation count: 0
Abstract
STUDY QUESTION: How can updated expression and genomic data, combined with a disorder-specific classification system, be used to correct a gene model and thereby improve the evaluation of the pathogenicity of variants found in patients?

SUMMARY ANSWER: By combining available genomic and transcriptomic data from several species with a quantitative classification framework using primary ovarian insufficiency (POI)-adjusted parameters, we correct the human NOBOX (newborn ovary homeobox) gene model and reclassify variants previously reported in POI cases.

WHAT IS KNOWN ALREADY: The NOBOX gene encodes a gonad-specific transcription factor with a crucial role in early folliculogenesis and is considered a major gene involved in POI. It is currently described as being expressed as four transcripts, the longest of which is considered canonical. All variants identified in POI cases have been evaluated against this canonical transcript, and the functional tests have been performed using the corresponding predicted protein.

STUDY DESIGN, SIZE, DURATION: We refined and corrected the NOBOX gene model using available genomic and RNAseq data from human and 16 other mammalian species. Expression data were selected for tissue specificity, strand specificity, and coverage. The analysis of RNAseq data from different fetal ovarian stages allowed a time-course description of NOBOX isoforms. The literature was searched to retrieve NOBOX variants reported in POI cases, and NOBOX variants present in the ClinVar and gnomAD 4 databases were also retrieved.

PARTICIPANTS/MATERIALS, SETTING, METHODS: Strand-specific RNAseq data from human fetal ovaries and human adult testes were analysed to infer the correct human NOBOX isoforms. The conservation of the gene structure was verified by combining the aligned genomic sequences from 17 mammalian species covering a wide phylogenetic range with the relevant RNAseq data. Because changing a gene model implies a reclassification of variants, we set up a quantitative framework with updated variant frequencies from gnomAD 4 and POI-adjusted parameters, following the American College of Medical Genetics and Genomics/Association for Molecular Pathology (ACMG/AMP) guidelines. Using this framework, we reclassified 44 NOBOX variants reported in POI patients and families, 117 NOBOX variants reported in ClinVar, and 2613 NOBOX variants present in gnomAD 4.

MAIN RESULTS AND THE ROLE OF CHANCE: The corrected NOBOX gene model proposes invalidating two transcripts, including the canonical one. The two correct isoforms were present in fetal ovarian samples, and only one was detected in adult testes. Only 14 variants remained as possibly causative for POI. Furthermore, this re-evaluation strongly suggests that biallelic NOBOX variants are the most likely cause of POI.

LARGE SCALE DATA: Large tables are provided as supplementary data sets on the Zenodo repository.

LIMITATIONS, REASONS FOR CAUTION: The proposed gene model is robust but relies on the available transcriptomic data, which cover a range of time points and tissues. Our scoring system was manually adjusted, and other laboratories can implement it with different parameters.

WIDER IMPLICATIONS OF THE FINDINGS: For the NOBOX variants that can no longer be considered pathogenic or causative, the genome/exome sequencing data of the corresponding patients should be reanalysed. Furthermore, the functional studies performed using the obsolete coding sequence should be reconsidered. The corrected gene model should be taken into account when evaluating novel NOBOX variants identified in POI patients. Our results highlight the importance of carefully assessing the most up-to-date expression data when validating a gene model, which enables a correct evaluation of the pathogenicity of variants found in patients. The quantitative framework developed here can be used to classify variants in other genes underlying POI. Furthermore, the global approach based on quantitatively adjusting the ACMG/AMP guidelines could be extended to other inherited pathologies.

STUDY FUNDING/COMPETING INTEREST(S): This project was not funded. The authors have no conflicts of interest to disclose.
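
The abstract does not give the weights or thresholds of the authors' POI-adjusted framework, so the following is only an illustrative sketch of what a quantitative, points-based reading of the ACMG/AMP guidelines can look like. It assumes the generic point scale of Tavtigian et al. (2020), not the authors' parameters; the names `Evidence`, `total_points`, and `classify` are hypothetical, and the example variant is invented for illustration.

```python
from dataclasses import dataclass

# Assumed generic point scale (Tavtigian et al. 2020), NOT the POI-adjusted
# parameters of the paper, which the abstract does not report.
POINTS = {"supporting": 1, "moderate": 2, "strong": 4, "very_strong": 8}


@dataclass(frozen=True)
class Evidence:
    code: str             # ACMG/AMP criterion label, e.g. "PVS1" or "BS1"
    strength: str         # one of the keys of POINTS
    benign: bool = False  # benign evidence subtracts points


def total_points(evidence):
    """Sum pathogenic points and subtract benign points."""
    return sum((-1 if e.benign else 1) * POINTS[e.strength] for e in evidence)


def classify(points):
    """Map a point total to a five-tier class using the assumed thresholds."""
    if points >= 10:
        return "Pathogenic"
    if points >= 6:
        return "Likely pathogenic"
    if points >= 0:
        return "Uncertain significance"
    if points >= -6:
        return "Likely benign"
    return "Benign"


# Hypothetical variant: one very strong and one moderate pathogenic criterion.
example = [Evidence("PVS1", "very_strong"), Evidence("PM2", "moderate")]
print(classify(total_points(example)))
```

A disease-specific adaptation of the kind the abstract describes would change the inputs to such a scheme, for instance the allele-frequency cut-offs that decide whether a frequency criterion applies, rather than the additive logic itself.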
About the journal:
Human Reproduction features full-length, peer-reviewed papers reporting original research, concise clinical case reports, as well as opinions and debates on topical issues.
Published papers cover the clinical science and medical aspects of reproductive physiology, pathology and endocrinology, including andrology, gonad function, gametogenesis, fertilization, embryo development, implantation, early pregnancy, genetics, genetic diagnosis, oncology, infectious disease, surgery, contraception, infertility treatment, psychology, ethics and social issues.