PLoS Genetics. Pub Date: 2025-01-13. eCollection Date: 2025-01-01. DOI: 10.1371/journal.pgen.1011552
Satya Dev Polisetty, Krishna Bhat, Kuladeep Das, Ivan Clark, Kevin G Hardwick, Kaustuv Sanyal
{"title":"The dependence of shugoshin on Bub1-kinase activity is dispensable for the maintenance of spindle assembly checkpoint response in Cryptococcus neoformans.","authors":"Satya Dev Polisetty, Krishna Bhat, Kuladeep Das, Ivan Clark, Kevin G Hardwick, Kaustuv Sanyal","doi":"10.1371/journal.pgen.1011552","DOIUrl":"10.1371/journal.pgen.1011552","url":null,"abstract":"<p><p>During chromosome segregation, the spindle assembly checkpoint (SAC) detects errors in kinetochore-microtubule attachments. Timely activation and maintenance of the SAC until defects are corrected is essential for genome stability. Here, we show that shugoshin (Sgo1), a conserved tension-sensing protein, ensures the maintenance of SAC signals in response to unattached kinetochores during mitosis in a basidiomycete budding yeast Cryptococcus neoformans. Sgo1 maintains optimum levels of Aurora B kinase Ipl1 and protein phosphatase 1 (PP1) at kinetochores. The absence of Sgo1 results in the loss of Aurora B (Ipl1) with a concomitant increase in PP1 levels at kinetochores. This leads to a premature reduction in the kinetochore-bound Bub1 levels and early termination of the SAC signals. Intriguingly, the kinase function of Bub1 is dispensable for shugoshin's subcellular localization. Sgo1 is predominantly localized to spindle pole bodies (SPBs) and along the mitotic spindle with a minor pool at kinetochores. 
In the absence of proper kinetochore-microtubule attachments, Sgo1 reinforces the Aurora B kinase Ipl1-PP1 phosphatase balance, which is critical for prolonged maintenance of the SAC response.</p>","PeriodicalId":49007,"journal":{"name":"PLoS Genetics","volume":"21 1","pages":"e1011552"},"PeriodicalIF":4.0,"publicationDate":"2025-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11774493/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142979592","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
PLoS Genetics. Pub Date: 2025-01-13. eCollection Date: 2025-01-01. DOI: 10.1371/journal.pgen.1011355
Jesús R Curt, Paloma Martín, David Foronda, Bruno Hudry, Ramakrishnan Kannan, Srividya Shetty, Samir Merabet, Andrew J Saurin, Yacine Graba, Ernesto Sánchez-Herrero
{"title":"Ambivalent partnership of the Drosophila posterior class Hox protein Abdominal-B with Extradenticle and Homothorax.","authors":"Jesús R Curt, Paloma Martín, David Foronda, Bruno Hudry, Ramakrishnan Kannan, Srividya Shetty, Samir Merabet, Andrew J Saurin, Yacine Graba, Ernesto Sánchez-Herrero","doi":"10.1371/journal.pgen.1011355","DOIUrl":"10.1371/journal.pgen.1011355","url":null,"abstract":"<p><p>Hox proteins, a sub-group of the homeodomain (HD) transcription factor family, provide positional information for axial patterning in development and evolution. Hox protein functional specificity is reached, at least in part, through interactions with Pbc (Extradenticle (Exd) in Drosophila) and Meis/Prep (Homothorax (Hth) in Drosophila) proteins. Most of our current knowledge of Hox protein specificity stems from the study of anterior and central Hox proteins, identifying the molecular and structural bases for Hox/Pbc/Meis-Prep cooperative action. Posterior Hox class proteins, Abdominal-B (Abd-B) in Drosophila and Hox9-13 in vertebrates, have been comparatively less studied. They strongly diverge from anterior and central class Hox proteins, with a low degree of HD sequence conservation and the absence of a core canonical Pbc interaction motif. Here we explore how Abd-B function interfaces with that of Exd/Hth using several developmental contexts, studying mutual expression control, functional dependency and intrinsic protein requirements. Results identify cross-regulatory interactions setting relative expression and activity levels required for proper development. They also reveal organ-specific requirements and a binary functional interplay with Exd and Hth, either antagonistic, as previously reported, or synergistic. 
This highlights context-specific use of Exd/Hth, and a similar context-specific use of Abd-B intrinsic protein requirements.</p>","PeriodicalId":49007,"journal":{"name":"PLoS Genetics","volume":"21 1","pages":"e1011355"},"PeriodicalIF":4.0,"publicationDate":"2025-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11759358/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142980303","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
PLoS Genetics. Pub Date: 2025-01-10. eCollection Date: 2025-01-01. DOI: 10.1371/journal.pgen.1011553
Annaïg De Walsche, Alexis Vergne, Renaud Rincent, Fabrice Roux, Stéphane Nicolas, Claude Welcker, Sofiane Mezmouk, Alain Charcosset, Tristan Mary-Huard
{"title":"metaGE: Investigating genotype x environment interactions through GWAS meta-analysis.","authors":"Annaïg De Walsche, Alexis Vergne, Renaud Rincent, Fabrice Roux, Stéphane Nicolas, Claude Welcker, Sofiane Mezmouk, Alain Charcosset, Tristan Mary-Huard","doi":"10.1371/journal.pgen.1011553","DOIUrl":"10.1371/journal.pgen.1011553","url":null,"abstract":"<p><p>Elucidating the genetic components of plant genotype-by-environment interactions is of key importance in the context of increasing climatic instability, diversification of agricultural practices and pest pressure due to phytosanitary treatment limitations. The genotypic response to environmental stresses can be investigated through multi-environment trials (METs). However, genome-wide association studies (GWAS) of MET data are significantly more complex than those of single environments. In this context, we introduce metaGE, a flexible and computationally efficient meta-analysis approach for jointly analyzing single-environment GWAS of any MET experiment. The metaGE procedure accounts for the heterogeneity of quantitative trait loci (QTL) effects across the environmental conditions and allows the detection of QTL whose allelic effect variations are strongly correlated to environmental cofactors. We evaluated the performance of the proposed methodology and compared it to two competing procedures through simulations. We also applied metaGE to two emblematic examples: the detection of flowering QTLs whose effects are modulated by competition in Arabidopsis and the detection of yield QTLs impacted by drought stresses in maize. The procedure identified known and new QTLs, providing valuable insights into the genetic architecture of complex traits and QTL effects dependent on environmental stress conditions. 
The whole statistical approach is available as an R package.</p>","PeriodicalId":49007,"journal":{"name":"PLoS Genetics","volume":"21 1","pages":"e1011553"},"PeriodicalIF":4.0,"publicationDate":"2025-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11756807/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142962534","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
PLoS Genetics. Pub Date: 2025-01-10. eCollection Date: 2025-01-01. DOI: 10.1371/journal.pgen.1011563
Takintayo Akinbiyi, Mary Sara McPeek, Mark Abney
{"title":"ADELLE: A global testing method for trans-eQTL mapping.","authors":"Takintayo Akinbiyi, Mary Sara McPeek, Mark Abney","doi":"10.1371/journal.pgen.1011563","DOIUrl":"10.1371/journal.pgen.1011563","url":null,"abstract":"<p><p>Understanding the genetic regulatory mechanisms of gene expression is an ongoing challenge. Genetic variants that are associated with expression levels are readily identified when they are proximal to the gene (i.e., cis-eQTLs), but SNPs distant from the gene whose expression levels they are associated with (i.e., trans-eQTLs) have been much more difficult to discover, even though they account for a majority of the heritability in gene expression levels. A major impediment to the identification of more trans-eQTLs is the lack of statistical methods that are powerful enough to overcome the obstacles of small effect sizes and large multiple testing burden of trans-eQTL mapping. Here, we propose ADELLE, a powerful statistical testing framework that requires only summary statistics and is designed to be most sensitive to SNPs that are associated with multiple gene expression levels, a characteristic of many trans-eQTLs. In simulations, we show that for detecting SNPs that are associated with 0.1%-2% of 10,000 traits, among the 8 methods we consider ADELLE is clearly the most powerful overall, with either the highest power or power not significantly different from the highest for all settings in that range. We apply ADELLE to a mouse advanced intercross line data set and show its ability to find trans-eQTLs that were not significant under a standard analysis. We also apply ADELLE to trans-eQTL mapping in the eQTLGen data, and for 1,451 previously identified trans-eQTLs, we discover trans association with additional expression traits beyond those previously identified. 
This demonstrates that ADELLE is a powerful tool for uncovering trans regulators of genetic expression.</p>","PeriodicalId":49007,"journal":{"name":"PLoS Genetics","volume":"21 1","pages":"e1011563"},"PeriodicalIF":4.0,"publicationDate":"2025-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11756770/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142962532","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
PLoS Genetics. Pub Date: 2025-01-09. eCollection Date: 2025-01-01. DOI: 10.1371/journal.pgen.1011507
Sara Formichetti, Agnieszka Sadowska, Michela Ascolani, Julia Hansen, Kerstin Ganter, Christophe Lancrin, Neil Humphreys, Mathieu Boulard
{"title":"Genetic gradual reduction of OGT activity unveils the essential role of O-GlcNAc in the mouse embryo.","authors":"Sara Formichetti, Agnieszka Sadowska, Michela Ascolani, Julia Hansen, Kerstin Ganter, Christophe Lancrin, Neil Humphreys, Mathieu Boulard","doi":"10.1371/journal.pgen.1011507","DOIUrl":"10.1371/journal.pgen.1011507","url":null,"abstract":"<p><p>The reversible glycosylation of nuclear and cytoplasmic proteins (O-GlcNAcylation) is catalyzed by a single enzyme, namely O-GlcNAc transferase (OGT). The mammalian Ogt gene is X-linked, and it is essential for embryonic development and for the viability of proliferating cells. We perturbed OGT's function in vivo by creating a murine allelic series of four single amino acid substitutions, reducing OGT's catalytic activity to a range of degrees. The severity of the embryonic lethality was proportional to the extent of impairment of OGT's catalysis, demonstrating that the O-GlcNAc modification itself is required for early development. We identified hypomorphic Ogt alleles that perturb O-GlcNAc homeostasis while being compatible with embryogenesis. The analysis of the transcriptomes of the mutant embryos at different developmental stages suggested a sexually-dimorphic developmental delay caused by the decrease in O-GlcNAc. 
Furthermore, a mild reduction of OGT's enzymatic activity was sufficient to loosen the silencing of endogenous retroviruses in vivo.</p>","PeriodicalId":49007,"journal":{"name":"PLoS Genetics","volume":"21 1","pages":"e1011507"},"PeriodicalIF":4.0,"publicationDate":"2025-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11717234/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142957348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
PLoS Genetics. Pub Date: 2025-01-09. eCollection Date: 2025-01-01. DOI: 10.1371/journal.pgen.1011480
Christian Benner, Anubha Mahajan, Matti Pirinen
{"title":"Refining fine-mapping: Effect sizes and regional heritability.","authors":"Christian Benner, Anubha Mahajan, Matti Pirinen","doi":"10.1371/journal.pgen.1011480","DOIUrl":"10.1371/journal.pgen.1011480","url":null,"abstract":"<p><p>Recent statistical approaches have shown that the set of all available genetic variants explains considerably more phenotypic variance of complex traits and diseases than the individual variants that are robustly associated with these phenotypes. However, rapidly increasing sample sizes constantly improve detection and prioritization of individual variants driving the associations between genomic regions and phenotypes. Therefore, it is useful to routinely estimate how much phenotypic variance the detected variants explain for each region by taking into account the correlation structure of variants and the uncertainty in their causal status. Here we extend the FINEMAP software to estimate the effect sizes and regional heritability under the probabilistic model that assumes a handful of causal variants per region. Using the UK Biobank (UKB) data to simulate genomic regions, we demonstrate that FINEMAP provides higher precision and enables more detailed decomposition of regional heritability into individual variants than the variance component model implemented in BOLT or the fixed-effect model implemented in HESS, particularly when there are only a few causal variants in the fine-mapped region. Using data from 2,940 plasma proteins from the UKB study, we observed that on average FINEMAP identified 2.5 causal variants at an association signal and captured 36% more regional heritability than the variant with the lowest P-value. We estimate that in genomic regions with notable contribution to the total heritability, FINEMAP captures on average 13% and 40% more heritability than BOLT and HESS respectively. 
Our analysis shows how FINEMAP, BOLT and HESS relate to each other in cases where inference of a variant-level picture of the regional genetic architecture is possible.</p>","PeriodicalId":49007,"journal":{"name":"PLoS Genetics","volume":"21 1","pages":"e1011480"},"PeriodicalIF":4.0,"publicationDate":"2025-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11753704/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142957263","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
PLoS Genetics. Pub Date: 2025-01-08. eCollection Date: 2025-01-01. DOI: 10.1371/journal.pgen.1011537
Zhendong Huang, Jerome Kelleher, Yao-Ban Chan, David Balding
{"title":"Estimating evolutionary and demographic parameters via ARG-derived IBD.","authors":"Zhendong Huang, Jerome Kelleher, Yao-Ban Chan, David Balding","doi":"10.1371/journal.pgen.1011537","DOIUrl":"10.1371/journal.pgen.1011537","url":null,"abstract":"<p><p>Inference of evolutionary and demographic parameters from a sample of genome sequences often proceeds by first inferring identical-by-descent (IBD) genome segments. By exploiting efficient data encoding based on the ancestral recombination graph (ARG), we obtain three major advantages over current approaches: (i) no need to impose a length threshold on IBD segments, (ii) IBD can be defined without the hard-to-verify requirement of no recombination, and (iii) computation time can be reduced with little loss of statistical efficiency using only the IBD segments from a set of sequence pairs that scales linearly with sample size. We first demonstrate powerful inferences when true IBD information is available from simulated data. For IBD inferred from real data, we propose an approximate Bayesian computation inference algorithm and use it to show that even poorly-inferred short IBD segments can improve estimation. Our mutation-rate estimator achieves precision similar to a previously-published method despite a 4 000-fold reduction in data used for inference, and we identify significant differences between human populations. 
Computational cost limits model complexity in our approach, but we are able to incorporate unknown nuisance parameters and model misspecification, still finding improved parameter inference.</p>","PeriodicalId":49007,"journal":{"name":"PLoS Genetics","volume":"21 1","pages":"e1011537"},"PeriodicalIF":4.0,"publicationDate":"2025-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11750106/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142957347","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
PLoS Genetics. Pub Date: 2025-01-08. eCollection Date: 2025-01-01. DOI: 10.1371/journal.pgen.1011545
Zuo Wang, Shuang Wang, Yi Bi, Alessandra Boiti, Shengxiang Zhang, Daniela Vallone, Xianyong Lan, Nicholas S Foulkes, Haiyu Zhao
{"title":"Light-regulated microRNAs shape dynamic gene expression in the zebrafish circadian clock.","authors":"Zuo Wang, Shuang Wang, Yi Bi, Alessandra Boiti, Shengxiang Zhang, Daniela Vallone, Xianyong Lan, Nicholas S Foulkes, Haiyu Zhao","doi":"10.1371/journal.pgen.1011545","DOIUrl":"10.1371/journal.pgen.1011545","url":null,"abstract":"<p><p>A key property of the circadian clock is that it is reset by light to remain synchronized with the day-night cycle. An attractive model to explore light input to the circadian clock in vertebrates is the zebrafish. Circadian clocks in zebrafish peripheral tissues and even zebrafish-derived cell lines are entrainable by direct light exposure thus providing unique insight into the function and evolution of light regulatory pathways. Our previous work has revealed that light-induced gene transcription is a key step in the entrainment of the circadian clock as well as enabling the more general adaptation of zebrafish cells to sunlight exposure. However, considerable evidence points to post-transcriptional regulatory mechanisms, notably microRNAs (miRNAs), playing an essential role in shaping dynamic changes in mRNA levels. Therefore, does light directly impact the function of miRNAs? Are there light-regulated miRNAs and if so, which classes of mRNA do they target? To address these questions, we performed a complete sequencing analysis of light-induced changes in the zebrafish transcriptome, encompassing small non-coding RNAs as well as mRNAs. Importantly, we identified sets of light-regulated miRNAs, with many regulatory targets representing light-inducible mRNAs including circadian clock genes and genes involved in redox homeostasis. We subsequently focused on the light-responsive miR-204-3-3p and miR-430a-3p which are predicted to regulate the expression of cryptochrome genes (cry1a and cry1b). 
Luciferase reporter assays validated the target binding of miR-204-3-3p and miR-430a-3p to the 3'UTRs of cry1a and cry1b, respectively. Furthermore, treatment with mimics and inhibitors of these two miRNAs significantly affected the dynamic expression of their target genes but also other core clock components (clock1a, bmal1b, per1b, per2, per3), as well as the rhythmic locomotor activity of zebrafish larvae. Thus, our identification of light-responsive miRNAs reveals new intricacy in the multi-level regulation of the circadian clockwork by light.</p>","PeriodicalId":49007,"journal":{"name":"PLoS Genetics","volume":"21 1","pages":"e1011545"},"PeriodicalIF":4.0,"publicationDate":"2025-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11750094/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142957352","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
PLoS Genetics. Pub Date: 2025-01-07. eCollection Date: 2025-01-01. DOI: 10.1371/journal.pgen.1011519
Deborah Kunkel, Peter Sørensen, Vijay Shankar, Fabio Morgante
{"title":"Improving polygenic prediction from summary data by learning patterns of effect sharing across multiple phenotypes.","authors":"Deborah Kunkel, Peter Sørensen, Vijay Shankar, Fabio Morgante","doi":"10.1371/journal.pgen.1011519","DOIUrl":"10.1371/journal.pgen.1011519","url":null,"abstract":"<p><p>Polygenic prediction of complex trait phenotypes has become important in human genetics, especially in the context of precision medicine. Recently, mr.mash, a flexible and computationally efficient method that models multiple phenotypes jointly and leverages sharing of effects across such phenotypes to improve prediction accuracy, was introduced. However, a drawback of mr.mash is that it requires individual-level data, which are often not publicly available. In this work, we introduce mr.mash-rss, an extension of the mr.mash model that requires only summary statistics from Genome-Wide Association Studies (GWAS) and linkage disequilibrium (LD) estimates from a reference panel. By using summary data, we achieve the twin goal of increasing the applicability of the mr.mash model to data sets that are not publicly available and making it scalable to biobank-size data. Through simulations, we show that mr.mash-rss is competitive with, and often outperforms, current state-of-the-art methods for single- and multi-phenotype polygenic prediction in a variety of scenarios that differ in the pattern of effect sharing across phenotypes, the number of phenotypes, the number of causal variants, and the genomic heritability. 
We also present a real data analysis of 16 blood cell phenotypes in the UK Biobank, showing that mr.mash-rss achieves higher prediction accuracy than competing methods for the majority of traits, especially when the data set has smaller sample size.</p>","PeriodicalId":49007,"journal":{"name":"PLoS Genetics","volume":"21 1","pages":"e1011519"},"PeriodicalIF":4.0,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11741642/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142957350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
PLoS Genetics. Pub Date: 2025-01-06. eCollection Date: 2025-01-01. DOI: 10.1371/journal.pgen.1011540
Bushra Haque, David Cheerie, Amy Pan, Meredith Curtis, Thomas Nalpathamkalam, Jimmy Nguyen, Celine Salhab, Bhooma Thiruvahindrapuram, Jade Zhang, Madeline Couse, Taila Hartley, Michelle M Morrow, E Magda Price, Susan Walker, David Malkin, Frederick P Roth, Gregory Costain
{"title":"Leveraging cancer mutation data to inform the pathogenicity classification of germline missense variants.","authors":"Bushra Haque, David Cheerie, Amy Pan, Meredith Curtis, Thomas Nalpathamkalam, Jimmy Nguyen, Celine Salhab, Bhooma Thiruvahindrapuram, Jade Zhang, Madeline Couse, Taila Hartley, Michelle M Morrow, E Magda Price, Susan Walker, David Malkin, Frederick P Roth, Gregory Costain","doi":"10.1371/journal.pgen.1011540","DOIUrl":"https://doi.org/10.1371/journal.pgen.1011540","url":null,"abstract":"<p><p>Innovative and easy-to-implement strategies are needed to improve the pathogenicity assessment of rare germline missense variants. Somatic cancer driver mutations identified through large-scale tumor sequencing studies often impact genes that are also associated with rare Mendelian disorders. The use of cancer mutation data to aid in the interpretation of germline missense variants, regardless of whether the gene is associated with a hereditary cancer predisposition syndrome or a non-cancer-related developmental disorder, has not been systematically assessed. We extracted putative cancer driver missense mutations from the Cancer Hotspots database and annotated them as germline variants, including presence/absence and classification in ClinVar. We trained two supervised learning models (logistic regression and random forest) to predict variant classifications of germline missense variants in ClinVar using Cancer Hotspot data (training dataset). The performance of each model was evaluated with an independent test dataset generated in part from searching public and private genome-wide sequencing datasets from ~1.5 million individuals. Of the 2,447 cancer mutations, 691 corresponding germline variants had been previously classified in ClinVar: 426 (61.6%) as likely pathogenic/pathogenic, 261 (37.8%) as uncertain significance, and 4 (0.6%) as likely benign/benign. 
The odds ratio for a likely pathogenic/pathogenic classification in ClinVar was 28.3 (95% confidence interval: 24.2-33.1, p < 0.001), compared with all other germline missense variants in the same 216 genes. Both supervised learning models showed high correlation with pathogenicity assessments in the training dataset. There were high area under the precision-recall curve values (0.847 and 0.829) and area under the receiver-operating characteristic curve values (0.821 and 0.774) for logistic regression and random forest models, respectively, when applied to the test dataset. With the use of cancer and germline datasets and supervised learning techniques, our study shows that cancer mutation data can be leveraged to improve the interpretation of germline missense variation potentially causing rare Mendelian disorders.</p>","PeriodicalId":49007,"journal":{"name":"PLoS Genetics","volume":"21 1","pages":"e1011540"},"PeriodicalIF":4.0,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11737861/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143014669","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}