Inferring fungal cis-regulatory networks from genome sequences via unsupervised and interpretable representation learning.

IF 5.1 · Zone 3, Biology · Q2 GENETICS & HEREDITY
Genetics Pub Date : 2025-09-30 DOI:10.1093/genetics/iyaf209
Alan M Moses, Jason E Stajich, Audrey P Gasch, David A Knowles

Abstract

Gene expression patterns are determined to a large extent by transcription factor binding to non-coding regulatory regions in the genome. However, gene expression cannot yet be systematically predicted from genome sequences, in part because non-functional matches to the sequence patterns (motifs) recognized by transcription factors (TFs) occur frequently throughout the genome. Large-scale functional genomics data for many TFs have enabled characterization of regulatory networks in experimentally accessible cells such as budding yeast. Beyond yeast, fungi are important industrial organisms and pathogens, but large-scale functional data are only sporadically available. Uncharacterized regulatory networks control key pathways and gene expression programs associated with fungal phenotypes. Here we explore a sequence-only approach to inferring regulatory networks by leveraging the hundreds of genomes now available for many clades of fungi. We use gene orthology as the learning signal to infer interpretable, TF motif-based representations of non-coding regulatory regions. Using these representations to identify conserved signals for motifs, comparative genomics can be scaled to evolutionary comparisons where sequence similarity cannot be detected. We show that similarity of these conserved motif signals predicts gene expression and regulation better than using experimental data, and that we can infer known and novel regulatory connections in diverse fungi. Our new predictions include a pathway for recombination in C. albicans and pathways for mating and an RNAi immune response in Neurospora. Taken together, our results indicate that specific hypotheses about transcriptional regulation in fungi can be obtained for many genes from genome sequence analysis alone.
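
As a rough illustration of what "similarity of conserved motif signals" could mean in practice, the Python sketch below scores toy motifs against orthologous regulatory regions and compares genes by the resulting profiles. It is not the method of the paper: the learned, unsupervised representation is replaced here by a plain average of best motif-match scores across orthologs, and every name (`consensus_pwm`, `best_motif_score`, `gene_profile`), motif, and sequence is a hypothetical placeholder chosen for the example.

```python
import math


def consensus_pwm(consensus, match=1.0, mismatch=-1.0):
    """Build a toy scoring matrix that rewards the consensus base at each position."""
    return [
        {base: (match if base == c else mismatch) for base in "ACGT"}
        for c in consensus
    ]


def best_motif_score(seq, pwm):
    """Return the highest summed motif score over all windows of the sequence."""
    width = len(pwm)
    if len(seq) < width:
        raise ValueError("sequence is shorter than the motif")
    return max(
        sum(pwm[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    )


def gene_profile(ortholog_regions, pwms):
    """One value per motif: the mean best-match score across orthologous regions.

    Averaging over orthologs is a simple stand-in for a conserved motif signal;
    the paper learns its representation instead of averaging.
    """
    return [
        sum(best_motif_score(region, pwm) for region in ortholog_regions)
        / len(ortholog_regions)
        for pwm in pwms
    ]


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical motifs and invented regulatory regions from two "species" per gene.
motifs = [consensus_pwm("ACGT"), consensus_pwm("GGCC")]
gene_a = ["TTACGTTT", "GGACGTGG"]  # ACGT site present in both orthologs
gene_b = ["CCACGTAA", "ACGTTTTT"]  # ACGT site present in both orthologs
gene_c = ["TTGGCCTT", "AAGGCCAA"]  # GGCC site present in both orthologs

profile_a = gene_profile(gene_a, motifs)
profile_b = gene_profile(gene_b, motifs)
profile_c = gene_profile(gene_c, motifs)

similarity_ab = cosine_similarity(profile_a, profile_b)
similarity_ac = cosine_similarity(profile_a, profile_c)
```

In this toy setting, genes whose orthologous regions retain the same motif end up with similar profiles even though the flanking sequence differs, which is the intuition behind comparing genes when the regulatory regions themselves cannot be aligned.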

Source journal: Genetics (GENETICS & HEREDITY)
CiteScore: 6.90
Self-citation rate: 6.10%
Articles published: 177
Review time: 1.5 months
Journal description: GENETICS is published by the Genetics Society of America, a scholarly society that seeks to deepen our understanding of the living world by advancing our understanding of genetics. Since 1916, GENETICS has published high-quality, original research presenting novel findings bearing on genetics and genomics. The journal publishes empirical studies of organisms ranging from microbes to humans, as well as theoretical work. While it has an illustrious history, GENETICS has changed along with the communities it serves: it is not your mentor's journal. The editors make decisions quickly, in around 30 days, without sacrificing the excellence and scholarship for which the journal has long been known. GENETICS is a peer-reviewed, peer-edited journal, with an international reach and increasing visibility and impact. All editorial decisions are made through collaboration of at least two editors who are practicing scientists. GENETICS is constantly innovating: expanded types of content include Reviews, Commentary (current issues of interest to geneticists), Perspectives (historical), Primers (to introduce primary literature into the classroom), Toolbox Reviews, plus YeastBook, FlyBook, and WormBook (coming spring 2016). For particularly time-sensitive results, we publish Communications. As part of our mission to serve our communities, we've published thematic collections, including Genomic Selection, Multiparental Populations, Mouse Collaborative Cross, and the Genetics of Sex.