Cofitness network connectivity determines a fuzzy essential zone in open bacterial pangenome.

IF 4.5 Q1 MICROBIOLOGY
mLife Pub Date : 2024-06-28 eCollection Date: 2024-06-01 DOI:10.1002/mlf2.12132
Pan Zhang, Biliang Zhang, Yuan-Yuan Ji, Jian Jiao, Ziding Zhang, Chang-Fu Tian
mLife, volume 3, issue 2, pages 277-290. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11211677/pdf/

Abstract

Most in silico evolutionary studies assume that core genes are essential for cellular function while accessory genes are dispensable, particularly in nutrient-rich environments. However, this assumption has seldom been tested genetically in a pangenome context. In this study, we conducted a robust pangenomic Tn-seq analysis of fitness genes in a nutrient-rich medium for Sinorhizobium strains, which have a canonical open pangenome. To evaluate the robustness of fitness category assignment, Tn-seq data from three independent mutant libraries per strain were analyzed by three methods. The Hidden Markov Model (HMM)-based method was the most robust to variation between mutant libraries and was insensitive to data size, outperforming the Bayesian and Monte Carlo simulation-based methods. Consequently, the HMM method was used to assign fitness categories. Fitness genes, categorized as essential (ES), advantage (GA), and disadvantage (GD) genes for growth, are enriched among core genes, while nonessential (NE) genes are over-represented among accessory genes. Accessory ES/GA genes showed a lower fitness effect than core ES/GA genes. Connectivity degree in the cofitness network decreases in the order ES, GD, and GA/NE. In addition to accessory genes, 1599 of 3284 core genes display differential essentiality across the test strains. Within the pangenome core, both shared quasi-essential (ES and GA) genes and strain-dependent fitness genes are enriched in similar functional categories. Our analysis demonstrates a considerable fuzzy essential zone determined by cofitness connectivity degree in the Sinorhizobium pangenome and highlights the power of the cofitness network for understanding the genetic basis of ever-increasing prokaryotic pangenome data.
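
As a rough illustration of what "connectivity degree in the cofitness network" can mean, the Python sketch below builds a small cofitness graph and counts each gene's neighbours. It is not the authors' pipeline. The use of Pearson correlation, the 0.8 cutoff, the function names, the gene names, and the fitness values are all assumptions made for this example; the abstract does not state how edges were defined.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / denom


def cofitness_degrees(profiles, threshold=0.8):
    """Count, for each gene, how many other genes have |r| >= threshold.

    `profiles` maps a gene name to its fitness values across samples
    (for example, mutant libraries or strains). The threshold is an
    assumed cutoff, not a value taken from the paper.
    """
    degrees = {gene: 0 for gene in profiles}
    for g1, g2 in combinations(profiles, 2):
        if abs(pearson(profiles[g1], profiles[g2])) >= threshold:
            degrees[g1] += 1
            degrees[g2] += 1
    return degrees


def mean_degree_by_category(degrees, categories):
    """Average the degree over the genes in each fitness category."""
    totals = {}
    for gene, degree in degrees.items():
        cat = categories[gene]
        running_sum, count = totals.get(cat, (0, 0))
        totals[cat] = (running_sum + degree, count + 1)
    return {cat: s / c for cat, (s, c) in totals.items()}


# Hypothetical toy data: four genes, fitness measured in four samples.
profiles = {
    "geneA": [-3.0, -2.0, -4.0, -1.0],
    "geneB": [-2.9, -2.1, -3.8, -1.2],
    "geneC": [-1.5, -1.0, -2.0, -0.5],
    "geneD": [0.3, -0.2, 0.1, 0.4],
}
categories = {"geneA": "ES", "geneB": "ES", "geneC": "GA", "geneD": "NE"}

degrees = cofitness_degrees(profiles)
by_category = mean_degree_by_category(degrees, categories)
print(degrees)
print(by_category)
```

With real data, `profiles` would hold one fitness vector per gene and `categories` the HMM-assigned class (ES, GA, GD, or NE), so that the per-category means can be compared against the ordering reported above. The toy values here are too few to show that ordering and are not meant to reproduce it.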
