Defining structural and evolutionary modules in proteins: a community detection approach to explore sub-domain architecture

IF 2.222 Q3 Biochemistry, Genetics and Molecular Biology
Jose Sergio Hleap, Edward Susko, Christian Blouin
Journal: BMC Structural Biology, volume 13. Published 2013-10-16.
DOI: 10.1186/1472-6807-13-20
URL: https://link.springer.com/article/10.1186/1472-6807-13-20
Citations: 16

Abstract

Assessing protein modularity is important for understanding protein evolution. Still, the question of whether a sub-domain modular architecture exists remains open. We propose a graph-theory approach with significance and power testing to identify modules in protein structures. In the first step, clusters are determined by finding the partition that maximizes the modularity score. Second, each cluster is tested for significance. Significant clusters are referred to as modules. Evolutionary modules are identified by analyzing homologous structures. Dynamic modules are inferred from sets of snapshots of molecular simulations. We present here a methodology that identifies sub-domain architecture in a way that is robust, biologically meaningful, and statistically supported.
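As an illustration of the first step, the sketch below computes the standard Newman modularity score Q of a given partition of a weighted, undirected graph. The abstract does not specify which modularity definition or optimizer the authors use, so the formula choice, the function name `modularity`, and the toy graph are assumptions made only for this example. It scores a partition; it does not search for the best one.

```python
def modularity(weights, labels):
    """Newman modularity Q of a partition of a weighted undirected graph.

    weights: symmetric square list of lists with non-negative entries
             and a zero diagonal (hypothetical input format).
    labels:  labels[i] is the cluster id assigned to node i.
    """
    n = len(weights)
    strength = [sum(row) for row in weights]  # weighted degree k_i
    two_m = sum(strength)                     # total edge weight, counted twice
    if two_m == 0:
        return 0.0
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                # observed weight minus the weight expected by chance
                q += weights[i][j] - strength[i] * strength[j] / two_m
    return q / two_m


# Toy graph (hypothetical): two triangles {0,1,2} and {3,4,5} joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n_nodes = 6
weights = [[0.0] * n_nodes for _ in range(n_nodes)]
for i, j in edges:
    weights[i][j] = weights[j][i] = 1.0

q_two_clusters = modularity(weights, [0, 0, 0, 1, 1, 1])
q_one_cluster = modularity(weights, [0, 0, 0, 0, 0, 0])
print(q_two_clusters, q_one_cluster)
```

Splitting the toy graph into its two triangles scores higher than lumping all nodes together, which is the property a modularity-maximizing search exploits.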

The robustness of this new method is tested using simulated data with known modularity. Modules are correctly identified even when there is a low correlation between landmarks within a module. We also analyzed the evolutionary modularity of a data set of α-amylase catalytic domain homologs, and the dynamic modularity of the Niemann-Pick C1 (NPC1) protein N-terminal domain.
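To make the idea of simulated data with known modularity concrete, here is a minimal sketch that generates landmarks whose within-module correlation is deliberately low. The generative model (one shared Gaussian factor per module plus independent noise), the `loading` parameter, and all function names are assumptions for illustration; the abstract does not describe the authors' actual simulation design.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_landmarks(module_of, n_samples, loading, rng):
    """Return one list of n_samples values per landmark.

    Each landmark is loading * (factor of its module) + unit Gaussian noise,
    so landmarks in the same module are correlated and others are not.
    """
    n_modules = max(module_of) + 1
    data = [[] for _ in module_of]
    for _ in range(n_samples):
        factors = [rng.gauss(0.0, 1.0) for _ in range(n_modules)]
        for k, m in enumerate(module_of):
            data[k].append(loading * factors[m] + rng.gauss(0.0, 1.0))
    return data


rng = random.Random(0)            # fixed seed for reproducibility
module_of = [0, 0, 0, 1, 1, 1]    # known (planted) module of each landmark
data = simulate_landmarks(module_of, n_samples=2000, loading=0.6, rng=rng)

within, between = [], []
for i in range(len(module_of)):
    for j in range(i + 1, len(module_of)):
        r = pearson(data[i], data[j])
        (within if module_of[i] == module_of[j] else between).append(r)

mean_within = sum(within) / len(within)
mean_between = sum(between) / len(between)
print(mean_within, mean_between)
```

Under this assumed model the expected within-module correlation is loading² / (loading² + 1), so a smaller `loading` gives a harder recovery problem of the kind the robustness test refers to.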

The α-amylase contains an (α/β)₈ barrel (TIM barrel) with the polysaccharide cleavage site and a calcium-binding domain. In this data set we identified four robust evolutionary modules, one of which forms the minimal functional TIM barrel topology.

The NPC1 protein is involved in intracellular lipid metabolism, coordinating sterol trafficking. The NPC1 N-terminus is the first luminal domain, which binds cholesterol and its oxygenated derivatives. Our inferred dynamic modules in NPC1 are also shown to match functional components of the protein related to NPC1 disease.

Domain compartmentalization can be found and described in correlation space. To our knowledge, no other method attempts to identify sub-domain architecture from the correlation among residues. Most previous attempts focus on sequence motifs of protein-protein interactions, binding sites, or sequence conservation. We were able to describe functional/structural sub-domain architecture related to key residues for starch cleavage and to the calcium and chloride binding sites in the α-amylase, and to sterol opening-defining modules and disease-related residues in NPC1. We also described the evolutionary sub-domain architecture of the α-amylase catalytic domain, identifying the already reported minimum functional TIM barrel.
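The significance step mentioned in the abstract can be pictured with a generic permutation test in correlation space. This is not the authors' test, which the abstract does not describe: the test statistic (mean absolute within-cluster correlation), the null model (random residue sets of the same size), the function names, and the toy matrix are all assumptions for this sketch.

```python
import random
from itertools import combinations


def mean_abs_corr(corr, members):
    """Mean absolute correlation over all pairs of cluster members."""
    pairs = list(combinations(members, 2))
    return sum(abs(corr[i][j]) for i, j in pairs) / len(pairs)


def cluster_p_value(corr, members, n_perm, rng):
    """Permutation p-value: how often does a random set of the same size
    show a mean absolute correlation at least as high as the cluster's?"""
    observed = mean_abs_corr(corr, members)
    nodes = range(len(corr))
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(nodes, len(members))
        if mean_abs_corr(corr, sample) >= observed:
            hits += 1
    # add-one correction keeps the estimate strictly positive
    return (hits + 1) / (n_perm + 1)


# Toy correlation matrix (hypothetical): residues 0-3 are tightly correlated,
# every other pair is only weakly correlated.
n = 10
corr = [[1.0 if i == j else (0.8 if i < 4 and j < 4 else 0.05)
         for j in range(n)] for i in range(n)]

p_tight = cluster_p_value(corr, [0, 1, 2, 3], n_perm=999, rng=random.Random(1))
p_loose = cluster_p_value(corr, [0, 5, 7, 9], n_perm=999, rng=random.Random(1))
print(p_tight, p_loose)
```

A cluster whose p-value falls below a chosen threshold would be called a module in the abstract's terminology; the threshold and any multiple-testing correction are left open here.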

Source journal
BMC Structural Biology (category: Biology, Biophysics)
CiteScore: 3.60. Self-citation rate: 0.00%. Articles published: 0.
Journal description: BMC Structural Biology is an open access, peer-reviewed journal that considers articles on investigations into the structure of biological macromolecules, including solving structures, structural and functional analyses, and computational modeling.