Simpler Protein Domain Identification Using Spectral Clustering
Authors: Frédéric Cazals, Jules Herrmann, Edoardo Sarti
Journal: Proteins: Structure, Function, and Bioinformatics, pp. 1212-1225
DOI: 10.1002/prot.26808 (https://doi.org/10.1002/prot.26808)
Published: 2025-07-01 (Epub 2025-02-13)
Abstract
Decomposing a biomolecular complex into domains is an important step in investigating biological function and easing structure determination. A successful approach is the SPECTRUS algorithm, which segments a structure by applying spectral clustering to a graph encoding inter-atomic fluctuations derived from an elastic network model. We present SPECTRALDOM, which makes three straightforward and useful additions to SPECTRUS. First, for single structures, we show that high-quality partitionings can be obtained from a graph Laplacian derived from pairwise interactions, without computing normal modes. Second, for sets of homologous structures, we introduce a Multiple Sequence Alignment (MSA) mode that exploits both the sequence-based information of the MSA and the geometric information embodied in experimental structures. Third, we propose to analyze the resulting clusters/domains with the so-called D-family-matching algorithm, which establishes a correspondence between the domains yielded by two decompositions and can be used to handle fragmentation issues. Our domains compare favorably to those of the original SPECTRUS and to those of the deep-learning-based method Chainsaw. Using two complex cases, we show in particular that SPECTRALDOM is the only method that handles complex conformational changes involving several sub-domains. Finally, a comparison of SPECTRALDOM and Chainsaw against the manually curated domain classification ECOD, used as a reference, shows that high-quality domains are obtained without using any evolutionary information. SPECTRALDOM is provided in the Structural Bioinformatics Library; see http://sbl.inria.fr and https://sbl.inria.fr/doc/Spectral_domain_explorer-user-manual.html.
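To make the idea of a partitioning obtained from a graph Laplacian built on pairwise interactions more concrete, the sketch below runs generic normalized spectral clustering on a residue contact graph. It is not the SPECTRALDOM implementation: the Gaussian affinity, the `cutoff` and `sigma` values, the function names, and the use of one representative point per residue are all assumptions made for illustration, and the number of domains `k` is supplied by the caller.

```python
import numpy as np

def contact_affinity(coords, cutoff=8.0, sigma=4.0):
    """Assumed affinity: w_ij = exp(-d_ij^2 / (2 sigma^2)) if d_ij <= cutoff, else 0."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    weights = np.exp(-dist ** 2 / (2.0 * sigma ** 2))
    weights[dist > cutoff] = 0.0
    np.fill_diagonal(weights, 0.0)  # no self-loops
    return weights

def spectral_embedding(weights, k):
    """Row-normalized eigenvectors of the symmetric normalized Laplacian
    L = I - D^{-1/2} W D^{-1/2} for its k smallest eigenvalues."""
    degree = weights.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(degree, 1e-12))
    laplacian = np.eye(len(weights)) - inv_sqrt[:, None] * weights * inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(laplacian)  # eigenvalues come back in ascending order
    emb = vecs[:, :k]
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    return emb / np.maximum(norms, 1e-12)

def kmeans(points, k, n_iter=50):
    """Lloyd's k-means with a deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        d = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels

def segment_domains(coords, k, cutoff=8.0, sigma=4.0):
    """Assign one of k domain labels to each residue (one 3D point per residue)."""
    weights = contact_affinity(np.asarray(coords, dtype=float), cutoff, sigma)
    return kmeans(spectral_embedding(weights, k), k)
```

With an `(n, 3)` array of representative coordinates (for example, one C-alpha position per residue), `segment_domains(coords, k=2)` returns an integer label per residue. No normal-mode computation is involved; the only inputs are pairwise distances, which is the spirit of the single-structure mode described above.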
About the journal:
PROTEINS: Structure, Function, and Bioinformatics publishes original reports of significant experimental and analytic research in all areas of protein research: structure, function, computation, genetics, and design. The journal encourages reports that present new experimental or computational approaches for interpreting and understanding data from biophysical chemistry, structural studies of proteins and macromolecular assemblies, alterations of protein structure and function engineered through techniques of molecular biology and genetics, functional analyses under physiologic conditions, as well as the interactions of proteins with receptors, nucleic acids, or other specific ligands or substrates. Research in protein and peptide biochemistry directed toward synthesizing or characterizing molecules that simulate aspects of the activity of proteins, or that act as inhibitors of protein function, is also within the scope of PROTEINS. In addition to full-length reports, short communications (usually not more than 4 printed pages) and prediction reports are welcome. Reviews are typically by invitation; authors are encouraged to submit proposed topics for consideration.