Geometric deep learning of protein–DNA binding specificity
Raktim Mitra, Jinsen Li, Jared M. Sagendorf, Yibei Jiang, Ari S. Cohen, Tsu-Pei Chiu, Cameron J. Glasscock, Remo Rohs
Nature Methods, published 2024-08-05
DOI: 10.1038/s41592-024-02372-w
https://www.nature.com/articles/s41592-024-02372-w
Abstract
Predicting protein–DNA binding specificity is a challenging yet essential task for understanding gene regulation. Protein–DNA complexes usually exhibit binding to a selected DNA target site, whereas a protein binds, with varying degrees of binding specificity, to a wide range of DNA sequences. This information is not directly accessible in a single structure. Here, to access this information, we present Deep Predictor of Binding Specificity (DeepPBS), a geometric deep-learning model designed to predict binding specificity from protein–DNA structure. DeepPBS can be applied to experimental or predicted structures. Interpretable protein heavy atom importance scores for interface residues can be extracted. When aggregated at the protein residue level, these scores are validated through mutagenesis experiments. Applied to designed proteins targeting specific DNA sequences, DeepPBS was demonstrated to predict experimentally measured binding specificity. DeepPBS offers a foundation for machine-aided studies that advance our understanding of molecular interactions and guide experimental designs and synthetic biology. DeepPBS is a deep-learning model designed to predict the binding specificity of protein–DNA interactions using physicochemical and geometric contexts. DeepPBS functions across protein families and on experimentally determined as well as predicted protein–DNA complex structures.
About the journal:
Nature Methods is a monthly journal that focuses on publishing innovative methods and substantial enhancements to fundamental life sciences research techniques. Geared towards a diverse, interdisciplinary readership of researchers in academia and industry engaged in laboratory work, the journal offers new tools for research and emphasizes the immediate practical significance of the featured work. It publishes primary research papers and reviews recent technical and methodological advancements, with a particular interest in primary methods papers relevant to the biological and biomedical sciences. This includes methods rooted in chemistry with practical applications for studying biological problems.