Yafan Zhang, Irene Silvernail, Zhuyang Lin, Xingcheng Lin
{"title":"Interpretable protein-DNA interactions captured by structure-sequence optimization.","authors":"Yafan Zhang, Irene Silvernail, Zhuyang Lin, Xingcheng Lin","doi":"10.7554/eLife.105565","DOIUrl":null,"url":null,"abstract":"<p><p>Sequence-specific DNA recognition underlies essential processes in gene regulation, yet methods for simultaneous predictions of genomic DNA recognition sites and their binding affinity remain lacking. Here, we present the Interpretable protein-DNA Energy Associative (IDEA) model, a residue-level, interpretable biophysical model capable of predicting binding sites and affinities of DNA-binding proteins. By fusing structures and sequences of known protein-DNA complexes into an optimized energy model, IDEA enables direct interpretation of physicochemical interactions among individual amino acids and nucleotides. We demonstrate that this energy model can accurately predict DNA recognition sites and their binding strengths across various protein families. Additionally, the IDEA model is integrated into a coarse-grained simulation framework that quantitatively captures the absolute protein-DNA binding free energies. Overall, IDEA provides an integrated computational platform that alleviates experimental costs and biases in assessing DNA recognition and can be utilized for mechanistic studies of various DNA-recognition processes.</p>","PeriodicalId":11640,"journal":{"name":"eLife","volume":"14 ","pages":""},"PeriodicalIF":6.4000,"publicationDate":"2025-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"eLife","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.7554/eLife.105565","RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"BIOLOGY","Score":null,"Total":0}
引用次数: 0
Abstract
Sequence-specific DNA recognition underlies essential processes in gene regulation, yet methods for simultaneous predictions of genomic DNA recognition sites and their binding affinity remain lacking. Here, we present the Interpretable protein-DNA Energy Associative (IDEA) model, a residue-level, interpretable biophysical model capable of predicting binding sites and affinities of DNA-binding proteins. By fusing structures and sequences of known protein-DNA complexes into an optimized energy model, IDEA enables direct interpretation of physicochemical interactions among individual amino acids and nucleotides. We demonstrate that this energy model can accurately predict DNA recognition sites and their binding strengths across various protein families. Additionally, the IDEA model is integrated into a coarse-grained simulation framework that quantitatively captures the absolute protein-DNA binding free energies. Overall, IDEA provides an integrated computational platform that alleviates experimental costs and biases in assessing DNA recognition and can be utilized for mechanistic studies of various DNA-recognition processes.
期刊介绍:
eLife is a distinguished, not-for-profit, peer-reviewed open access scientific journal that specializes in the fields of biomedical and life sciences. eLife is known for its selective publication process, which includes a variety of article types such as:
Research Articles: Detailed reports of original research findings.
Short Reports: Concise presentations of significant findings that do not warrant a full-length research article.
Tools and Resources: Descriptions of new tools, technologies, or resources that facilitate scientific research.
Research Advances: Brief reports on significant scientific advancements that have immediate implications for the field.
Scientific Correspondence: Short communications that comment on or provide additional information related to published articles.
Review Articles: Comprehensive overviews of a specific topic or field within the life sciences.