Alexander M Ille, Emily Anas, Michael B Mathews, Stephen K Burley
From sequence to protein structure and conformational dynamics with artificial intelligence/machine learning
Structural Dynamics 12(3), 030902. Journal article. Published 2025-06-24 (eCollection, e-pub date 2025-05-01).
DOI: https://doi.org/10.1063/4.0000765
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12195464/pdf/
Impact factor: 2.3; JCR: Q3 (CHEMISTRY, PHYSICAL); Region 2 (Physics and Astrophysics)
Citations: 0
Abstract
The 2024 Nobel Prize in Chemistry was awarded in part for de novo protein structure prediction using AlphaFold2, an artificial intelligence/machine learning (AI/ML) model trained on vast amounts of sequence and three-dimensional structure data. AlphaFold2 and related models, including RoseTTAFold and ESMFold, employ specialized neural network architectures driven by attention mechanisms to infer relationships between sequence and structure. At a fundamental level, these AI/ML models operate on the long-standing hypothesis that the structure of a protein is determined by its amino acid sequence. More recently, AlphaFold2 has been adapted for the prediction of multiple protein conformations by subsampling multiple sequence alignments. Herein, we provide an overview of the deterministic relationship between sequence and structure, which was hypothesized over half a century ago with profound implications for the biological sciences ever since. We postulate that protein conformational dynamics are also determined, at least in part, by amino acid sequence and that this relationship may be leveraged for construction of AI/ML models dedicated to predicting protein conformational ensembles. Accordingly, we describe a conceptual model architecture, which may be trained on sequence data in combination with conformationally sensitive structural information, coming primarily from nuclear magnetic resonance (NMR) spectroscopy. Notwithstanding certain limitations in this context, NMR offers abundant structural heterogeneity conducive to conformational ensemble prediction. As NMR and other data continue to accumulate, sequence-informed prediction of protein structural dynamics with AI/ML has the potential to emerge as a transformative capability across the biological sciences.
Journal: Structural Dynamics-Us
Categories: CHEMISTRY, PHYSICAL; PHYSICS, ATOMIC, MOLECULAR & CHEMICAL
CiteScore: 5.50
Self-citation rate: 3.60%
Articles published: 24
Review time: 16 weeks
About the journal:
Structural Dynamics focuses on recent developments in experimental and theoretical methods and techniques that allow real-time visualization of electronic and geometric structural changes in chemical, biological, and condensed-matter systems. The community of scientists and engineers working on structural dynamics in such diverse systems often uses similar instrumentation and methods.
The journal welcomes articles dealing with fundamental problems of electronic and structural dynamics that are tackled by new methods, such as:
Time-resolved X-ray and electron diffraction and scattering,
Coherent diffractive imaging,
Time-resolved X-ray spectroscopies (absorption, emission, resonant inelastic scattering, etc.),
Time-resolved electron energy loss spectroscopy (EELS) and electron microscopy,
Time-resolved photoelectron spectroscopies (UPS, XPS, ARPES, etc.),
Multidimensional spectroscopies in the infrared, the visible and the ultraviolet,
Nonlinear spectroscopies in the VUV, the soft and the hard X-ray domains,
Theory and computational methods and algorithms for the analysis and description of structural dynamics and their associated experimental signals.
These new methods are enabled by new instrumentation, such as:
X-ray free electron lasers, which provide flux, coherence, and time resolution,
New sources of ultrashort electron pulses,
New sources of ultrashort vacuum ultraviolet (VUV) to hard X-ray pulses, such as high-harmonic generation (HHG) sources or plasma-based sources,
New sources of ultrashort infrared and terahertz (THz) radiation,
New detectors for X-rays and electrons,
New sample handling and delivery schemes,
New computational capabilities.