Valentin Lombard, Dan Timsit, Sergei Grudinin, Elodie Laine
Journal: Structure (impact factor 4.3; JCR Q2, Biochemistry & Molecular Biology)
Publication date: 2025-07-18
DOI: 10.1016/j.str.2025.06.010 (https://doi.org/10.1016/j.str.2025.06.010)
SeaMoon: From protein language models to continuous structural heterogeneity
How proteins move and deform determines their interactions with the environment and is thus of the utmost importance for cellular functioning. Following the revolution in single-protein 3D structure prediction, researchers have focused on repurposing or developing deep learning models for sampling alternative protein conformations. In this work, we explored whether continuous compact representations of protein motions could be predicted directly from sequences, without exploiting 3D structures. SeaMoon leverages protein language model (pLM) embeddings as input to a lightweight convolutional neural network. We assessed SeaMoon against ∼1,000 collections of experimental conformations exhibiting diverse motions. It predicts at least one ground-truth motion with reasonable accuracy for 40% of the test proteins. SeaMoon captures motions inaccessible to normal mode analysis, an unsupervised physics-based method relying solely on 3D geometry, and generalizes to proteins without detectable sequence similarity to the training set. SeaMoon is easily retrainable with novel or updated pLMs.
About the journal:
Structure aims to publish papers of exceptional interest in the field of structural biology. The journal strives to be essential reading for structural biologists, as well as biologists and biochemists who are interested in macromolecular structure and function. Structure strongly encourages the submission of manuscripts that present structural and molecular insights into biological function and mechanism. Other reports that address fundamental questions in structural biology, such as structure-based examinations of protein evolution, folding, and/or design, will also be considered. We will consider the application of any method, experimental or computational, at high or low resolution, to conduct structural investigations, as long as the method is appropriate for the biological, functional, and mechanistic question(s) being addressed. Likewise, reports describing single-molecule analysis of biological mechanisms are welcome.
In general, the editors encourage submission of experimental structural studies that are enriched by an analysis of structure-activity relationships and will not consider studies that solely report structural information unless the structure or analysis is of exceptional and broad interest. Studies reporting only homology models, de novo models, or molecular dynamics simulations are also discouraged unless the models are informed by or validated by novel experimental data; rationalization of a large body of existing experimental evidence and making testable predictions based on a model or simulation is often not considered sufficient.