The physics-AI dialogue in drug design
Pablo Andrés Vargas-Rosales and Amedeo Caflisch
RSC Medicinal Chemistry, volume 4, pages 1499-1515. Published 2025-01-23. DOI: 10.1039/D4MD00869C
Article page: https://pubs.rsc.org/en/content/articlelanding/2025/md/d4md00869c
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11788922/pdf/
Citations: 0
Abstract
A long path has led from the determination of the first protein structure in 1960 to the recent breakthroughs in protein science. Protein structure prediction and design methodologies based on machine learning (ML) were recognized with the 2024 Nobel Prize in Chemistry, but they would not have been possible without previous work and the input of many domain scientists. Challenges remain in applying ML tools to the prediction of structural ensembles and in using them within software pipelines for structure determination by crystallography or cryogenic electron microscopy. In the drug discovery workflow, ML techniques are being used in diverse areas such as the scoring of docked poses and the generation of molecular descriptors. As ML techniques become more widespread, novel applications emerge that can profit from the large amounts of data available. Nevertheless, it is essential to balance the potential advantages against the environmental costs of ML deployment when deciding if and when to apply it. For hit-to-lead optimization, ML tools can efficiently interpolate between compounds in large chemical series, but free energy calculations by molecular dynamics simulations seem to be superior for designing novel derivatives. Importantly, the potential complementarity and/or synergism of physics-based methods (e.g., force field-based simulation models) and data-hungry ML techniques is growing strongly. Current ML methods have evolved from decades of research. It is now necessary for biologists, physicists, and computer scientists to fully understand the advantages and limitations of ML techniques to ensure that the complementarity of physics-based methods and ML tools can be fully exploited for drug design.