Challenging Reaction Prediction Models to Generalize to Novel Chemistry

IF 12.7; Zone 1 (Chemistry); JCR Q1, CHEMISTRY, MULTIDISCIPLINARY

ACS Central Science Pub Date: 2025-03-12 DOI: 10.1021/acscentsci.5c00055

John Bradshaw, Anji Zhang, Babak Mahjour, David E. Graff, Marwin H. S. Segler and Connor W. Coley*

ACS Central Science, volume 11, issue 4, pages 539–549. Journal Article. Full text: https://pubs.acs.org/doi/10.1021/acscentsci.5c00055

Citations: 0

Abstract

Deep learning models for anticipating the products of organic reactions have found many use cases, including validating retrosynthetic pathways and constraining synthesis-based molecular design tools. Despite compelling performance on popular benchmark tasks, strange and erroneous predictions sometimes ensue when using these models in practice. The core issue is that common benchmarks test models in an in-distribution setting, whereas many real-world uses for these models are in out-of-distribution settings and require a greater degree of extrapolation. To better understand how current reaction predictors work in out-of-distribution domains, we report a series of more challenging evaluations of a prototypical SMILES-based deep learning model. First, we illustrate how performance on randomly sampled data sets is overly optimistic compared to performance when generalizing to new patents or new authors. Second, we conduct time splits that evaluate how models perform when tested on reactions published years after those in their training set, mimicking real-world deployment. Finally, we consider extrapolation across reaction classes to reflect what would be required for the discovery of novel reaction types. This panel of tasks can reveal the capabilities and limitations of today’s reaction predictors, acting as a crucial first step in the development of tomorrow’s next-generation models capable of reaction discovery.

Despite excellent benchmark performance, ML models for reaction prediction can struggle on real-world data. We evaluate these limitations by challenging a model on different out-of-distribution tasks.
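
As a rough illustration of the evaluation settings named in the abstract, the following Python sketch contrasts a random split with a split grouped by source document, a time split, and a held-out reaction class. It is a minimal sketch under stated assumptions, not the authors' procedure: the toy records, the field names (`smiles`, `patent`, `year`, `rxn_class`), and all function names are hypothetical and chosen only for this example.

```python
import random
from collections import defaultdict

# Hypothetical toy records; the schema is an assumption made for illustration.
REACTIONS = [
    {"smiles": "CCO.CC(=O)O>>CC(=O)OCC", "patent": "P1", "year": 2008, "rxn_class": "esterification"},
    {"smiles": "CCO.CCC(=O)O>>CCC(=O)OCC", "patent": "P1", "year": 2008, "rxn_class": "esterification"},
    {"smiles": "CN.CC(=O)Cl>>CC(=O)NC", "patent": "P2", "year": 2011, "rxn_class": "amidation"},
    {"smiles": "CCN.CC(=O)Cl>>CC(=O)NCC", "patent": "P2", "year": 2011, "rxn_class": "amidation"},
    {"smiles": "CCBr.[OH-]>>CCO", "patent": "P3", "year": 2014, "rxn_class": "substitution"},
    {"smiles": "CCCBr.[OH-]>>CCCO", "patent": "P4", "year": 2016, "rxn_class": "substitution"},
]


def random_split(records, test_fraction, seed=0):
    """In-distribution baseline: shuffle individual reactions, then cut."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def group_split(records, key, test_fraction, seed=0):
    """Hold out whole groups (e.g. patents or authors) so that no group
    contributes reactions to both the training and the test set."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    group_ids = sorted(groups)
    random.Random(seed).shuffle(group_ids)
    n_test = max(1, round(len(group_ids) * test_fraction))
    test_ids = set(group_ids[:n_test])
    train = [r for r in records if r[key] not in test_ids]
    test = [r for r in records if r[key] in test_ids]
    return train, test


def time_split(records, cutoff_year, gap_years=0):
    """Train on reactions up to cutoff_year; test on reactions published
    more than gap_years after the cutoff."""
    train = [r for r in records if r["year"] <= cutoff_year]
    test = [r for r in records if r["year"] > cutoff_year + gap_years]
    return train, test


def class_holdout_split(records, held_out_class):
    """Remove one reaction class from training and test only on that class."""
    train = [r for r in records if r["rxn_class"] != held_out_class]
    test = [r for r in records if r["rxn_class"] == held_out_class]
    return train, test
```

With a random split, near-duplicate reactions from the same patent can land on both sides of the cut, which is one plausible reason such scores look optimistic. The grouped, time-based, and class-based variants remove that overlap in progressively stricter ways.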

Source journal: ACS Central Science (Chemical Engineering-General Chemical Engineering)

CiteScore: 25.50

Self-citation rate: 0.50%

Articles published: 194

Review time: 10 weeks

Journal description: ACS Central Science publishes significant primary reports on research in chemistry and allied fields where chemical approaches are pivotal. As the first fully open-access journal by the American Chemical Society, it covers compelling and important contributions to the broad chemistry and scientific community. "Central science," a term popularized nearly 40 years ago, emphasizes chemistry's central role in connecting physical and life sciences, and fundamental sciences with applied disciplines like medicine and engineering. The journal focuses on exceptional quality articles, addressing advances in fundamental chemistry and interdisciplinary research.