Data-Driven Prediction of Enantioselectivity for the Sharpless Asymmetric Dihydroxylation: Model Development and Experimental Validation

IF 10.4 | Zone 1, Chemistry | JCR Q1, Chemistry, Multidisciplinary
Blake E. Ocampo, Bilal Altundas, Matthew J. Bock, Sara Feiz, and Scott E. Denmark*
ACS Central Science, 11 (9), 1640–1650. Published 2025-07-29. DOI: 10.1021/acscentsci.5c00900
https://pubs.acs.org/doi/10.1021/acscentsci.5c00900
Citations: 0

Abstract


The Sharpless asymmetric dihydroxylation remains a key transformation in chemical synthesis, yet its success hides unexpected cases of lower selectivity. A chemoinformatic workflow was developed to allow data-driven analysis of the reaction. A database of 1007 reactions employing AD-mix α and β was curated from the literature, and an alignment-dependent, fragment-based featurization of alkenes was implemented for modeling. This platform converged on machine learning models capable of predicting the magnitude of enantioselectivity for multiple alkene classes, achieving Q²F3 values ≥ 0.8, test r² values ≥ 0.7, and mean absolute errors (MAE) ≤ 0.3 kcal/mol. The features of alkenes contributing to model performance were assessed with SHapley Additive exPlanations (SHAP) analysis to gather insight into factors underlying predictions. Experimental validation demonstrated that the models could achieve meaningful predictions on out-of-sample alkenes.
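
The abstract reports errors in kcal/mol, which suggests that selectivity is modeled as a free-energy difference rather than as raw % ee. A minimal sketch of that conversion and of two of the quoted metrics follows. It is an illustration only, not the authors' code: the function names are hypothetical, the 0 °C default temperature is an assumption (the abstract does not state one), and the F3 variant of the external-validation Q² is written in its commonly used form, 1 − [PRESS_test / n_test] / [TSS_train / n_train].

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def ee_to_ddg(ee_percent, temp_k=273.15):
    """Convert % ee to a free-energy difference in kcal/mol.

    Uses ddG = R * T * ln(er), with er = (1 + ee) / (1 - ee).
    temp_k defaults to 0 degrees C, which is an assumption made here.
    """
    ee = ee_percent / 100.0
    if not -1.0 < ee < 1.0:
        raise ValueError("ee must lie strictly between -100 and 100")
    er = (1.0 + ee) / (1.0 - ee)
    return R_KCAL * temp_k * math.log(er)


def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def q2_f3(y_train, y_test, y_pred):
    """External-validation Q2 (F3 form).

    The test-set squared error is averaged over the test set, and the
    reference variance is taken around the training mean and averaged
    over the training set.
    """
    mean_train = sum(y_train) / len(y_train)
    press = sum((t - p) ** 2 for t, p in zip(y_test, y_pred)) / len(y_test)
    tss = sum((t - mean_train) ** 2 for t in y_train) / len(y_train)
    return 1.0 - press / tss


if __name__ == "__main__":
    # Toy numbers for illustration; these are not data from the paper.
    train = [ee_to_ddg(e) for e in (50, 80, 95, 99)]
    test = [ee_to_ddg(e) for e in (70, 90)]
    pred = [ee_to_ddg(e) for e in (75, 88)]
    print("MAE (kcal/mol):", round(mae(test, pred), 3))
    print("Q2_F3:", round(q2_f3(train, test, pred), 3))
```

Because the denominator of this Q² variant depends only on the training set, values computed for test sets of different size or spread remain comparable, which is one common reason for preferring it over a plain test-set r².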

A data-driven approach was designed to analyze the Sharpless asymmetric dihydroxylation for insight into the factors driving enantioselectivity, and high-performing models were experimentally validated.

Source journal: ACS Central Science (Chemical Engineering: General Chemical Engineering)
CiteScore: 25.50
Self-citation rate: 0.50%
Articles published: 194
Review time: 10 weeks
About the journal: ACS Central Science publishes significant primary reports on research in chemistry and allied fields where chemical approaches are pivotal. As the first fully open-access journal by the American Chemical Society, it covers compelling and important contributions to the broad chemistry and scientific community. "Central science," a term popularized nearly 40 years ago, emphasizes chemistry's central role in connecting physical and life sciences, and fundamental sciences with applied disciplines like medicine and engineering. The journal focuses on exceptional quality articles, addressing advances in fundamental chemistry and interdisciplinary research.