QM assisted ML for 19F NMR chemical shift prediction

Impact factor: 3.0 · Zone 3 (Biology) · JCR Q3 (Biochemistry & Molecular Biology)
Patrick Penner, Anna Vulpetti
Journal of Computer-Aided Molecular Design, 38(1). Published 2023-12-12.
DOI: 10.1007/s10822-023-00542-0
https://link.springer.com/article/10.1007/s10822-023-00542-0

Abstract

Background

Ligand-observed 19F NMR detection is an efficient method for screening libraries of fluorinated molecules in fragment-based drug design campaigns. Screening fluorinated molecules in large mixtures makes 19F NMR a high-throughput method. Typically, these mixtures are generated from pools of well-characterized fragments. By predicting 19F NMR chemical shifts, mixtures could be generated for arbitrary fluorinated molecules, facilitating, for example, focused screens.

Methods

In a previous publication, we introduced a method to predict 19F NMR chemical shift using rooted fluorine fingerprints and machine learning (ML) methods. Having observed that the quality of the prediction depends on similarity to the training set, we here propose to assist the prediction with quantum mechanics (QM) based methods in cases where compounds are not well covered by a training set.

Results

Beyond similarity to the training set, the performance of ML methods could be associated with individual features in compounds. A combination of both criteria could be used as a procedure to split input data sets into compounds that could be predicted by ML and compounds that required QM processing. We could show on a proprietary fluorinated fragment library, known as LEF (Local Environment of Fluorine), and a public Enamine data set of 19F NMR chemical shifts that ML and QM methods could synergize to outperform either method individually. Models built on Enamine data, as well as model building and QM workflow tools, can be found at https://github.com/PatrickPenner/lefshift and https://github.com/PatrickPenner/lefqm.
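
The splitting procedure described in the Results can be illustrated with a minimal sketch. It is not the authors' implementation and does not use the lefshift or lefqm code. It assumes that each compound is represented by a set of fingerprint features (a hypothetical stand-in for a rooted fluorine fingerprint), that similarity is measured with the Tanimoto coefficient, and that a compound is routed to QM when its nearest training neighbour falls below a threshold or when it carries a feature flagged as hard for ML. The threshold `0.7`, the feature labels, and the names `Compound` and `split_ml_qm` are all invented for illustration.

```python
from dataclasses import dataclass


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint features."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


@dataclass(frozen=True)
class Compound:
    name: str
    features: frozenset  # hypothetical stand-in for a rooted fluorine fingerprint


def split_ml_qm(queries, training, min_similarity=0.7, hard_features=frozenset()):
    """Split queries into (ml, qm) lists.

    A query is routed to QM when its nearest training neighbour is below
    min_similarity, or when it carries any feature in hard_features.
    Both criteria and their values are illustrative assumptions.
    """
    ml, qm = [], []
    for q in queries:
        nearest = max(
            (tanimoto(q.features, t.features) for t in training), default=0.0
        )
        flagged = bool(q.features & hard_features)
        (qm if nearest < min_similarity or flagged else ml).append(q)
    return ml, qm


# Toy data: the feature labels are made up and carry no chemical meaning here.
training = [
    Compound("train_a", frozenset({"CF3", "aryl", "para"})),
    Compound("train_b", frozenset({"F", "aryl", "ortho"})),
]
queries = [
    Compound("query_1", frozenset({"CF3", "aryl", "para"})),        # well covered
    Compound("query_2", frozenset({"F", "alkyl", "bicyclic"})),      # poorly covered
    Compound("query_3", frozenset({"F", "aryl", "ortho", "SF5"})),   # similar but flagged
]

ml, qm = split_ml_qm(
    queries, training, min_similarity=0.7, hard_features=frozenset({"SF5"})
)
print("ML:", [c.name for c in ml])
print("QM:", [c.name for c in qm])
```

In this toy run, `query_1` stays with ML because it matches a training compound exactly, `query_2` goes to QM because its best similarity is 0.2, and `query_3` goes to QM despite a similarity of 0.75 because it carries the flagged feature. In practice, both the threshold and the list of flagged features would have to be calibrated against measured prediction errors.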

Source journal: Journal of Computer-Aided Molecular Design (Biology; Computer Science, Interdisciplinary Applications)
CiteScore: 8.00
Self-citation rate: 8.60%
Articles published: 56
Review time: 3 months
About the journal: The Journal of Computer-Aided Molecular Design provides a forum for disseminating information on both the theory and the application of computer-based methods in the analysis and design of molecules. The scope of the journal encompasses papers which report new and original research and applications in the following areas: theoretical chemistry; computational chemistry; computer and molecular graphics; molecular modeling; protein engineering; drug design; expert systems; general structure-property relationships; molecular dynamics; chemical database development and usage.