QM assisted ML for 19F NMR chemical shift prediction
Patrick Penner, Anna Vulpetti
Journal of Computer-Aided Molecular Design 38(1). Published 2023-12-12.
DOI: 10.1007/s10822-023-00542-0
https://link.springer.com/article/10.1007/s10822-023-00542-0
Abstract
Background
Ligand-observed 19F NMR detection is an efficient method for screening libraries of fluorinated molecules in fragment-based drug design campaigns. Screening fluorinated molecules in large mixtures makes 19F NMR a high-throughput method. Typically, these mixtures are generated from pools of well-characterized fragments. Predicting 19F NMR chemical shifts would allow mixtures to be generated for arbitrary fluorinated molecules, facilitating, for example, focused screens.
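To illustrate why predicted shifts make mixture design possible, the following is a minimal sketch and not the authors' procedure. It assumes that a mixture is usable when no two predicted 19F shifts in it lie closer than a chosen separation; the function name `assign_mixtures`, the 0.5 ppm default, and the size limit are hypothetical values picked for illustration only.

```python
def assign_mixtures(predicted_shifts, min_separation_ppm=0.5, max_size=30):
    """Greedily group compounds so that predicted 19F shifts do not overlap.

    predicted_shifts: dict mapping compound name -> predicted shift in ppm.
    min_separation_ppm and max_size are illustrative assumptions, not values
    taken from the paper.
    """
    mixtures = []  # each mixture is a list of (name, shift) tuples
    # Process compounds in order of shift so the result is deterministic.
    for name, shift in sorted(predicted_shifts.items(), key=lambda kv: kv[1]):
        for mixture in mixtures:
            has_room = len(mixture) < max_size
            resolved = all(
                abs(shift - other) >= min_separation_ppm for _, other in mixture
            )
            if has_room and resolved:
                mixture.append((name, shift))
                break
        else:
            # No existing mixture can take this compound: open a new one.
            mixtures.append([(name, shift)])
    return mixtures
```

A first-fit greedy pass like this does not guarantee the smallest possible number of mixtures, but it shows the dependency: the grouping can only be as reliable as the shift predictions fed into it.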
Methods
In a previous publication, we introduced a method to predict 19F NMR chemical shift using rooted fluorine fingerprints and machine learning (ML) methods. Having observed that the quality of the prediction depends on similarity to the training set, we here propose to assist the prediction with quantum mechanics (QM) based methods in cases where compounds are not well covered by a training set.
Results
Beyond similarity, the performance of ML methods could be associated with individual features in compounds. A combination of both could be used as a procedure to split input data sets into those that could be predicted by ML and those that required QM processing. We could show on a proprietary fluorinated fragment library, known as LEF (Local Environment of Fluorine), and a public Enamine data set of 19F NMR chemical shifts that ML and QM methods could synergize to outperform either method individually. Models built on Enamine data, as well as model building and QM workflow tools, can be found at https://github.com/PatrickPenner/lefshift and https://github.com/PatrickPenner/lefqm.
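The split described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the API of lefshift or lefqm: fingerprints are modeled as plain sets of integer feature ids, similarity is Tanimoto similarity to the nearest training compound, and the names `route_compounds`, `min_similarity`, and `hard_features` as well as the 0.7 threshold are hypothetical. The abstract does not state which similarity cutoff or which individual features the authors used.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two sets of fingerprint feature ids."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def route_compounds(queries, training_fps, min_similarity=0.7,
                    hard_features=frozenset()):
    """Split query compounds into an ML batch and a QM batch.

    queries: dict mapping compound name -> set of feature ids.
    training_fps: iterable of feature-id sets for the training compounds.
    hard_features: feature ids assumed (hypothetically) to be poorly
    predicted by the ML model, regardless of similarity.
    """
    training_fps = list(training_fps)
    ml_batch, qm_batch = [], []
    for name, fp in queries.items():
        # Similarity to the closest training compound; 0.0 if there is none.
        nearest = max((tanimoto(fp, t) for t in training_fps), default=0.0)
        well_covered = nearest >= min_similarity
        has_hard_feature = bool(fp & hard_features)
        if well_covered and not has_hard_feature:
            ml_batch.append(name)
        else:
            qm_batch.append(name)
    return ml_batch, qm_batch
```

Combining both criteria in a single predicate mirrors the abstract's point that similarity alone is not the whole picture: a compound close to the training set can still be sent to QM if it carries a feature associated with poor ML performance.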
About the journal:
The Journal of Computer-Aided Molecular Design provides a forum for disseminating information on both the theory and the application of computer-based methods in the analysis and design of molecules. The scope of the journal encompasses papers which report new and original research and applications in the following areas:
- theoretical chemistry;
- computational chemistry;
- computer and molecular graphics;
- molecular modeling;
- protein engineering;
- drug design;
- expert systems;
- general structure-property relationships;
- molecular dynamics;
- chemical database development and usage.