Rapid Assessment of Virtually Synthesizable Chemical Structures via Support Vector Machine Models.

IF 2.8 | Zone 4, Medicine | JCR Q3, CHEMISTRY, MEDICINAL
Yuto Iwasaki, Tomoyuki Miyao
Journal: Molecular Informatics, vol. 44, no. 7, e202500039 (published 2025-07-01)
DOI: https://doi.org/10.1002/minf.70000
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12278806/pdf/
Publication type: Journal Article
Citations: 0

Abstract

Support vector machine (SVM) and support vector regression (SVR) are widely used to build quantitative structure-activity relationship models for small- and medium-sized datasets. Although SVM and SVR models predict compound activity efficiently, evaluating billions of molecules remains challenging, a situation that can arise when screening virtual molecules derived through virtual synthesis. Herein, we present an SVM-/SVR-based method for screening virtually synthesizable molecules based on their reactants. The proposed method employs a combination of reactant-wise kernel functions for fast evaluation without sacrificing prediction accuracy. Tested on 120 small-molecule activity datasets against 10 macromolecular targets, the proposed SVR models with data augmentation performed on par with standard SVR models using the Tanimoto kernel. As a demonstration, an exhaustive set of 6.4 × 10^12 reactant combinations was evaluated by an SVR model within 8 days on a single desktop computer, enabling large-scale screening without sampling.
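To illustrate why reactant-wise kernels can make exhaustive screening tractable, here is a minimal sketch. The abstract does not say how the reactant-wise kernels are combined, so the sketch assumes a two-reactant reaction and a product of two Tanimoto kernels, one per reactant. The function names, the array layout, and the inputs `alpha` (dual coefficients) and `bias` are hypothetical and are not taken from the paper. Under that assumption, the prediction for every reactant pair reduces to one matrix product, so kernel evaluations scale with the number of reactants rather than the number of products.

```python
import numpy as np


def tanimoto(X, Y):
    """Tanimoto similarity between rows of two binary fingerprint matrices."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    inter = X @ Y.T  # shared on-bits for every row pair
    union = X.sum(axis=1)[:, None] + Y.sum(axis=1)[None, :] - inter
    out = np.zeros_like(inter)
    # Similarity is left at 0 where both fingerprints are empty.
    np.divide(inter, union, out=out, where=union > 0)
    return out


def score_all_pairs(alpha, bias, sv_a, sv_b, cand_a, cand_b):
    """Score every (reactant A, reactant B) pair with an assumed product kernel.

    f(a, b) = sum_i alpha_i * k(a_i, a) * k(b_i, b) + bias
    sv_a / sv_b hold the two reactants of each support vector (hypothetical layout).
    """
    ka = tanimoto(sv_a, cand_a)  # shape (n_sv, n_a)
    kb = tanimoto(sv_b, cand_b)  # shape (n_sv, n_b)
    # One matrix product yields the full (n_a, n_b) score table.
    return (ka * alpha[:, None]).T @ kb + bias


def score_pair_naive(alpha, bias, sv_a, sv_b, a, b):
    """Reference implementation for a single pair, used only for checking."""
    ka = tanimoto(sv_a, a[None, :])[:, 0]
    kb = tanimoto(sv_b, b[None, :])[:, 0]
    return float(np.sum(alpha * ka * kb) + bias)
```

With n_a and n_b candidate reactants, this computes n_sv × (n_a + n_b) kernel values instead of n_sv × n_a × n_b. In practice the score table would be processed in blocks rather than held in memory at once.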

Source journal
Molecular Informatics (CHEMISTRY, MEDICINAL; MATHEMATICAL & COMPUTATIONAL BIOLOGY)
CiteScore: 7.30
Self-citation rate: 2.80%
Articles published: 70
Review time: 3 months
About the journal: Molecular Informatics is a peer-reviewed, international forum for publication of high-quality, interdisciplinary research on all molecular aspects of bio/cheminformatics and computer-assisted molecular design. Molecular Informatics succeeded QSAR & Combinatorial Science in 2010. Molecular Informatics presents methodological innovations that will lead to a deeper understanding of ligand-receptor interactions, macromolecular complexes, molecular networks, design concepts and processes that demonstrate how ideas and design concepts lead to molecules with a desired structure or function, preferably including experimental validation. The journal's scope includes but is not limited to the fields of drug discovery and chemical biology, protein and nucleic acid engineering and design, the design of nanomolecular structures, strategies for modeling of macromolecular assemblies, molecular networks and systems, pharmaco- and chemogenomics, computer-assisted screening strategies, as well as novel technologies for the de novo design of biologically active molecules. As a unique feature, Molecular Informatics publishes so-called "Methods Corner" review-type articles, which feature important technological concepts and advances within the scope of the journal.