Deep electron cloud-activity and field-activity relationships

IF 2.3 | Zone 4, Chemistry | Q1 SOCIAL WORK
Lu Xu, Qin Yang
Journal of Chemometrics. Published 2023-06-22. DOI: 10.1002/cem.3503
https://onlinelibrary.wiley.com/doi/10.1002/cem.3503
Citation count: 0

Abstract

Chemists have long sought general mathematical laws to explain and predict molecular properties. However, most traditional quantitative structure-activity relationship (QSAR) models have limited application domains; for example, they tend to generalize poorly to molecules whose parent structures differ from those of the training molecules. This paper attempts to develop a new QSAR method that is, in principle, able to predict various properties of molecules with diverse structures. The proposed deep electron cloud-activity relationships (DECAR) and deep field-activity relationships (DFAR) methods rest on three essentials: (1) a large number of molecules with activity data as training objects and responses; (2) three-dimensional electron cloud density (ECD) or related field data, computed by accurate density functional theory methods, as input descriptors; and (3) a deep learning model that is sufficiently flexible and powerful to learn from such large data. DECAR and DFAR are used to distinguish 977 sweet from 1965 non-sweet molecules (with 6-fold data augmentation), and their classification performance is shown to be significantly better than that of traditional least squares support vector machine (LS-SVM) models using traditional descriptors. DECAR and DFAR offer a possible route to a widely applicable, cumulative, and shareable artificial intelligence-driven QSAR system. They are likely to promote the development of an interactive platform for collecting and sharing accurate ECD and field data for millions of molecules with annotated activities. With enough input data, we envision the appearance of several deep networks trained for various molecular activities. Ultimately, we anticipate that a single DECAR or DFAR network could learn and infer various properties of interest for chemical molecules and become an open, shared learning and inference tool for chemists.
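The abstract does not say how the 6-fold data augmentation is carried out. As a rough illustration only, the sketch below assumes one hypothetical scheme: treating the ECD as a 3D voxel grid and generating one view per ordering of its three axes, which happens to give exactly six views. The function name `augment_six_fold`, the grid shape, and the random stand-in data are all assumptions and are not taken from the paper.

```python
from itertools import permutations

import numpy as np


def augment_six_fold(grid):
    """Return six views of a 3D grid, one per ordering of its axes.

    Hypothetical augmentation scheme; the paper's actual procedure
    is not described in the abstract.
    """
    return [np.transpose(grid, axes) for axes in permutations(range(3))]


# Toy stand-in for an electron cloud density grid (random values, not DFT output).
rng = np.random.default_rng(0)
ecd = rng.random((4, 5, 6))

views = augment_six_fold(ecd)

# Dataset size under this assumption: six samples per molecule.
n_molecules = 977 + 1965  # sweet + non-sweet, as stated in the abstract
n_samples = n_molecules * len(views)
print(len(views), n_molecules, n_samples)
```

Under this assumption the 2942 molecules would yield 17652 training samples. Note that three of the six axis orderings are mirror operations rather than rotations, so a scheme like this would not preserve chirality; restricting to proper rotations would be a stricter alternative.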

Source journal
Journal of Chemometrics (Chemistry, Analytical Chemistry)
CiteScore: 5.20
Self-citation rate: 8.30%
Articles published: 78
Review time: 2 months
Journal description: The Journal of Chemometrics is devoted to the rapid publication of original scientific papers, reviews and short communications on fundamental and applied aspects of chemometrics. It also provides a forum for the exchange of information on meetings and other news relevant to the growing community of scientists who are interested in chemometrics and its applications. Short, critical review papers are a particularly important feature of the journal, in view of the multidisciplinary readership at which it is aimed.