Deep electron cloud-activity and field-activity relationships
Lu Xu, Qin Yang
Journal of Chemometrics. Published 2023-06-22. DOI: 10.1002/cem.3503
https://onlinelibrary.wiley.com/doi/10.1002/cem.3503
Chemists have long sought general mathematical laws to explain and predict molecular properties. However, most traditional quantitative structure-activity relationship (QSAR) models have limited application domains; for example, they tend to generalize poorly to molecules whose parent structures differ from those of the training molecules. This paper attempts to develop a new QSAR method that can, in principle, predict various properties of molecules with diverse structures. The proposed deep electron cloud-activity relationships (DECAR) and deep field-activity relationships (DFAR) methods rest on three essentials: (1) a large number of molecular entities with activity data as training objects and responses; (2) three-dimensional electron cloud density (ECD) or related field data, computed by accurate density functional theory methods, as input descriptors; and (3) a deep learning model that is sufficiently flexible and powerful to learn from the large data set described above. DECAR and DFAR are used to distinguish 977 sweet and 1965 non-sweet molecules (with 6-fold data augmentation), and their classification performance is shown to be significantly better than that of traditional least squares support vector machine (LS-SVM) models using traditional descriptors. DECAR and DFAR could provide a way to establish a widely applicable, cumulative, and shareable artificial intelligence-driven QSAR system. They are likely to promote the development of an interactive platform to collect and share accurate ECD and field data of millions of molecules with annotated activities. With enough input data, we envision the appearance of several deep networks trained for various molecular activities. Ultimately, we anticipate that a single DECAR or DFAR network could learn and infer various properties of interest for chemical molecules and become an open, shared learning and inference tool for chemists.
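The abstract does not say how the 6-fold data augmentation is carried out. As a purely illustrative sketch, one way to obtain six samples from a single three-dimensional grid is to take the six permutations of its axes. The function name `six_axis_views`, the toy random grid standing in for an ECD volume, and the reading of "6-fold" as six samples per molecule are all assumptions made for this example, not details taken from the paper.

```python
from itertools import permutations

import numpy as np


def six_axis_views(grid: np.ndarray) -> list[np.ndarray]:
    """Return the six axis-permuted views of a 3D voxel grid.

    Hypothetical augmentation scheme; the paper's actual procedure is not
    described in the abstract.
    """
    if grid.ndim != 3:
        raise ValueError("expected a 3D grid")
    return [np.transpose(grid, axes) for axes in permutations(range(3))]


# Toy stand-in for an electron cloud density grid. A real ECD volume would
# come from a density functional theory calculation.
rng = np.random.default_rng(0)
toy_ecd = rng.random((4, 5, 6))

views = six_axis_views(toy_ecd)

# Dataset sizes quoted in the abstract.
n_sweet, n_non_sweet = 977, 1965
n_molecules = n_sweet + n_non_sweet
# Assumes "6-fold" means six samples per molecule.
n_augmented = 6 * n_molecules
```

Three of the six axis permutations are mirror operations rather than proper rotations, so this particular scheme would map a chiral molecule onto its enantiomer. Whether that is acceptable depends on the activity being modelled.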
About the journal:
The Journal of Chemometrics is devoted to the rapid publication of original scientific papers, reviews and short communications on fundamental and applied aspects of chemometrics. It also provides a forum for the exchange of information on meetings and other news relevant to the growing community of scientists who are interested in chemometrics and its applications. Short, critical review papers are a particularly important feature of the journal, in view of the multidisciplinary readership at which it is aimed.