Graph Residual based Method for Molecular Property Prediction

Kanad Sen, Saksham Gupta, Abhishek Raj, Alankar Alankar
Journal: arXiv - QuanBio - Quantitative Methods
Publication date: 2024-07-27
DOI: arxiv-2408.03342 (https://doi.org/arxiv-2408.03342)
Citations: 0

Abstract

Graph Residual based Method for Molecular Property Prediction
Property prediction of materials has attracted considerable interest in materials science in recent years. Various physics-based and machine learning models have already been developed that give good results. However, they are not accurate enough for critical applications. Traditional machine learning models try to predict properties from features extracted from the molecules, and these features are often not readily available. In this paper, a recently developed deep learning method, the Graph Neural Network (GNN), is applied, allowing properties to be predicted directly from only the graph-based structures of the molecules. The SMILES (Simplified Molecular Input Line Entry System) representation of the molecules is used as the input data format and is then converted into a graph database, which constitutes the training data. This article describes in detail the novel GRU-based methodology used to map the inputs. Emphasis is placed on both the regression and the classification capabilities of the GNN backbone. A detailed description of the Variational Autoencoder (VAE) and the end-to-end learning method is given to highlight the multi-class multi-label property prediction of the backbone. The results have been compared against standard benchmark datasets as well as some newly developed datasets. All performance metrics used are clearly defined, along with the reasons for choosing them.
Keywords: GNN, VAE, SMILES, multi-label multi-class classification, GRU