Graph Residual based Method for Molecular Property Prediction
Kanad Sen, Saksham Gupta, Abhishek Raj, Alankar Alankar
arXiv - QuanBio - Quantitative Methods, 2024-07-27
https://doi.org/arxiv-2408.03342
Abstract
Property prediction of materials has attracted considerable interest in
materials science in recent years. Various physics-based and machine learning
models have already been developed that give good results. However, they are
not accurate enough for critical applications. Traditional machine learning
models try to predict properties
based on the features extracted from the molecules, which are not easily
available most of the time. In this paper, a recently developed deep learning
method, the Graph Neural Network (GNN), has been applied, allowing us to
predict properties directly from only the graph-based structures of the
molecules. The SMILES (Simplified Molecular Input Line Entry System)
representation of the molecules is used as the input data format in the present
study and is further converted into a graph database, which constitutes the
training data. This article gives a detailed description of the novel
GRU-based methodology used to map the inputs. Emphasis is placed on both the
regression and the classification capabilities of the GNN backbone. A detailed
description of the Variational Autoencoder (VAE) and the end-to-end learning
method is given to highlight the multi-class multi-label property prediction of
the backbone. The results have been compared on standard benchmark datasets as
well as some newly developed datasets. All performance metrics used have been
clearly defined, along with the reasons for choosing them.

Keywords: GNN, VAE, SMILES, multi-label multi-class classification, GRU
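
The conversion of a SMILES string into a graph, as mentioned in the abstract, can be illustrated with a minimal sketch. The abstract does not say which toolkit or featurization the authors use, so everything below is an assumption made for illustration: the names `smiles_to_graph`, `adjacency_matrix`, and `BOND_ORDER` are hypothetical, and the parser covers only a small SMILES subset (organic-subset atoms, the bonds `-`, `=`, `#`, `:`, branches, and single-digit ring closures). A real pipeline would more likely rely on a cheminformatics library.

```python
from typing import Dict, List, Optional, Tuple

# Bond symbols mapped to a numeric bond order (1.5 marks an aromatic bond).
BOND_ORDER = {"-": 1.0, "=": 2.0, "#": 3.0, ":": 1.5}
TWO_LETTER_ATOMS = ("Cl", "Br")
ONE_LETTER_ATOMS = set("BCNOPSFI") | set("bcnops")

Edge = Tuple[int, int, float]  # (atom index, atom index, bond order)


def _order(atoms: List[str], a: int, b: int, explicit: Optional[float]) -> float:
    """Use the explicit bond order if one was written, else infer a default."""
    if explicit is not None:
        return explicit
    # Simplification (assumption): an unmarked bond between two aromatic
    # (lowercase) atoms is treated as aromatic; all others are single.
    return 1.5 if atoms[a].islower() and atoms[b].islower() else 1.0


def smiles_to_graph(smiles: str) -> Tuple[List[str], List[Edge]]:
    """Parse a restricted SMILES string into (node labels, edge list).

    Bracket atoms, charges, stereochemistry, and two-digit ring closures
    are not supported and raise ValueError.
    """
    atoms: List[str] = []
    edges: List[Edge] = []
    branch_stack: List[Optional[int]] = []  # atoms to return to after ")"
    ring_open: Dict[str, Tuple[int, Optional[float]]] = {}
    prev: Optional[int] = None       # atom the next atom will bond to
    pending: Optional[float] = None  # explicit bond awaiting its second atom
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        step = 2 if smiles[i:i + 2] in TWO_LETTER_ATOMS else 1
        if step == 2 or ch in ONE_LETTER_ATOMS:
            atoms.append(smiles[i:i + step])
            idx = len(atoms) - 1
            if prev is not None:
                edges.append((prev, idx, _order(atoms, prev, idx, pending)))
            prev, pending = idx, None
        elif ch in BOND_ORDER:
            pending = BOND_ORDER[ch]
        elif ch == "(":
            branch_stack.append(prev)
        elif ch == ")":
            prev = branch_stack.pop()
        elif ch.isdigit() and prev is not None:
            if ch in ring_open:
                # Second occurrence of the digit closes the ring.
                start, opened = ring_open.pop(ch)
                explicit = pending if pending is not None else opened
                edges.append((start, prev, _order(atoms, start, prev, explicit)))
            else:
                ring_open[ch] = (prev, pending)
            pending = None
        else:
            raise ValueError(f"unsupported SMILES token {ch!r} at position {i}")
        i += step
    return atoms, edges


def adjacency_matrix(n_atoms: int, edges: List[Edge]) -> List[List[float]]:
    """Build a symmetric adjacency matrix weighted by bond order."""
    adj = [[0.0] * n_atoms for _ in range(n_atoms)]
    for a, b, order in edges:
        adj[a][b] = adj[b][a] = order
    return adj


if __name__ == "__main__":
    atoms, edges = smiles_to_graph("CC(=O)O")  # acetic acid
    print(atoms)
    print(edges)
    print(adjacency_matrix(len(atoms), edges))
```

The node labels and the bond-order-weighted edge list (or adjacency matrix) are the kind of structure a GNN consumes. In practice each node would also carry a feature vector, which this sketch leaves out.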