Graph Residual based Method for Molecular Property Prediction

arXiv - QuanBio - Quantitative Methods | Pub Date: 2024-07-27 | DOI: arxiv-2408.03342

Kanad Sen, Saksham Gupta, Abhishek Raj, Alankar Alankar



Abstract

Property prediction of materials has attracted considerable interest in materials science in recent years. Various physics-based and machine learning models have already been developed that can give good results. However, they are not accurate enough and remain inadequate for critical applications. Traditional machine learning models try to predict properties from features extracted from the molecules, and these features are often not readily available. In this paper, a recently developed deep learning method, the Graph Neural Network (GNN), is applied, allowing properties to be predicted directly from the graph-based structures of the molecules alone. The SMILES (Simplified Molecular Input Line Entry System) representation of the molecules is used as the input data format and is converted into a graph database, which constitutes the training data. The article gives a detailed description of the novel GRU-based methodology used to map the inputs. Emphasis is placed on both the regression and the classification capabilities of the GNN backbone. The Variational Autoencoder (VAE) and the end-to-end learning method are described in detail to highlight the backbone's multi-class, multi-label property prediction. The results have been compared against standard benchmark datasets as well as some newly developed datasets. All performance metrics used are clearly defined, along with the reasons for choosing them.

Keywords: GNN, VAE, SMILES, multi-label multi-class classification, GRU
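The abstract states that SMILES strings are converted into graphs before training but does not give the conversion details. The following is a minimal sketch of that idea, not the authors' pipeline. It assumes a small SMILES subset (unbracketed organic atoms, `-`, `=`, `#` bonds, branches, single-digit ring closures), and the names `smiles_to_graph` and `adjacency` are hypothetical. A practical workflow would more likely rely on a cheminformatics toolkit such as RDKit.

```python
from typing import Dict, List, Optional, Tuple

# Assumption: only this subset of SMILES is supported (no bracket atoms,
# charges, stereochemistry, or two-digit ring closures).
BOND_ORDER = {"-": 1, "=": 2, "#": 3}
TWO_LETTER = ("Cl", "Br")
ONE_LETTER = set("BCNOPSFI") | set("cnos")  # lowercase = aromatic atoms

Bond = Tuple[int, int, int]  # (atom index, atom index, bond order)


def smiles_to_graph(smiles: str) -> Tuple[List[str], List[Bond]]:
    """Parse a small SMILES subset into node labels and an edge list.

    Simplification: implicit bonds (including aromatic ones) get order 1.
    """
    atoms: List[str] = []
    bonds: List[Bond] = []
    prev: Optional[int] = None            # atom the next atom attaches to
    pending = 1                           # order of the next bond
    branch_stack: List[Optional[int]] = []
    open_rings: Dict[str, Tuple[Optional[int], int]] = {}

    i = 0
    while i < len(smiles):
        ch = smiles[i]
        two = smiles[i:i + 2]
        if two in TWO_LETTER:
            symbol, step = two, 2
        elif ch in ONE_LETTER:
            symbol, step = ch, 1
        else:
            symbol, step = None, 1

        if symbol is not None:
            atoms.append(symbol)
            idx = len(atoms) - 1
            if prev is not None:
                bonds.append((prev, idx, pending))
            prev, pending = idx, 1
        elif ch in BOND_ORDER:
            pending = BOND_ORDER[ch]
        elif ch == "(":
            branch_stack.append(prev)     # remember the branch point
        elif ch == ")":
            prev = branch_stack.pop()     # return to the branch point
        elif ch.isdigit():
            if ch in open_rings:          # second occurrence closes the ring
                start, order = open_rings.pop(ch)
                bonds.append((start, prev, max(order, pending)))
            else:                         # first occurrence opens the ring
                open_rings[ch] = (prev, pending)
            pending = 1
        else:
            raise ValueError(f"unsupported SMILES token {ch!r}")
        i += step
    return atoms, bonds


def adjacency(n_atoms: int, bonds: List[Bond]) -> Dict[int, List[int]]:
    """Build an undirected adjacency list, the usual input to message passing."""
    adj: Dict[int, List[int]] = {k: [] for k in range(n_atoms)}
    for a, b, _ in bonds:
        adj[a].append(b)
        adj[b].append(a)
    return adj


if __name__ == "__main__":
    atoms, bonds = smiles_to_graph("CC(=O)O")  # acetic acid
    print(atoms, bonds, adjacency(len(atoms), bonds))
```

For acetic acid (`CC(=O)O`), this sketch yields the atoms `['C', 'C', 'O', 'O']` and the bonds `[(0, 1, 1), (1, 2, 2), (1, 3, 1)]`. Node and edge features for a GNN (for example one-hot element types and bond orders) could then be derived from these two lists.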
