Thanh-Hoang Nguyen-Vo, Trang T. T. Do, Binh P. Nguyen
{"title":"ResGAT: Residual Graph Attention Networks for molecular property prediction","authors":"Thanh-Hoang Nguyen-Vo, Trang T. T. Do, Binh P. Nguyen","doi":"10.1007/s12293-024-00423-5","DOIUrl":null,"url":null,"abstract":"<p>Molecular property prediction is an important step in the drug discovery pipeline. Numerous computational methods have been developed to predict a wide range of molecular properties. While recent approaches have shown promising results, no single architecture can comprehensively address all tasks, making this area persistently challenging and requiring substantial time and effort. Beyond traditional machine learning and deep learning architectures for regular data, several deep learning architectures have been designed for graph-structured data to overcome the limitations of conventional methods. Utilizing graph-structured data in quantitative structure–activity relationship (QSAR) modeling allows models to effectively extract unique features, especially where connectivity information is crucial. In our study, we developed residual graph attention networks (ResGAT), a deep learning architecture for molecular graph-structured data. This architecture is a combination of graph attention networks and shortcut connections to address both regression and classification problems. It is also customizable to adapt to various dataset sizes, enhancing the learning process based on molecular patterns. When tested multiple times with both random and scaffold sampling strategies on nine benchmark molecular datasets, QSAR models developed using ResGAT demonstrated stability and competitive performance compared to state-of-the-art methods.</p>","PeriodicalId":48780,"journal":{"name":"Memetic Computing","volume":null,"pages":null},"PeriodicalIF":3.3000,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Memetic Computing","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s12293-024-00423-5","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Molecular property prediction is an important step in the drug discovery pipeline. Numerous computational methods have been developed to predict a wide range of molecular properties. While recent approaches have shown promising results, no single architecture can comprehensively address all tasks, making this area persistently challenging and requiring substantial time and effort. Beyond traditional machine learning and deep learning architectures for regular data, several deep learning architectures have been designed for graph-structured data to overcome the limitations of conventional methods. Utilizing graph-structured data in quantitative structure–activity relationship (QSAR) modeling allows models to effectively extract unique features, especially where connectivity information is crucial. In our study, we developed residual graph attention networks (ResGAT), a deep learning architecture for molecular graph-structured data. This architecture is a combination of graph attention networks and shortcut connections to address both regression and classification problems. It is also customizable to adapt to various dataset sizes, enhancing the learning process based on molecular patterns. When tested multiple times with both random and scaffold sampling strategies on nine benchmark molecular datasets, QSAR models developed using ResGAT demonstrated stability and competitive performance compared to state-of-the-art methods.
Memetic ComputingCOMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE-OPERATIONS RESEARCH & MANAGEMENT SCIENCE
CiteScore
6.80
自引率
12.80%
发文量
31
期刊介绍:
Memes have been defined as basic units of transferrable information that reside in the brain and are propagated across populations through the process of imitation. From an algorithmic point of view, memes have come to be regarded as building-blocks of prior knowledge, expressed in arbitrary computational representations (e.g., local search heuristics, fuzzy rules, neural models, etc.), that have been acquired through experience by a human or machine, and can be imitated (i.e., reused) across problems.
The Memetic Computing journal welcomes papers incorporating the aforementioned socio-cultural notion of memes into artificial systems, with particular emphasis on enhancing the efficacy of computational and artificial intelligence techniques for search, optimization, and machine learning through explicit prior knowledge incorporation. The goal of the journal is to thus be an outlet for high quality theoretical and applied research on hybrid, knowledge-driven computational approaches that may be characterized under any of the following categories of memetics:
Type 1: General-purpose algorithms integrated with human-crafted heuristics that capture some form of prior domain knowledge; e.g., traditional memetic algorithms hybridizing evolutionary global search with a problem-specific local search.
Type 2: Algorithms with the ability to automatically select, adapt, and reuse the most appropriate heuristics from a diverse pool of available choices; e.g., learning a mapping between global search operators and multiple local search schemes, given an optimization problem at hand.
Type 3: Algorithms that autonomously learn with experience, adaptively reusing data and/or machine learning models drawn from related problems as prior knowledge in new target tasks of interest; examples include, but are not limited to, transfer learning and optimization, multi-task learning and optimization, or any other multi-X evolutionary learning and optimization methodologies.