3D-Mol: A Novel Contrastive Learning Framework for Molecular Property Prediction with 3D Information

IF 3.7 | Zone 4 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Pattern Analysis and Applications Pub Date : 2024-06-21 DOI:10.1007/s10044-024-01287-8

Taojie Kuang, Yiming Ren, Zhixiang Ren

Citations: 0

Abstract

Molecular property prediction, crucial for early drug candidate screening and optimization, has advanced considerably with deep learning-based methods. However, these methods often fall short of fully leveraging 3D spatial information. Specifically, current molecular encoding techniques tend to extract spatial information inadequately, leading to ambiguous representations in which a single representation might correspond to multiple distinct molecules. Moreover, existing molecular modeling methods focus predominantly on the most stable 3D conformation, neglecting other viable conformations that occur in reality. To address these issues, we propose 3D-Mol, a novel approach designed for more accurate spatial structure representation. It decomposes molecules into three hierarchical graphs to better extract geometric information. Additionally, 3D-Mol leverages contrastive learning for pretraining on 20 million unlabeled samples, treating conformations with identical topological structures as weighted positive pairs and contrasting ones as negatives, based on the similarity of their 3D conformation descriptors and fingerprints. We compare 3D-Mol with various state-of-the-art baselines on 7 benchmarks and demonstrate its outstanding performance.
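The weighted contrastive objective described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the loss formula, so the InfoNCE-style form, the use of Tanimoto similarity as the pair weight, the temperature value, and all function names below are assumptions chosen for illustration. Fingerprints are represented as hypothetical sets of on-bit indices, and embeddings as plain lists of floats.

```python
import math

def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices.
    Assumption: used here as the weight of a positive pair."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 1.0

def cosine(u, v):
    """Cosine similarity between two embedding vectors (lists of floats)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def weighted_contrastive_loss(anchor, positives, negatives, weights, temperature=0.1):
    """InfoNCE-style loss for one anchor conformation (illustrative form only).

    positives: embeddings of other conformations with the same topology
    negatives: embeddings of conformations treated as negatives
    weights:   one non-negative weight per positive (e.g. a similarity score)
    """
    neg_logits = [cosine(anchor, n) / temperature for n in negatives]
    weighted_sum = 0.0
    for w, p in zip(weights, positives):
        pos_logit = cosine(anchor, p) / temperature
        logits = [pos_logit] + neg_logits
        # Numerically stable log-sum-exp over this positive and all negatives.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        # Each positive contributes its cross-entropy term scaled by its weight.
        weighted_sum += w * (log_denom - pos_logit)
    total_weight = sum(weights)
    return weighted_sum / total_weight if total_weight else 0.0

# Toy usage with made-up values: one anchor, one positive, one negative.
anchor_emb = [1.0, 0.0]
positive_emb = [0.9, 0.1]
negative_emb = [-1.0, 0.2]
pair_weight = tanimoto({1, 2, 3}, {2, 3, 4})
loss = weighted_contrastive_loss(anchor_emb, [positive_emb], [negative_emb], [pair_weight])
print(loss)
```

In this sketch a positive pair with a higher similarity weight contributes more to the loss, which is one plausible reading of "weighted positive pairs"; the paper itself should be consulted for the actual weighting scheme and descriptors.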

Source journal

Pattern Analysis and Applications (Engineering and Technology, Computer Science: Artificial Intelligence)

CiteScore

7.40

Self-citation rate

2.60%

Articles published

Review time

13.5 months

About the journal: The journal publishes high quality articles in areas of fundamental research in intelligent pattern analysis and applications in computer science and engineering. It aims to provide a forum for original research which describes novel pattern analysis techniques and industrial applications of the current technology. In addition, the journal will also publish articles on pattern analysis applications in medical imaging. The journal solicits articles that detail new technology and methods for pattern recognition and analysis in applied domains including, but not limited to, computer vision and image processing, speech analysis, robotics, multimedia, document analysis, character recognition, knowledge engineering for pattern recognition, fractal analysis, and intelligent control. The journal publishes articles on the use of advanced pattern recognition and analysis methods including statistical techniques, neural networks, genetic algorithms, fuzzy pattern recognition, machine learning, and hardware implementations which are either relevant to the development of pattern analysis as a research area or detail novel pattern analysis applications. Papers proposing new classifier systems or their development, pattern analysis systems for real-time applications, fuzzy and temporal pattern recognition and uncertainty management in applied pattern recognition are particularly solicited.