Self-Supervised Molecular Representation Learning With Topology and Geometry
Authors: Xuan Zang, Junjie Zhang, Buzhou Tang
Journal: IEEE Journal of Biomedical and Health Informatics (Journal Article, published 2024-10-14)
DOI: https://doi.org/10.1109/JBHI.2024.3479194
Molecular representation learning is central to the analysis of drug molecules. Self-supervised pre-training has shown great promise in this area because it helps overcome the scarcity of labeled molecular property data. Recent studies concentrate on pre-training molecular representation encoders by integrating both 2D topological and 3D geometric structures. However, existing methods rely on molecule-level or atom-level alignment between views and overlook hierarchical self-supervised learning, which can capture both inter-molecule and intra-molecule correlations. Additionally, most methods use a 2D or a 3D encoder on its own to extract local or global molecular characteristics for property prediction, so the potential of effectively fusing the two representations remains to be explored. In this work, we propose a Multi-View Molecular Representation Learning method (MVMRL) for molecular property prediction. First, we design hierarchical pre-training pretext tasks: fine-grained atom-level tasks for 2D molecular graphs and coarse-grained molecule-level tasks for 3D molecular graphs, so that the two views provide complementary information to each other. Second, we present a motif-level fusion pattern for the multi-view molecular representations during fine-tuning to improve molecular property prediction. We evaluate MVMRL against state-of-the-art baselines on molecular property prediction tasks, and the experimental results demonstrate its superiority.
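The abstract does not specify how the motif-level fusion is computed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each atom already has one embedding from a 2D encoder and one from a 3D encoder, that motifs are given as groups of atom indices, and that fusion means mean-pooling each view per motif, concatenating the two pooled vectors, and averaging over motifs. The names `mean_pool` and `motif_level_fusion`, the pooling choices, and the toy data are all hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def mean_pool(vectors: Sequence[Vector]) -> Vector:
    """Average a non-empty list of equal-length vectors element-wise."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def motif_level_fusion(
    atom_emb_2d: Sequence[Vector],
    atom_emb_3d: Sequence[Vector],
    motifs: Sequence[Sequence[int]],
) -> Vector:
    """Fuse two per-atom views motif by motif, then pool to one molecule vector.

    Assumption (not from the paper): fusion is plain concatenation of the
    mean-pooled 2D and 3D motif embeddings.
    """
    fused_motifs = []
    for atom_ids in motifs:
        pooled_2d = mean_pool([atom_emb_2d[i] for i in atom_ids])
        pooled_3d = mean_pool([atom_emb_3d[i] for i in atom_ids])
        fused_motifs.append(pooled_2d + pooled_3d)  # list concatenation
    return mean_pool(fused_motifs)


# Toy molecule: 4 atoms, 2-dimensional embeddings per view, 2 motifs.
emb_2d = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]]
emb_3d = [[0.5, 0.5], [0.5, 0.5], [1.0, 1.0], [3.0, 3.0]]
motifs = [[0, 1], [2, 3]]

molecule_vector = motif_level_fusion(emb_2d, emb_3d, motifs)
print(molecule_vector)  # [1.0, 1.5, 1.25, 1.25]
```

In a real pipeline the embeddings would come from trained graph encoders and the fused vector would feed a property-prediction head; concatenation is used here only because it is the simplest fusion that keeps both views visible in the output.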
About the journal:
IEEE Journal of Biomedical and Health Informatics publishes original papers presenting recent advances where information and communication technologies intersect with health, healthcare, life sciences, and biomedicine. Topics include acquisition, transmission, storage, retrieval, management, and analysis of biomedical and health information. The journal covers applications of information technologies in healthcare, patient monitoring, preventive care, early disease diagnosis, therapy discovery, and personalized treatment protocols. It explores electronic medical and health records, clinical information systems, decision support systems, medical and biological imaging informatics, wearable systems, body area/sensor networks, and more. Integration-related topics like interoperability, evidence-based medicine, and secure patient data are also addressed.