Molecular representation learning via multimodal fusion and decoupling

IF 15.5 | CAS Zone 1, Computer Science | JCR Q1, Computer Science, Artificial Intelligence
Xuan Zang , Junjie Zhang , Buzhou Tang
DOI: 10.1016/j.inffus.2025.103493
Journal: Information Fusion, Volume 125, Article 103493
Publication date: 2025-07-14 (Journal Article)
Citations: 0

Abstract

Recent years have seen growing attention on self-supervised learning in drug molecule research and discovery. Additionally, a series of methods have emerged that leverage both 2D and 3D structures for molecular representation learning. However, these methods focus only on the modal consistency between 2D and 3D molecular structures, relying on molecule-level or atom-level alignment while ignoring modal complementarity. In this paper, we propose a multimodal fusion-then-decoupling self-supervised molecular representation learning method named MolMFD. First, we use a unified encoder to fuse 2D and 3D molecular structural information by incorporating atomic relative distances from both topological and geometric views. Then, we design a learnable noise injection strategy to decouple modality-specific representations, which are subsequently input into separate decoders to predict the structural information of each corresponding modality. Notably, we minimize mutual information to extract the 2D and 3D modality-specific characteristics, considering modality complementarity to enrich the fused molecular representations. We provide a theoretical analysis of the optimization issues and the overlooked complementarity problems in existing 2D and 3D multimodal molecular pre-training methods. Extensive molecular prediction experiments validate the effectiveness and superiority of our proposed MolMFD.
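The fusion-then-decoupling idea in the abstract can be illustrated with a toy sketch: per-atom relative-distance features from the topological (2D) and geometric (3D) views are fused by one shared encoder, then modality-specific noise is injected to split the fused representation into two branches, each of which would feed its own structure decoder. This is a minimal sketch under assumed shapes, not the paper's implementation; `fuse`, `decouple`, and all dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def fuse(topo_dist, geo_dist, W):
    """Toy 'unified encoder': concatenate per-atom relative-distance
    features from the 2D (topological) and 3D (geometric) views and
    project them into one fused per-atom representation."""
    feats = np.concatenate([topo_dist, geo_dist], axis=-1)  # (n_atoms, 2d)
    return np.tanh(feats @ W)                               # (n_atoms, h)

def decouple(fused, scale_2d, scale_3d):
    """Toy 'learnable noise injection': perturb the fused representation
    with modality-specific noise scales (learned in the real method) to
    obtain two modality-specific branches."""
    z2d = fused + scale_2d * rng.standard_normal(fused.shape)
    z3d = fused + scale_3d * rng.standard_normal(fused.shape)
    return z2d, z3d

n_atoms, d, h = 5, 4, 8
topo = rng.random((n_atoms, d))   # e.g. shortest-path distances (2D view)
geo = rng.random((n_atoms, d))    # e.g. Euclidean distances (3D view)
W = rng.standard_normal((2 * d, h)) * 0.1

fused = fuse(topo, geo, W)
z2d, z3d = decouple(fused, 0.1, 0.1)
print(fused.shape, z2d.shape, z3d.shape)
```

In the actual method each branch would then be passed to a separate decoder that reconstructs the structural information of its modality; here the branches are only shown to differ.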

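The mutual-information minimization used to extract modality-specific characteristics can be illustrated with a simple Gaussian MI proxy between the two branch representations; minimizing it as a training penalty decorrelates the branches. This is a simplification for illustration, not MolMFD's actual estimator.

```python
import numpy as np

def gaussian_mi_proxy(z_a, z_b, eps=1e-6):
    """Per-dimension Gaussian mutual-information proxy:
    I ~ -1/2 * sum_i log(1 - rho_i**2), where rho_i is the Pearson
    correlation between dimension i of the two branches. It is zero
    for uncorrelated branches and grows as they become redundant."""
    za = (z_a - z_a.mean(0)) / (z_a.std(0) + eps)
    zb = (z_b - z_b.mean(0)) / (z_b.std(0) + eps)
    rho = np.clip((za * zb).mean(0), -1 + eps, 1 - eps)
    return -0.5 * np.log(1 - rho ** 2).sum()

rng = np.random.default_rng(1)
shared = rng.standard_normal((256, 8))
z2d = shared + 0.1 * rng.standard_normal((256, 8))       # redundant branch
z3d_dep = shared + 0.1 * rng.standard_normal((256, 8))   # also redundant
z3d_ind = rng.standard_normal((256, 8))                  # independent branch

mi_dep = gaussian_mi_proxy(z2d, z3d_dep)
mi_ind = gaussian_mi_proxy(z2d, z3d_ind)
print(mi_dep, mi_ind)
```

A training loop would add this proxy (or a learned MI upper bound) to the loss so that the 2D-specific and 3D-specific branches carry complementary rather than redundant information.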
Source journal
Information Fusion (Engineering Technology - Computer Science: Theory & Methods)
CiteScore: 33.20
Self-citation rate: 4.30%
Articles per year: 161
Review time: 7.9 months
期刊介绍: Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses as well as those demonstrating their application to real-world problems will be welcome.