Enhancing molecular representation via fusion of multimodal transformers with integrated periodic local and global features

IF 3.1 · Zone 3 (Biology) · JCR Q3 · Biochemistry & Molecular Biology
Jia Ao, Xiangsheng Huang, Wei Dai, Cancan Ji
Journal of Computer-Aided Molecular Design, 39(1). Journal Article. Published 2025-09-13.
DOI: 10.1007/s10822-025-00658-5
https://link.springer.com/article/10.1007/s10822-025-00658-5

Abstract

Because molecules are structurally complex, learning molecular representations requires large amounts of data. Labeled data is typically scarce, however, which makes self-supervised pretraining essential. Yet current pretraining methods often fail to attend sufficiently to both local and global molecular information. In this study, we propose a multimodal self-supervised learning framework that captures local and global information simultaneously. Specifically, we encode SMILES sequences and molecular graphs separately and use a unified fusion approach to strengthen the interaction between the two modalities. In the molecular graph encoder, we capture global and local information independently and increase the attention paid to bond features through information fusion. We also introduce the FA-FFN module to aggregate periodic features of the molecule. Experimental results show that the proposed model, MoleTGL, outperforms existing methods on seven classification tasks and six regression tasks in molecular property prediction. Ablation studies confirm the effectiveness of fusing local and global features and the advantage of the methods used to acquire local and global information.
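To make the two input modalities concrete, the following minimal sketch derives a token sequence and a simple atom/bond graph from the same SMILES string. It is an illustration only and does not reproduce the paper's encoders, fusion step, or FA-FFN module, none of which are specified in the abstract. The function names `tokenize` and `to_graph` are hypothetical, the parser covers only a restricted subset of SMILES (atoms, branches, ring closures, explicit bond symbols), and aromatic bonds are recorded with order 1 as a simplification.

```python
import re

# Regex commonly used to split a SMILES string into tokens.
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|[BCNOSPFI]|[bcnops]|[()=#\-+\\/:~@?>*$.]|%\d{2}|\d)"
)

BOND_ORDERS = {"-": 1, "=": 2, "#": 3, ":": 1}


def tokenize(smiles):
    """Sequence view: split a SMILES string into atom, bond, branch and ring tokens."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognised characters in {smiles!r}")
    return tokens


def to_graph(smiles):
    """Graph view: return (atoms, bonds) for a simple SMILES string.

    atoms is a list of atom labels; bonds is a list of (i, j, order) tuples.
    Stereo markers are ignored and bracket atoms are kept as opaque labels
    (assumption: good enough for illustration, not for real chemistry).
    """
    atoms, bonds = [], []
    prev = None    # index of the atom that the next atom attaches to
    stack = []     # atoms at which a branch was opened
    rings = {}     # open ring-closure label -> (atom index, bond order)
    order = 1      # order of the pending bond
    for tok in tokenize(smiles):
        if tok == "(":
            stack.append(prev)
        elif tok == ")":
            prev = stack.pop()
        elif tok == ".":
            prev = None  # start of a disconnected fragment
        elif tok in BOND_ORDERS:
            order = BOND_ORDERS[tok]
        elif tok.isdigit() or tok.startswith("%"):
            if tok in rings:  # second occurrence closes the ring
                start, ring_order = rings.pop(tok)
                bonds.append((start, prev, max(order, ring_order)))
            else:             # first occurrence opens it
                rings[tok] = (prev, order)
            order = 1
        elif tok[0].isalpha() or tok[0] == "[":
            atoms.append(tok)
            idx = len(atoms) - 1
            if prev is not None:
                bonds.append((prev, idx, order))
            prev = idx
            order = 1
        # any other token (stereo markers, wildcards) is skipped in this sketch
    return atoms, bonds
```

For acetic acid, `tokenize("CC(=O)O")` gives the sequence `['C', 'C', '(', '=', 'O', ')', 'O']`, while `to_graph("CC(=O)O")` gives four atoms and the bonds `[(0, 1, 1), (1, 2, 2), (1, 3, 1)]`. A sequence encoder would consume the first form and a graph encoder the second, which is where bond features become explicit.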

Source journal: Journal of Computer-Aided Molecular Design (Biology; Computer Science, Interdisciplinary Applications)
CiteScore: 8.00
Self-citation rate: 8.60%
Articles published: 56
Review time: 3 months
About the journal: The Journal of Computer-Aided Molecular Design provides a forum for disseminating information on both the theory and the application of computer-based methods in the analysis and design of molecules. The scope of the journal encompasses papers which report new and original research and applications in the following areas: theoretical chemistry; computational chemistry; computer and molecular graphics; molecular modeling; protein engineering; drug design; expert systems; general structure-property relationships; molecular dynamics; chemical database development and usage.