Multimodal Representation Learning via Graph Isomorphism Network for Toxicity Multitask Learning

IF 5.6 2区 化学 Q1 CHEMISTRY, MEDICINAL
Guishen Wang, Hui Feng, Mengyan Du, Yuncong Feng, Chen Cao
{"title":"Multimodal Representation Learning via Graph Isomorphism Network for Toxicity Multitask Learning","authors":"Guishen Wang, Hui Feng, Mengyan Du, Yuncong Feng, Chen Cao","doi":"10.1021/acs.jcim.4c01061","DOIUrl":null,"url":null,"abstract":"Toxicity is paramount for comprehending compound properties, particularly in the early stages of drug design. Due to the diversity and complexity of toxic effects, it became a challenge to compute compound toxicity tasks. To address this issue, we propose a multimodal representation learning model, termed multimodal graph isomorphism network (MMGIN), to address this challenge for compound toxicity multitask learning. Based on fingerprints and molecular graphs of compounds, our MMGIN model incorporates a multimodal representation learning model to acquire a comprehensive compound representation. This model adopts a two-channel structure to independently learn fingerprint representation and molecular graph representation. Subsequently, two feedforward neural networks utilize the learned multimodal compound representation to perform multitask learning, encompassing compound toxicity classification and multiple compound category classification simultaneously. To test the effectiveness of our model, we constructed a novel data set, termed the compound toxicity multitask learning (CTMTL) data set, derived from the TOXRIC data set. We compare our MMGIN model with other representative machine learning and deep learning models on the CTMTL and Tox21 data sets. The experimental results demonstrate significant advancements achieved by our MMGIN model. Furthermore, the ablation study underscores the effectiveness of the introduced fingerprints, molecular graphs, the multimodal representation learning model, and the multitask learning model, showcasing the model’s superior predictive capability and robustness.","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":null,"pages":null},"PeriodicalIF":5.6000,"publicationDate":"2024-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Chemical Information and Modeling ","FirstCategoryId":"92","ListUrlMain":"https://doi.org/10.1021/acs.jcim.4c01061","RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"CHEMISTRY, MEDICINAL","Score":null,"Total":0}
引用次数: 0

Abstract

Toxicity is paramount for comprehending compound properties, particularly in the early stages of drug design. Due to the diversity and complexity of toxic effects, it became a challenge to compute compound toxicity tasks. To address this issue, we propose a multimodal representation learning model, termed multimodal graph isomorphism network (MMGIN), to address this challenge for compound toxicity multitask learning. Based on fingerprints and molecular graphs of compounds, our MMGIN model incorporates a multimodal representation learning model to acquire a comprehensive compound representation. This model adopts a two-channel structure to independently learn fingerprint representation and molecular graph representation. Subsequently, two feedforward neural networks utilize the learned multimodal compound representation to perform multitask learning, encompassing compound toxicity classification and multiple compound category classification simultaneously. To test the effectiveness of our model, we constructed a novel data set, termed the compound toxicity multitask learning (CTMTL) data set, derived from the TOXRIC data set. We compare our MMGIN model with other representative machine learning and deep learning models on the CTMTL and Tox21 data sets. The experimental results demonstrate significant advancements achieved by our MMGIN model. Furthermore, the ablation study underscores the effectiveness of the introduced fingerprints, molecular graphs, the multimodal representation learning model, and the multitask learning model, showcasing the model’s superior predictive capability and robustness.

Abstract Image

通过图同构网络进行多模态表征学习,实现毒性多任务学习
毒性对于理解化合物特性至关重要,尤其是在药物设计的早期阶段。由于毒性效应的多样性和复杂性,计算化合物毒性任务成为一项挑战。为了解决这个问题,我们提出了一种多模态表征学习模型,称为多模态图同构网络(MMGIN),以应对化合物毒性多任务学习的挑战。基于化合物的指纹和分子图谱,我们的 MMGIN 模型结合了多模态表征学习模型,以获得全面的化合物表征。该模型采用双通道结构,独立学习指纹表征和分子图表征。随后,两个前馈神经网络利用学习到的多模态化合物表征执行多任务学习,同时进行化合物毒性分类和多种化合物类别分类。为了测试模型的有效性,我们构建了一个新的数据集,称为化合物毒性多任务学习(CTMTL)数据集,该数据集来自 TOXRIC 数据集。在 CTMTL 和 Tox21 数据集上,我们将 MMGIN 模型与其他具有代表性的机器学习和深度学习模型进行了比较。实验结果表明,我们的 MMGIN 模型取得了重大进步。此外,消融研究强调了引入的指纹、分子图谱、多模态表征学习模型和多任务学习模型的有效性,展示了该模型卓越的预测能力和鲁棒性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
CiteScore
9.80
自引率
10.70%
发文量
529
审稿时长
1.4 months
期刊介绍: The Journal of Chemical Information and Modeling publishes papers reporting new methodology and/or important applications in the fields of chemical informatics and molecular modeling. Specific topics include the representation and computer-based searching of chemical databases, molecular modeling, computer-aided molecular design of new materials, catalysts, or ligands, development of new computational methods or efficient algorithms for chemical software, and biopharmaceutical chemistry including analyses of biological activity and other issues related to drug discovery. Astute chemists, computer scientists, and information specialists look to this monthly’s insightful research studies, programming innovations, and software reviews to keep current with advances in this integral, multidisciplinary field. As a subscriber you’ll stay abreast of database search systems, use of graph theory in chemical problems, substructure search systems, pattern recognition and clustering, analysis of chemical and physical data, molecular modeling, graphics and natural language interfaces, bibliometric and citation analysis, and synthesis design and reactions databases.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信