Multi-modal Homogeneous Chemical Reaction Performance Prediction with Graph and Chemical Language Information

Impact factor 5.5 | Zone 1, Chemistry | JCR Q2, CHEMISTRY, MULTIDISCIPLINARY
Shen Wang, Weiren Zhao, Yining Liu, Yang Li
Chinese Journal of Chemistry, 43 (11), pp. 1230-1238. Published 2025-02-20. Journal Article.
DOI: 10.1002/cjoc.202401186
https://onlinelibrary.wiley.com/doi/10.1002/cjoc.202401186
Citations: 0

Multi-modal Homogeneous Chemical Reaction Performance Prediction with Graph and Chemical Language Information

Accurate prediction of chemical reaction performance provides optimal direction for synthetic development. To this end, we present a novel multi-modal model, MMHRP-GCL, that predicts the yield, enantioselectivity, and activation energy of homogeneous chemical reactions by fusing information from the text and graph modalities. The model requires only 8 simple descriptors and Reaction SMILES, obtained without high-cost DFT computation, and can handle reactions involving a varying number of molecules. Experimental results on 4 datasets show that MMHRP-GCL outperforms at least 7 generalized SOTA methods. An ablation study confirms the critical role of the complementarity between the graph and text modalities, as well as the significance of modality alignment and atomic features in prediction. Although there is still room for improvement in the interpretation of atomic relationships, the model has a remarkable ability to identify important atoms. A statistically interpretable study of feature importance and a test on a challenging dataset further demonstrate the utility and potential of the model. As a high-accuracy, low-cost, interpretable, and general multi-modal model, MMHRP-GCL provides valuable guidance on the design of forward predictors for homogeneous catalytic reactions.
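
To make the "varying number of molecules" point concrete, here is a minimal sketch of how a Reaction SMILES string breaks down into a variable-length list of molecules per role. It relies only on the standard Reaction SMILES layout (reactants>agents>products, with "." separating molecules). It is not the MMHRP-GCL preprocessing code; the function name `split_reaction_smiles` and the example reaction are assumptions chosen for illustration.

```python
from typing import Dict, List


def split_reaction_smiles(rxn: str) -> Dict[str, List[str]]:
    """Split a Reaction SMILES string into per-role molecule lists.

    Hypothetical helper for illustration only; it does no chemical
    validation and is not taken from the paper.
    """
    parts = rxn.strip().split(">")
    if len(parts) != 3:
        raise ValueError("expected the form 'reactants>agents>products'")
    roles = ("reactants", "agents", "products")
    # "." separates molecules; an empty role (e.g. no agents) yields [].
    return {
        role: [mol for mol in part.split(".") if mol]
        for role, part in zip(roles, parts)
    }


# Acid-catalysed esterification, used purely as an example input.
example = "CC(=O)O.OCC>[H+]>CC(=O)OCC.O"
components = split_reaction_smiles(example)
counts = {role: len(mols) for role, mols in components.items()}
print(counts)  # {'reactants': 2, 'agents': 1, 'products': 2}
```

Because each role maps to a list of arbitrary length, a downstream model can treat the molecules as a set rather than as a fixed number of input slots, which is one plausible way to accommodate reactions with different numbers of participants.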

Source journal
Chinese Journal of Chemistry (category: Chemistry, Multidisciplinary)
CiteScore: 8.80
Self-citation rate: 14.80%
Articles published: 422
Review time: 1.7 months
About the journal: The Chinese Journal of Chemistry is an international forum for peer-reviewed original research results in all fields of chemistry. Founded in 1983 under the name Acta Chimica Sinica English Edition and renamed in 1990 as Chinese Journal of Chemistry, the journal publishes a stimulating mixture of Accounts, Full Papers, Notes and Communications in English.