A KAN-based interpretable framework for prediction of global warming potential across chemical space

Jaewook Lee, Xinyang Sun, Ethan Errington, Calum Drysdale, Miao Guo
Carbon Capture Science & Technology, Volume 16, Article 100478. Published 2025-08-05. DOI: 10.1016/j.ccst.2025.100478. https://www.sciencedirect.com/science/article/pii/S2772656825001174

Abstract

Accurate yet interpretable prediction of Global Warming Potential (GWP) is essential for the sustainable design of novel molecules, chemical processes and materials. This capability is valuable in the early-stage screening of compounds with potential relevance to carbon management and emerging CCUS applications. However, conventional models often face a trade-off between predictive accuracy and interpretability. In this study, we propose an AI-based GWP prediction framework that integrates both molecular and process-level features to improve accuracy while employing white-box modelling techniques to enhance interpretability. First, by incorporating molecular descriptors (MACCS keys, Mordred descriptors) and process-level information (process title, description, location), the Deep Neural Network (DNN) model achieved an R² of 86 % on the test data, representing a 25 % improvement over the most comparable benchmark reported in prior studies. XAI analysis further highlights the crucial role of process-related features, particularly process title embeddings, in enhancing model predictions. Second, to address the need for model transparency, we employed a Kolmogorov–Arnold Network (KAN) model to develop a symbolic, white-box GWP prediction model. While achieving a lower R² of 64 %, this model provides explicit mathematical representations of GWP relationships, enabling interpretable decision-making in sustainable chemical and process design. Our findings demonstrate that integrating molecular and process-level features improves both predictive accuracy and interpretability in GWP modelling. The resulting framework can support early-stage environmental assessment of novel compounds, offering a useful tool to inform the sustainable design of chemicals, including those with potential applications in CCUS.
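The two quantities the abstract relies on, the R² score and a "symbolic, white-box" predictor, can be illustrated with a minimal sketch. It is not the authors' implementation. The univariate terms, their coefficients, and the function names `r_squared` and `symbolic_predict` are hypothetical, and the model is a single additive layer, a deliberate simplification of a full KAN, which composes several such layers of learned univariate functions.

```python
import math
from typing import Callable, List, Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical univariate functions, one per input feature.
# A trained KAN would learn such functions; these are made up
# purely to show what an explicit, readable formula looks like.
UNIVARIATE_TERMS: List[Callable[[float], float]] = [
    lambda x: 2.0 * x,       # linear term for feature 0
    lambda x: math.sin(x),   # periodic term for feature 1
    lambda x: x ** 2,        # quadratic term for feature 2
]


def symbolic_predict(features: Sequence[float]) -> float:
    """Additive white-box prediction: sum of phi_j(x_j) over all features."""
    if len(features) != len(UNIVARIATE_TERMS):
        raise ValueError("expected one value per univariate term")
    return sum(phi(x) for phi, x in zip(UNIVARIATE_TERMS, features))
```

Because every term is an explicit function of a single input, the contribution of each feature to a prediction can be read directly from the formula, which is the sense in which such a model is interpretable. An R² reported as 86 % corresponds to `r_squared` returning 0.86.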