A KAN-based interpretable framework for prediction of global warming potential across chemical space
Authors: Jaewook Lee, Xinyang Sun, Ethan Errington, Calum Drysdale, Miao Guo
Journal: Carbon Capture Science & Technology, volume 16, Article 100478
Publication date: 2025-08-05
Publication type: Journal Article
DOI: 10.1016/j.ccst.2025.100478
URL: https://www.sciencedirect.com/science/article/pii/S2772656825001174
Citations: 0
Abstract
Accurate yet interpretable prediction of Global Warming Potential (GWP) is essential for the sustainable design of novel molecules, chemical processes and materials. This capability is particularly valuable for early-stage screening of compounds with potential relevance to carbon management and emerging carbon capture, utilisation and storage (CCUS) applications. However, conventional models often face a trade-off between predictive accuracy and interpretability. In this study, we propose an AI-based GWP prediction framework that integrates molecular and process-level features to improve accuracy and employs white-box modelling techniques to enhance interpretability. First, by incorporating molecular descriptors (MACCS keys, Mordred descriptors) and process-level information (process title, description, location), a deep neural network (DNN) model achieved an R² of 86 % on the test data, representing a 25 % improvement over the most comparable benchmark reported in prior studies. Explainable AI (XAI) analysis further highlights the crucial role of process-related features, particularly process title embeddings, in enhancing model predictions. Second, to address the need for model transparency, we employed a Kolmogorov–Arnold Network (KAN) to develop a symbolic, white-box GWP prediction model. Although it achieves a lower R² of 64 %, this model provides explicit mathematical representations of GWP relationships, enabling interpretable decision-making in sustainable chemical and process design. Our findings demonstrate that integrating molecular and process-level features improves both predictive accuracy and interpretability in GWP modelling. The resulting framework can support early-stage environmental assessment of novel compounds, offering a useful tool to inform the sustainable design of chemicals, including those with potential applications in CCUS.
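
The R² values quoted in the abstract (86 % for the DNN, 64 % for the KAN) are coefficients of determination. As a minimal sketch of how such a score is typically computed, the snippet below evaluates R² = 1 - SS_res / SS_tot on a small set of made-up numbers. The function name `r_squared` and the sample values are assumptions for illustration only; they are not the authors' code or data.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of a baseline that always predicts the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical GWP targets and predictions (illustrative, not from the study).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
score = r_squared(y_true, y_pred)
print(f"R^2 = {score:.2f}")
```

A score of 1 means the predictions match the targets exactly, while 0 means the model does no better than always predicting the mean of the observed values. Read this way, an R² of 86 % corresponds to 0.86 on this scale.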