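Generative Adversarial Network-Enhanced Heterogeneous Ensemble Learning for Interpretable Prediction of CO2-to-Methanol Catalyst Performance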

Impact factor 3.9; Zone 3 (Engineering & Technology); JCR Q2 (ENGINEERING, CHEMICAL)

Industrial & Engineering Chemistry Research. Pub Date: 2025-10-06. DOI: 10.1021/acs.iecr.5c02507

Qingchun Yang, Dongwen Rong, Zhao Wang, Qiwen Guo, Jingsong Guan, Yichun Dong, and Huairong Zhou*

Volume 64, Issue 41, Pages 19797–19816. Article page: https://pubs.acs.org/doi/10.1021/acs.iecr.5c02507

Citations: 0

Abstract



Accurate prediction of catalyst performance in CO₂-to-methanol (CTM) conversion remains challenging because experimental data are scarce and feature interactions are complex. To address these issues, this study proposes an interpretable generative adversarial network-enhanced heterogeneous ensemble modeling (GAN-HEM) framework that integrates four synergistic functional modules: data augmentation, feature interaction analysis, ensemble modeling, and interpretability analysis. A comparison with a variational autoencoder-based data augmentation method shows that the GAN has significant advantages in maintaining the global consistency and local geometric features of the data manifold structure. A multivariate evaluation of feature associations within the hybrid data set found that retaining all identified CTM catalyst features benefits the predictive accuracy and generalization ability of the prediction model. This data set is therefore used to develop various homogeneous and heterogeneous ensemble learning models of the CTM process, with hyperparameters optimized automatically using the Bayesian algorithm-based Optuna approach. The optimized heterogeneous ensemble learning architecture achieves the highest prediction accuracy (R² = 0.9314 and RMSE = 0.2636), significantly outperforming homogeneous models through its capacity to capture complex nonlinear feature interactions. Shapley additive explanation-based interpretability analysis identifies reaction temperature as the dominant feature (>39% contribution). Partial dependence plots reveal competitive selectivity: higher temperature favors CO (thermodynamic constraints), while increased pressure and heating rate enhance methanol selectivity (kinetic promotion). The framework accelerates CTM catalyst discovery and optimization, providing high-fidelity prediction and actionable design insights.
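As a rough illustration of the ideas named in the abstract, the sketch below combines two base learners of different types into a heterogeneous ensemble and scores each model with R² and RMSE. It is only a minimal sketch under stated assumptions: the abstract does not specify the base learners, the combination rule, or the data, so the least-squares line, the k-nearest-neighbour regressor, the simple averaging step, and the synthetic one-feature data set are all hypothetical stand-ins, not the authors' GAN-HEM implementation.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between targets and predictions."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def fit_linear(xs, ys):
    """Ordinary least squares on one feature; returns a predict function."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def fit_knn(xs, ys, k=3):
    """k-nearest-neighbour regressor; returns a predict function."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / k

    return predict

def fit_average_ensemble(models):
    """Heterogeneous ensemble (assumed rule): mean of the base predictions."""
    return lambda x: sum(m(x) for m in models) / len(models)

# Hypothetical synthetic data: one normalized feature with a nonlinear response.
xs = [i / 39 for i in range(40)]
ys = [x + 0.5 * math.sin(6 * x) for x in xs]
x_train, y_train = xs[0::2], ys[0::2]   # even indices for training
x_test, y_test = xs[1::2], ys[1::2]     # odd indices held out for testing

linear = fit_linear(x_train, y_train)
knn = fit_knn(x_train, y_train, k=3)
ensemble = fit_average_ensemble([linear, knn])

for name, model in [("linear", linear), ("knn", knn), ("ensemble", ensemble)]:
    preds = [model(x) for x in x_test]
    print(f"{name:8s} R2={r2_score(y_test, preds):.4f} RMSE={rmse(y_test, preds):.4f}")
```

Averaging is the simplest way to combine unlike learners. A stacked architecture would instead train a meta-model on the base predictions, and a study such as this one would tune the hyperparameters (here only `k`) with an optimizer such as Optuna rather than fixing them by hand.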


Source journal: Industrial & Engineering Chemistry Research (Engineering & Technology: Chemical Engineering)

CiteScore: 7.40

Self-citation rate: 7.10%

Articles published: 1467

Review time: 2.8 months

About the journal: Industrial & Engineering Chemistry, with variations in title and format, has been published since 1909 by the American Chemical Society. Industrial & Engineering Chemistry Research is a weekly publication that reports industrial and academic research in the broad fields of applied chemistry and chemical engineering, with special focus on fundamentals, processes, and products.