When good fits fail: assessing the reliability of machine learning models for PSA CH4/CO2 process optimization

IF 4.3 | Zone 2, Engineering Technology | Q2 ENGINEERING, CHEMICAL

Chemical Engineering Science Pub Date : 2025-10-05 DOI:10.1016/j.ces.2025.122746

Klaus F.S. Richard, Diana C.S. Azevedo, Moises Bastos-Neto

Chemical Engineering Science, volume 321, Article 122746. Full text: https://www.sciencedirect.com/science/article/pii/S0009250925015672

Citations: 0

Abstract

This work investigates the application of machine learning (ML) models for predicting and optimizing the performance parameters purity and recovery of a CH₄/CO₂ separation process via Pressure Swing Adsorption (PSA). Several ML algorithms were trained and tested using datasets generated from a detailed phenomenological PSA model, and their optimization performance was benchmarked against a reference Pareto front obtained from the same detailed model. The study critically examines the reliability of common goodness-of-fit metrics from the training and testing phases as predictors of optimization accuracy. Results reveal that high fitting accuracy in the dataset does not guarantee accurate optimization outcomes, while models with comparatively poorer fitting during training may outperform more complex models in the optimization task. Furthermore, traditional global error metrics are shown to be insufficient predictors of optimization reliability, with segmented, range-based error analysis providing better insights. Despite these challenges, the Gradient Boosted Tree model delivered highly accurate Pareto fronts with a computational cost reduction of over 65% compared to the full phenomenological model. These findings underscore both the potential and the limitations of ML-assisted process optimization and highlight the need for more nuanced error evaluation strategies in surrogate-based optimization frameworks.
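The contrast the abstract draws between global error metrics and segmented, range-based error analysis can be illustrated with a minimal sketch. The abstract does not specify the paper's metrics, bin edges, or data, so everything below is an assumption: the function name `segmented_mae`, the choice of mean absolute error, the bin edges, and the purity values are all synthetic and purely illustrative.

```python
import numpy as np

def segmented_mae(y_true, y_pred, edges):
    """Return (low, high, mae, count) for each bin [edges[i], edges[i+1]) of y_true.

    Hypothetical helper for illustration; not the method used in the paper.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    abs_err = np.abs(y_true - y_pred)
    # np.digitize returns i when edges[i-1] <= x < edges[i]; shift to 0-based bins.
    bin_idx = np.digitize(y_true, edges) - 1
    rows = []
    for i in range(len(edges) - 1):
        mask = bin_idx == i
        mae = float(abs_err[mask].mean()) if mask.any() else float("nan")
        rows.append((edges[i], edges[i + 1], mae, int(mask.sum())))
    return rows

# Synthetic purity values (fractions), NOT data from the paper.
# The surrogate is exact at low purity and biased in the high-purity range,
# which is where an optimizer searching for the Pareto front would operate.
purity_true = np.array([0.60, 0.70, 0.80, 0.85, 0.96, 0.98])
purity_pred = np.array([0.60, 0.70, 0.80, 0.85, 0.90, 0.92])

global_mae = float(np.mean(np.abs(purity_true - purity_pred)))
by_range = segmented_mae(purity_true, purity_pred, edges=[0.5, 0.9, 1.01])

print(f"global MAE: {global_mae:.3f}")
for low, high, mae, n in by_range:
    print(f"purity in [{low}, {high}): MAE = {mae:.3f} (n = {n})")
```

In this toy case the global error looks small because most samples sit in the low-purity range, while the error concentrated in the high-purity bin only becomes visible once the metric is computed per range.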


Source journal

Chemical Engineering Science (Engineering Technology; Engineering: Chemical)

CiteScore: 7.50

Self-citation rate: 8.50%

Articles published: 1025

Review time: 50 days

Journal description: Chemical engineering enables the transformation of natural resources and energy into useful products for society. It draws on and applies natural sciences, mathematics and economics, and has developed fundamental engineering science that underpins the discipline. Chemical Engineering Science (CES) has been publishing papers on the fundamentals of chemical engineering since 1951, and the most significant advances in the discipline have been published in it ever since. Chemical Engineering Science has accompanied and sustained chemical engineering through its development into the vibrant and broad scientific discipline it is today.