Ilhem Bouaziz, Mohamed Hentabli, Mohamed Kouider Amar, Maamar Laidi, Amel Bouzidi, Hakim Bouzemlal, Ahmed Chabane, Abdeltif Amrane, Salah Hanini
Journal of Supercritical Fluids, volume 227, Article 106755. Journal Article, published 2025-08-18. Impact factor: 4.4; JCR: Q2 (Chemistry, Physical).
DOI: 10.1016/j.supflu.2025.106755
https://www.sciencedirect.com/science/article/pii/S0896844625002426
Application of deep learning and machine learning models with enhanced feature extraction for the prediction of plant extraction yields using supercritical CO2: An optimization and comparative analysis
The efficient extraction of essential oils (EOs), particularly volatile compounds, from medicinal, aromatic, or oil-rich crop plants using supercritical carbon dioxide extraction (scCO₂) is crucial for industries such as pharmaceuticals, cosmetics, and food. However, optimizing this process presents challenges due to the intricate molecular diversity of the compounds and the complex interplay of scCO₂ parameters. To address these limitations, this study introduces a hybrid predictive framework that combines deep learning and machine learning, utilizing 694 scCO₂ experimental data points sourced from the literature across 21 plant species. Four major molecular compounds per plant were selected as input features, alongside key process parameters, including temperature, pressure, extraction time, co-solvent ratio, and CO₂ flow rate. Morgan fingerprints were computed for these compounds, and a convolutional neural network (CNN) was utilized to extract their high-level representations into compact vectors. These vectors were integrated with normalized process parameters and fed into a CNN-Multilayer Perceptron (CNN-MLP) hybrid architecture. Performance was compared with Support Vector Regression (SVR), Random Forest (RF), Gaussian Process Regression (GPR), and XGBoost, all optimized using OPTUNA. The CNN-MLP achieved the best performance, with an R² of 0.974 and a Root Mean Squared Error (RMSE) of 1.431 on the test set. A paired t-test (p = 0.810) and Bland–Altman analysis (mean difference: 9.35 %) confirmed the model's robustness. To further assess generalizability, external validations were conducted using unseen experimental conditions. The CNN-MLP was tested on three extraction profiles and demonstrated strong predictive performance, with Pearson correlations ranging from 0.95 to 0.98.
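The abstract reports R², RMSE, a paired t-test, a Bland–Altman analysis, and Pearson correlations, but does not state how each was computed. The following is a minimal sketch using the standard textbook definitions of these statistics, written with the Python standard library only. All function names are hypothetical and chosen for illustration; nothing here is taken from the authors' code, and the paired t-test is shown only up to the t statistic (turning it into a p-value needs a t-distribution, which the standard library does not provide).

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error between measured and predicted yields."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def bland_altman(y_true, y_pred):
    """Return (bias, lower, upper): the mean of (predicted - measured)
    and the conventional 95 % limits of agreement, bias +/- 1.96 * SD."""
    diffs = [p - t for t, p in zip(y_true, y_pred)]
    n = len(diffs)
    bias = sum(diffs) / n
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def paired_t_statistic(y_true, y_pred):
    """t statistic of a paired t-test on (predicted - measured);
    it has n - 1 degrees of freedom."""
    diffs = [p - t for t, p in zip(y_true, y_pred)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    sd_d = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
    return mean_d / (sd_d / math.sqrt(n))


if __name__ == "__main__":
    # Made-up yields for illustration only; these are not data from the paper.
    measured = [2.0, 4.0, 6.0, 8.0]
    predicted = [2.5, 3.5, 6.5, 7.5]
    print("R2:", r2_score(measured, predicted))
    print("RMSE:", rmse(measured, predicted))
    print("Pearson r:", pearson_r(measured, predicted))
    print("Bland-Altman (bias, lower, upper):", bland_altman(measured, predicted))
    print("Paired t statistic:", paired_t_statistic(measured, predicted))
```

Under these definitions, a large paired t-test p-value means no systematic difference between predicted and measured yields was detected, and the Bland–Altman bias summarizes the average signed disagreement. Whether the paper's reported "mean difference" is expressed on the same scale as this sketch is not stated in the abstract.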
About the journal:
The Journal of Supercritical Fluids is an international journal devoted to the fundamental and applied aspects of supercritical fluids and processes. Its aim is to provide a focused platform for academic and industrial researchers to report their findings and to have ready access to the advances in this rapidly growing field. Its coverage is multidisciplinary and includes both basic and applied topics.
Thermodynamics and phase equilibria, reaction kinetics and rate processes, thermal and transport properties, and all topics related to processing, such as separations (extraction, fractionation, purification, chromatography), nucleation, and impregnation, are within the scope. Accounts of specific engineering applications, such as those encountered in the food, fuel, natural products, minerals, pharmaceuticals, and polymer industries, are included. Topics related to high-pressure equipment design, analytical techniques, sensors, and process control methodologies are also within the scope of the journal.