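Enhancing the Predictive Performance of Molecularly Imprinted Polymer-Based Electrochemical Sensors Using a Stacking Regressor Ensemble of Machine Learning Models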

IF 8.2 | Zone 1, Chemistry | JCR Q1, CHEMISTRY, ANALYTICAL

ACS Sensors | Pub Date: 2025-04-17 | DOI: 10.1021/acssensors.5c00364

Reza Mohammadi Dashtaki, Saeed Mohammadi Dashtaki, Esmaeil Heydari-Bafrooei, Md Jalil Piran


Citations: 0

Abstract


Enhancing the Predictive Performance of Molecularly Imprinted Polymer-Based Electrochemical Sensors Using a Stacking Regressor Ensemble of Machine Learning Models


The performance of electrochemical sensors is influenced by various factors. To enhance the effectiveness of these sensors, it is crucial to find the right balance among these factors. Researchers and engineers continually explore innovative approaches to enhance sensitivity, selectivity, and reliability. Machine learning (ML) techniques facilitate the analysis and predictive modeling of sensor performance by establishing quantitative relationships between parameters and their effects. This work presents a case study on developing a molecularly imprinted polymer (MIP)-based sensor for detecting doxorubicin (Dox), emphasizing the use of ML-based ensemble models to improve performance and reliability. Four ML models, including Decision Tree (DT), eXtreme Gradient Boosting (XGBoost), Random Forest (RF), and K-Nearest Neighbors (KNN), are used to evaluate the effect of each parameter on prediction performance, using the SHapley Additive exPlanations (SHAP) method to determine feature importance. Based on the analysis, removing a less influential feature and introducing a new feature significantly improved the model’s predictive capabilities. By applying the min–max scaling technique, it is ensured that all features contribute proportionally to the model learning process. Additionally, multiple ML models─Linear Regression (LR), KNN, DT, RF, Adaptive Boosting (AdaBoost), Gradient Boosting (GB), Support Vector Regression (SVR), XGBoost, Bagging, Partial Least Squares (PLS), and Ridge Regression─are applied to the data set and their performance in predicting the sensor output current is compared. To further enhance prediction performance, a novel ensemble model is proposed that integrates DT, RF, GB, XGBoost, and Bagging regressors, leveraging their combined strengths to offset individual weaknesses. 
The main benefit of this work lies in its ability to enhance MIP-based sensor performance by developing a novel stacking regressor ensemble model, which improves prediction performance and reliability. This methodology is broadly applicable to the development of other sensors with different transducers and sensing elements. Through extensive simulation results, the proposed stacking regressor ensemble model demonstrated superior predictive performance compared to individual ML models. The model achieved an R-squared (R²) of 0.993, significantly reducing the root-mean-square error (RMSE) to 0.436 and the mean absolute error (MAE) to 0.244. These improvements enhanced sensitivity and reliability of the MIP-based electrochemical sensor, demonstrating a substantial performance gain over individual ML models.
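
The abstract names min–max scaling and reports R², RMSE, and MAE without defining them. The sketch below shows the standard textbook definitions in plain Python. It is not the authors' code: the function names (`min_max_scale`, `regression_metrics`) and all numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from math import sqrt


def min_max_scale(column):
    """Rescale one feature column to [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(column), max(column)
    if hi == lo:  # constant feature: nothing to rescale
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def regression_metrics(y_true, y_pred):
    """Return (R^2, RMSE, MAE) for paired observed and predicted values."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)            # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    rmse = sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r2, rmse, mae


# Hypothetical values for illustration only, not data from the paper.
feature_column = [5.0, 6.0, 7.0, 9.0]
scaled = min_max_scale(feature_column)

observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]  # one prediction is off by 2
r2, rmse, mae = regression_metrics(observed, predicted)
```

Because RMSE squares each residual before averaging, a single large error raises it more than it raises MAE, so RMSE is never smaller than MAE on the same predictions. The reported pair (RMSE 0.436, MAE 0.244) follows that ordering.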


Source journal

ACS Sensors (Chemical Engineering-Bioengineering)

CiteScore: 14.50

Self-citation rate: 3.40%

Articles published: 372

Journal description: ACS Sensors is a peer-reviewed research journal that focuses on the dissemination of new and original knowledge in the field of sensor science, particularly those that selectively sense chemical or biological species or processes. The journal covers a broad range of topics, including but not limited to biosensors, chemical sensors, gas sensors, intracellular sensors, single molecule sensors, cell chips, and microfluidic devices. It aims to publish articles that address conceptual advances in sensing technology applicable to various types of analytes or application papers that report on the use of existing sensing concepts in new ways or for new analytes.