Determining the quality of imprinted polymers using diverse feature selections methods, Ada Boost and Gradient boosting algorithms

Bita Yarahmadi, Seyed Majid Hashemianzadeh
Results in Materials, vol. 27, Article 100722
Published: 2025-05-12
DOI: 10.1016/j.rinma.2025.100722
https://www.sciencedirect.com/science/article/pii/S2590048X25000676

Abstract

Polymer informatics offers a way to address the difficulty of optimizing polymer synthesis conditions and has attracted the attention of many researchers. The aim of this study is to develop a comprehensive model that predicts the imprinting factor (IF) for different template molecules. To this end, molecularly imprinted polymers (MIPs) were synthesized for various template molecules, their IF values were calculated, and the results were compiled into a data set. Models to predict IF were then built with the AdaBoost and gradient boosting algorithms, combined with several feature selection methods: forward selection, mutual information, correlation statistics, chi-square, and Recursive Feature Elimination (RFE). The combination of AdaBoost with RFE gave the most accurate model, with the highest R² score (R² = 0.937, adjusted R² = 0.936) and the lowest errors (MAE = 0.915, MSE = 7.052). Models built with gradient boosting were less accurate than those built with AdaBoost; among the gradient boosting models, the highest accuracy was obtained with the mutual information method. The paper reports, with the relevant metrics, how each feature selection method affected prediction performance.
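The four metrics quoted in the abstract (R², adjusted R², MAE, MSE) can be illustrated with a minimal sketch based on their standard textbook definitions. The abstract does not say how the authors computed them, so the function name `regression_metrics`, the sample IF values, and the feature count below are hypothetical and serve only to show how the numbers relate to one another.

```python
from statistics import fmean


def regression_metrics(y_true, y_pred, n_features):
    """Return R2, adjusted R2, MAE and MSE for one set of predictions.

    Hypothetical helper using the standard definitions; it is not taken
    from the paper.
    """
    n = len(y_true)
    if n != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if n - n_features - 1 <= 0:
        raise ValueError("adjusted R2 needs more samples than n_features + 1")

    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mae = fmean([abs(r) for r in residuals])   # mean absolute error
    mse = fmean([r * r for r in residuals])    # mean squared error

    mean_true = fmean(y_true)
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all true values are equal")

    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R2 penalizes R2 for the number of features used by the model.
    adjusted_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)
    return {"r2": r2, "adjusted_r2": adjusted_r2, "mae": mae, "mse": mse}


if __name__ == "__main__":
    # Made-up IF values, for illustration only (not data from the study).
    measured = [1.0, 2.0, 3.0, 4.0, 5.0]
    predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
    print(regression_metrics(measured, predicted, n_features=2))
```

Because adjusted R² depends on the number of selected features, it is the more informative figure when comparing feature selection methods that keep different numbers of features.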