Determining the quality of imprinted polymers using diverse feature selections methods, Ada Boost and Gradient boosting algorithms
Authors: Bita Yarahmadi, Seyed Majid Hashemianzadeh
Journal: Results in Materials, volume 27, Article 100722
Publication date: 2025-05-12
Publication type: Journal Article
DOI: 10.1016/j.rinma.2025.100722
URL: https://www.sciencedirect.com/science/article/pii/S2590048X25000676
Citations: 0
Abstract
Polymer informatics offers a practical way to address the difficulty of optimizing polymer synthesis conditions, and it has attracted the attention of many researchers. The aim of this study is to develop a comprehensive model that predicts the imprinting factor (IF) for different template molecules. To this end, molecularly imprinted polymers (MIPs) were synthesized for various template molecules, their IF values were calculated, and the results were compiled into a data set. Models for predicting IF were then built with the Ada Boost and gradient boosting algorithms in combination with various feature selection methods, such as forward selection, mutual information, correlation statistics, chi-square, and Recursive Feature Elimination (RFE). The combination of Ada Boost and RFE gave the most accurate model, with the highest R² score (R² = 0.937, adjusted R² = 0.936) and the lowest errors (MAE = 0.915, MSE = 7.052). Models built with gradient boosting were less accurate than those built with Ada Boost; for gradient boosting, the highest accuracy was obtained with the mutual information method. The paper reports quantitatively how each feature selection method affected prediction performance, supported by the relevant metrics.
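
The four metrics quoted in the abstract (R², adjusted R², MAE, MSE) can be computed from paired observed and predicted IF values. The following is a minimal sketch using only the Python standard library. The function name `regression_metrics`, the sample values, and the feature count are hypothetical illustrations, not data or code from the paper; the abstract does not state the number of samples or selected features, so the adjusted R² below assumes the usual definition 1 − (1 − R²)(n − 1)/(n − p − 1).

```python
from statistics import fmean


def regression_metrics(y_true, y_pred, n_features):
    """Return R^2, adjusted R^2, MAE and MSE for paired observations.

    n_features is the number of predictors p kept after feature selection;
    it only affects the adjusted R^2.
    """
    n = len(y_true)
    if n != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if n - n_features - 1 <= 0:
        raise ValueError("adjusted R^2 needs n > n_features + 1")

    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mae = fmean([abs(r) for r in residuals])   # mean absolute error
    mse = fmean([r * r for r in residuals])    # mean squared error

    mean_true = fmean(y_true)
    ss_res = sum(r * r for r in residuals)                # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)    # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all y_true values are equal")

    r2 = 1.0 - ss_res / ss_tot
    # Assumed standard definition: penalizes R^2 for the number of predictors.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)
    return {"r2": r2, "adj_r2": adj_r2, "mae": mae, "mse": mse}


# Hypothetical IF values, for illustration only (not taken from the paper).
observed_if = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
predicted_if = [1.5, 2.0, 2.5, 4.0, 5.5, 6.0]
metrics = regression_metrics(observed_if, predicted_if, n_features=2)
print(metrics)
```

Because the adjustment factor (n − 1)/(n − p − 1) approaches 1 as the number of samples grows relative to the number of features, R² and adjusted R² can lie very close together, as the reported values 0.937 and 0.936 do.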