Enhanced prediction of ionic liquid toxicity using a meta-ensemble learning framework with data augmentation

Safa Sadaghiyanfam, Hiqmet Kamberaj, Yalcin Isler
Journal: Artificial intelligence chemistry, vol. 3, no. 1, Article 100087. Published: 2025-03-05. DOI: 10.1016/j.aichem.2025.100087. URL: https://www.sciencedirect.com/science/article/pii/S2949747725000041
Citations: 0

Abstract


Enhanced prediction of ionic liquid toxicity using a meta-ensemble learning framework with data augmentation

Ionic liquids have unique properties and potential as green solvents, but concerns about their toxicity remain, creating a need for accurate predictive models to support safe design and application. This work introduces a general, robust meta-ensemble learning framework that predicts the toxicity of ionic liquids from molecular descriptors and fingerprints. The proposed model uses Random Forest, Support Vector Regression, Categorical Boosting (CatBoost), and a Chemical Convolutional Neural Network as base learners, with Extreme Gradient Boosting (XGBoost) as the meta-learner. The framework uses Recursive Feature Elimination for feature selection and GridSearchCV for hyperparameter tuning. Without data augmentation, the model reaches an RMSE of 0.38, an MAE of 0.29, a coefficient of determination (R²) of 0.87, and a Pearson correlation of 0.94. With data augmentation, performance improves to RMSE = 0.06, MAE = 0.024, R² = 0.99, and a Pearson correlation of 0.99. These results indicate that the data-augmented model outperforms existing models in robustness and predictive capacity. The framework thus provides an improved tool for computer-aided molecular design of safer and more effective ionic liquids.
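
The four metrics reported in the abstract (RMSE, MAE, R², and Pearson correlation) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `regression_metrics` and the sample values are hypothetical and chosen only to show how each quantity is defined, using the Python standard library alone.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, R^2 and Pearson r for two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]

    # Error-based metrics: root mean squared error and mean absolute error.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n

    # Coefficient of determination: 1 - (residual sum of squares / total sum of squares).
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if ss_tot == 0 or var_p == 0:
        raise ValueError("R^2 and Pearson r are undefined for constant inputs")
    r2 = 1.0 - ss_res / ss_tot

    # Pearson correlation: covariance divided by the product of the spreads.
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    pearson = cov / math.sqrt(ss_tot * var_p)

    return {"rmse": rmse, "mae": mae, "r2": r2, "pearson": pearson}


# Made-up values for illustration only (not data from the paper).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

R² and the Pearson correlation measure different things: R² penalizes any systematic offset between predictions and targets, while Pearson r only measures linear association. This is why the abstract reports both.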
Source journal: Artificial intelligence chemistry (Chemistry, General). Self-citation rate: 0.00%; articles published: 0; review time: 21 days.