Prediction of ionic liquid toxicity by interpretable machine learning
Authors: Haijun Feng, Li Jiajia, Zhou Jian
Journal: Chinese Journal of Chemical Engineering, Volume 84, Pages 201-210 (published 2025-08-01)
DOI: 10.1016/j.cjche.2025.04.018
URL: https://www.sciencedirect.com/science/article/pii/S1004954125002125
The potential toxicity of ionic liquids (ILs) affects their applications, and controlling that toxicity is one of the key issues in their use. To understand the toxicity-structure relationship of ILs and promote their greener application, six machine learning algorithms, namely Bagging, Adaptive Boosting (AdaBoost), Gradient Boosting (GBoost), Stacking, Voting and Categorical Boosting (CatBoost), are used to model the toxicity of ILs on four distinct datasets: the rat leukemia cell line IPC-81 (IPC-81), acetylcholinesterase (AChE), Escherichia coli (E. coli) and Vibrio fischeri. Molecular descriptors derived from simplified molecular input line entry system (SMILES) strings are used to characterize the ILs. All models are assessed by the mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE) and coefficient of determination (R²). Additionally, an interpretation model based on SHapley Additive exPlanations (SHAP) is built to determine the positive and negative effects of each molecular feature on toxicity. With its additional parameters and complexity, the CatBoost model outperforms the other models, making it a more reliable model for predicting IL toxicity. The interpretation results indicate that the most significant positive features (SMR_VSA5, PEOE_VSA8, Kappa2, PEOE_VSA6 and EState_VSA1) increase the toxicity of ILs as their values rise, while the most significant negative features (VSA_EState7, EState_VSA8, PEOE_VSA9 and FpDensityMorgan1) decrease the toxicity as their values rise. In addition, the toxicity of an IL increases with its average molecular weight and number of pyridine rings, and decreases as its number of hydrogen bond acceptors increases. These findings offer a theoretical foundation for the rapid screening and synthesis of environmentally benign ILs.
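For reference, the four evaluation metrics named in the abstract (MSE, RMSE, MAE and R²) can be computed as in the minimal sketch below. It uses only the Python standard library. The function name `regression_metrics` and the toy values are illustrative assumptions and are not taken from the paper, which does not describe its implementation here.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for paired observed/predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    ss_res = sum(r * r for r in residuals)        # residual sum of squares
    mse = ss_res / n                              # mean square error
    rmse = math.sqrt(mse)                         # root mean square error
    mae = sum(abs(r) for r in residuals) / n      # mean absolute error

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    # R^2 is undefined when all observed values are identical.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "R2": r2}


# Hypothetical toxicity values (for example on a log scale); not data from the paper.
observed = [2.0, 3.0, 4.0, 5.0]
predicted = [2.5, 2.5, 4.0, 5.5]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Lower MSE, RMSE and MAE and an R² closer to 1 indicate a better fit, which is the sense in which one model can be said to outperform another on a given dataset.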
About the journal:
The Chinese Journal of Chemical Engineering (monthly, started in 1982) is the official journal of the Chemical Industry and Engineering Society of China and is published by the Chemical Industry Press Co. Ltd. The aim of the journal is to promote the international exchange of scientific and technical information in the field of chemical engineering. It publishes original research papers that cover the major advancements and achievements in chemical engineering in China, as well as some articles from overseas contributors.
The journal's topics include chemical engineering, chemical technology, biochemical engineering, energy and environmental engineering, and other relevant fields. Papers are published on the basis of their relevance to theoretical research, practical application or potential industrial use, as Research Papers, Communications, Reviews and Perspectives. Prominent domestic and overseas chemical experts and scholars have been invited to form an International Advisory Board and the Editorial Committee. The journal is recognized in Chinese academia and industry as a reliable source of information on current chemical engineering research, both in China and abroad.