Ecotoxicity prediction of chemical compounds using machine learning and different molecular structure representations
Michał Marek, Rafał Kurczab
Green Analytical Chemistry, volume 13, Article 100273. Published 2025-04-29.
DOI: 10.1016/j.greeac.2025.100273
https://www.sciencedirect.com/science/article/pii/S2772577425000692
Abstract
Advances in computational tools have enabled interdisciplinary approaches in toxicology, allowing chemists to study the toxicity and ecotoxicity of chemical compounds while minimizing ethically questionable or hazardous methods. This paper presents models for predicting chemical ecotoxicity (HC50) built with machine learning algorithms and different molecular representations. The descriptor set comprised 100 molecular descriptors calculated using RDKit, 15 molecular connectivity (Chi) indices combined with shape (Kappa) indices, and MACCS and ECFP4 binary molecular fingerprints. In ten-fold cross-validation, the best model achieved an average RMSE of 0.740, an R² of 0.708, and an MAE of 0.546. Analysis of the most important molecular descriptors identified logP, molar mass, heavy atom molar mass, Ipc, and the number of valence electrons as significant contributors to the prediction of chemical ecotoxicity. The model thus supports ecotoxicity prediction and also offers insight into the physicochemical properties that influence a molecule's ecotoxic profile, highlighting the potential of in silico approaches for ethical and efficient toxicology research.
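The abstract reports RMSE, R² and MAE averaged over ten-fold cross-validation. The following is a minimal sketch of how such averages can be computed, using only the Python standard library. It is not the authors' pipeline: the single-feature least-squares model, the helper names (`k_fold_indices`, `fit_line`, `regression_metrics`, `cross_validate`) and the synthetic data are all assumptions made for illustration, and the numbers it prints have no relation to the values reported in the paper.

```python
import math
import random
from statistics import mean


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b with one feature.

    Stand-in model (assumption): the abstract does not name the algorithm.
    """
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, y_bar - a * x_bar


def regression_metrics(y_true, y_pred):
    """Return RMSE, R^2 and MAE for one held-out fold."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)
    y_bar = mean(y_true)
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return {
        "rmse": math.sqrt(ss_res / n),
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(r) for r in residuals) / n,
    }


def cross_validate(xs, ys, k=10, seed=0):
    """Fit on k-1 folds, score on the held-out fold, average the scores."""
    per_fold = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = [a * xs[i] + b for i in held_out]
        per_fold.append(regression_metrics([ys[i] for i in held_out], preds))
    return {name: mean(m[name] for m in per_fold) for name in ("rmse", "r2", "mae")}


# Synthetic stand-in data (assumption): one descriptor-like feature and a
# noisy linear target. These are NOT measured HC50 values.
rng = random.Random(42)
x_demo = [rng.uniform(-2.0, 8.0) for _ in range(200)]
y_demo = [0.5 * x + 1.0 + rng.gauss(0.0, 0.7) for x in x_demo]

cv_scores = cross_validate(x_demo, y_demo, k=10)
print(cv_scores)
```

Averaging per-fold scores, as done here, is one common convention; pooling all held-out predictions and scoring them once is another, and the two can give slightly different R² values. The abstract does not say which was used.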