Machine learning prediction of dioxin lipophilicity and key feature identification
Authors: Yingwei Wang, Yufei Li
Journal: Computational and Theoretical Chemistry, volume 1244, Article 115032
Publication date: 2025-02-01
DOI: 10.1016/j.comptc.2024.115032
URL: https://www.sciencedirect.com/science/article/pii/S2210271X24005711
Citations: 0
Abstract
Dioxins are potent exogenous ligands for the aryl hydrocarbon receptor (AHR) within human cell membranes. Their lipophilicity is a critical factor in AHR-mediated immunotoxicity. This study uses experimental lipophilicity data for certain polychlorinated dibenzo-p-dioxins (PCDDs) as the dependent variable, and molecular descriptors of PCDDs as independent variables, to build five machine learning models for predicting PCDD lipophilicity. The evaluation metrics indicate that the XGBoost model shows excellent predictive performance, achieving 86% accuracy in predicting the log Kow (octanol-water partition coefficient) values of 75 PCDDs. An XGBoost-Bayesian stacked model, built with a stacking algorithm, raised the prediction accuracy to 96%. This improved model was then applied to predict the log Kow values of 175 polychlorinated dibenzofurans (PCDFs), and the predictions were validated through molecular membrane dynamics. The SHAP method identified the key molecular descriptors influencing dioxin lipophilicity. This study offers a theoretical basis for investigating AHR-mediated dioxin toxicity.
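The abstract names SHAP as the feature-attribution method but gives no detail. As a rough illustration of the underlying idea only, the sketch below computes exact Shapley values for a single sample by enumerating every feature coalition, with absent features set to a baseline value. The model `toy_log_kow`, its coefficients, the three descriptor names, and the sample and baseline values are all hypothetical and are not taken from the paper; the SHAP library itself uses faster estimators than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attributions for one sample.

    Features outside a coalition are replaced by their baseline value.
    The cost grows as 2**n, so this only suits a handful of features.
    """
    n = len(x)
    features = range(n)

    def value(coalition):
        # Coalition features come from x, all others from the baseline.
        mixed = [x[i] if i in coalition else baseline[i] for i in features]
        return predict(mixed)

    phi = []
    for i in features:
        others = [j for j in features if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_log_kow(d):
    # Hypothetical stand-in model, NOT the paper's fitted XGBoost model.
    n_chlorine, polarizability, dipole = d
    return (4.0 + 0.5 * n_chlorine + 0.02 * polarizability
            - 0.3 * dipole + 0.01 * n_chlorine * polarizability)


# Hypothetical descriptor values for one congener and a reference point.
sample = [4.0, 30.0, 1.0]
baseline = [0.0, 20.0, 0.0]

phi = shapley_values(toy_log_kow, sample, baseline)
for name, contribution in zip(["n_chlorine", "polarizability", "dipole"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The attributions sum to the difference between the prediction for the sample and the prediction for the baseline, which is the property that lets descriptors be ranked by their contribution to a predicted log Kow.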
About the journal:
Computational and Theoretical Chemistry publishes high-quality, original reports of significance in computational and theoretical chemistry, including those that deal with problems of structure, properties, energetics, weak interactions, reaction mechanisms, catalysis, and reaction rates involving atoms, molecules, clusters, surfaces, and bulk matter.