Machine learning classification of steatogenic compounds using toxicogenomics profiles

IF 4.6 | Zone 3 (Medicine) | JCR Q1 (Pharmacology & Pharmacy)

Toxicology Pub Date : 2025-07-18 DOI:10.1016/j.tox.2025.154237

Brian Bwanya, Saad Lodhi, Theo M. de Kok, Luiz Ladeira, Marcha CT Verheijen, Danyel GJ Jennen, Florian Caiment

Toxicology, volume 517, Article 154237. Full text: https://www.sciencedirect.com/science/article/pii/S0300483X25001969

Citations: 0

Abstract

The transition toward new approach methodologies for toxicity testing has accelerated the development of computational models that use transcriptomic data to predict chemical-induced adverse effects. Here, we applied supervised machine learning to gene expression data derived from primary human hepatocytes and rat liver models (in vitro and in vivo) to predict drug-induced hepatic steatosis. We evaluated five machine learning classifiers using microarray data from the Open TG-GATEs database. Among these, the support vector machine (SVM) consistently achieved the highest performance, with an area under the receiver operating characteristic curve (ROC-AUC) of 0.820 in primary human hepatocytes, 0.975 in the rat in vitro model, and 0.966 in the rat in vivo model. To gain mechanistic insights, we functionally profiled the top-ranked predictive genes. Enrichment analyses revealed strong associations with lipid metabolism, mitochondrial function, insulin signalling, and oxidative stress, all biological processes central to steatosis pathogenesis. Key predictive genes such as CYP1A1, PLIN2, and GCK mapped to lipid metabolism networks and liver disease annotations, while others highlighted novel transcriptomic signals. Integration with differentially expressed genes and known steatosis markers highlighted both overlapping and distinct molecular features, suggesting that machine learning models capture biologically relevant signals. These findings demonstrate the potential of machine learning models guided by transcriptomic data to identify early molecular signatures of drug-induced hepatic steatosis. The support vector machine model’s strong predictive accuracy across species highlights its promise as a scalable and interpretable tool for chemical risk assessment. As data limitations in human toxicology persist, expanding high-quality transcriptomic resources will be critical to further advance non-animal approaches in regulatory toxicology.
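For readers unfamiliar with the metric reported above, ROC-AUC can be read as the probability that a classifier scores a randomly chosen positive (here, a steatogenic compound) higher than a randomly chosen negative. The following is a minimal sketch of that definition only, not the authors' pipeline: the function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from Open TG-GATEs.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC-AUC needs at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = steatogenic, 0 = non-steatogenic compound.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1, 0.7]

auc = roc_auc(toy_labels, toy_scores)
print(f"ROC-AUC on toy data: {auc:.4f}")
```

In this toy set, 15 of the 16 positive-negative pairs are ranked correctly, so the value is 15/16 = 0.9375. The pairwise loop is quadratic in the number of samples; it is chosen here for readability, and a rank-based computation would be preferable on larger datasets.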


Source journal

Toxicology (Medicine: Toxicology)

CiteScore: 7.80

Self-citation rate: 4.40%

Articles published: 222

Review time: 23 days

About the journal: Toxicology is an international, peer-reviewed journal that publishes only the highest quality original scientific research and critical reviews describing hypothesis-based investigations into mechanisms of toxicity associated with exposures to xenobiotic chemicals, particularly as they relate to human health. In this respect, "mechanisms" is defined on both the macro (e.g. physiological, biological, kinetic, species, sex, etc.) and molecular (genomic, transcriptomic, metabolic, etc.) scale. Emphasis is placed on findings that identify novel hazards and that can be extrapolated to exposures and mechanisms that are relevant to estimating human risk. Toxicology also publishes brief communications, personal commentaries and opinion articles, as well as concise expert reviews on contemporary topics. All research and review articles published in Toxicology are subject to rigorous peer review. Authors are asked to contact the Editor-in-Chief prior to submitting review articles or commentaries for consideration for publication in Toxicology.