Machine learning classification of steatogenic compounds using toxicogenomics profiles
Brian Bwanya, Saad Lodhi, Theo M. de Kok, Luiz Ladeira, Marcha CT Verheijen, Danyel GJ Jennen, Florian Caiment
Toxicology, volume 517, Article 154237. Published 2025-07-18. DOI: 10.1016/j.tox.2025.154237
https://www.sciencedirect.com/science/article/pii/S0300483X25001969
Impact factor: 4.6. JCR: Q1 (Pharmacology & Pharmacy). Open access: no.
Citation count: 0
Abstract
The transition toward new approach methodologies for toxicity testing has accelerated the development of computational models that use transcriptomic data to predict chemical-induced adverse effects. Here, we applied supervised machine learning to gene expression data derived from primary human hepatocytes and rat liver models (in vitro and in vivo) to predict drug-induced hepatic steatosis. We evaluated five machine learning classifiers using microarray data from the Open TG-GATEs database. Among these, support vector machine (SVM) consistently achieved the highest performance, with area under the receiver operating characteristic curve (ROC-AUC) of 0.820 in primary human hepatocytes, 0.975 in the rat in vitro model, and 0.966 in the rat in vivo model. To gain mechanistic insights, we functionally profiled the top-ranked predictive genes. Enrichment analyses revealed strong associations with lipid metabolism, mitochondrial function, insulin signalling, and oxidative stress, all of which are biological processes central to steatosis pathogenesis. Key predictive genes such as CYP1A1, PLIN2, and GCK mapped to lipid metabolism networks and liver disease annotations, while others highlighted novel transcriptomic signals. Integration with differentially expressed genes and known steatosis markers highlighted both overlapping and distinct molecular features, suggesting that machine learning models capture biologically relevant signals. These findings demonstrate the potential of machine learning models guided by transcriptomic data to identify early molecular signatures of drug-induced hepatic steatosis. The support vector machine model’s strong predictive accuracy across species highlights its promise as a scalable and interpretable tool for chemical risk assessment. As data limitations in human toxicology persist, expanding high-quality transcriptomic resources will be critical to further advance non-animal approaches in regulatory toxicology.
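For readers unfamiliar with the metric reported above, ROC-AUC can be read as the probability that a randomly chosen positive sample (here, a steatogenic compound) receives a higher classifier score than a randomly chosen negative one. The sketch below illustrates that definition only; it is not the authors' pipeline, and the function name `roc_auc` and the toy labels and scores are hypothetical values invented for the example, not data from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data (hypothetical, not from the paper):
# 1 = steatogenic, 0 = non-steatogenic; scores are classifier outputs.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy ROC-AUC = {toy_auc:.3f}")  # 8 of 9 pairs ranked correctly: 0.889
```

Under this reading, a value of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes. The pairwise loop is quadratic in sample count, which is acceptable for illustration; a rank-based computation would be preferable for large datasets.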
About the journal:
Toxicology is an international, peer-reviewed journal that publishes only the highest quality original scientific research and critical reviews describing hypothesis-based investigations into mechanisms of toxicity associated with exposures to xenobiotic chemicals, particularly as it relates to human health. In this respect "mechanisms" is defined on both the macro (e.g. physiological, biological, kinetic, species, sex, etc.) and molecular (genomic, transcriptomic, metabolic, etc.) scale. Emphasis is placed on findings that identify novel hazards and that can be extrapolated to exposures and mechanisms that are relevant to estimating human risk. Toxicology also publishes brief communications, personal commentaries and opinion articles, as well as concise expert reviews on contemporary topics. All research and review articles published in Toxicology are subject to rigorous peer review. Authors are asked to contact the Editor-in-Chief prior to submitting review articles or commentaries for consideration for publication in Toxicology.