Evaluating structure-based activity in a high-throughput assay for steroid biosynthesis
Miran J Foster, Grace Patlewicz, Imran Shah, Derik E. Haggard, Richard S. Judson, Katie Paul Friedman
Computational Toxicology, volume 24, Article 100245. Published 2022-11-01. DOI: 10.1016/j.comtox.2022.100245
https://www.sciencedirect.com/science/article/pii/S2468111322000330
Citations: 1
Abstract
Data from a high-throughput human adrenocortical carcinoma assay (HT-H295R) for steroid hormone biosynthesis are available for >2000 chemicals in single concentration and 654 chemicals in multi-concentration (mc). Previously, a metric describing the effect size of a chemical on the biosynthesis of 11 hormones, referred to as the maximum mean Mahalanobis distance (maxmMd), was derived using mc data. However, mc HT-H295R assay data remain unavailable for many chemicals. This work leverages existing HT-H295R assay data by constructing structure–activity relationships to make predictions for data-poor chemicals, including: (1) identification of individual structural descriptors, known as ToxPrint chemotypes, associated with increased odds of affecting estrogen or androgen synthesis; (2) a random forest (RF) classifier using physicochemical property descriptors to predict HT-H295R maxmMd binary (positive or negative) outcomes; and (3) a local approach to predict maxmMd binary outcomes using nearest neighbors (NNs) based on two types of chemical fingerprints (chemotype or Morgan). Individual chemotypes demonstrated high specificity (85–98%) but low sensitivity for modulators of estrogen and androgen synthesis. The best RF model for maxmMd classification included 13 predicted physicochemical descriptors, yielding a balanced accuracy (BA) of 71%, with only modest improvement when hundreds of structural features were added. The two best NN models for binary maxmMd prediction demonstrated BAs of 85% and 81% using chemotype and Morgan fingerprints, respectively. In an external test set of 6302 chemicals lacking HT-H295R data, 1241 were identified as putative estrogen and androgen modulators. Combined results across the three classification models (global RF model and two local NN models) predict that 1033 of the 6302 chemicals would be more likely to affect HT-H295R bioactivity.
Together, these in silico approaches can efficiently prioritize thousands of untested chemicals for screening to further evaluate their effects on steroid biosynthesis.
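To make the local nearest-neighbor approach and the balanced accuracy metric concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not state the similarity measure, the number of neighbors, or the voting rule, so Tanimoto similarity, k = 3, and a simple majority vote are assumptions here. Fingerprints are represented as sets of "on" feature names, and the feature names (f1, f2, ...) and toy labels are hypothetical placeholders rather than real ToxPrint chemotypes or assay results.

```python
from typing import FrozenSet, List, Sequence, Tuple

Fingerprint = FrozenSet[str]  # set of "on" structural features (hypothetical names)


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity: shared features / all features."""
    union = a | b
    if not union:
        return 0.0  # two empty fingerprints: treat as dissimilar (assumption)
    return len(a & b) / len(union)


def nn_predict(query: Fingerprint,
               training: Sequence[Tuple[Fingerprint, int]],
               k: int = 3) -> int:
    """Predict a binary outcome (1 = positive, 0 = negative) by majority
    vote over the k most similar labelled chemicals. k and the voting
    rule are illustrative assumptions, not values from the paper."""
    scored = sorted(
        ((tanimoto(query, fp), label) for fp, label in training),
        key=lambda pair: pair[0],
        reverse=True,
    )
    neighbours = scored[:k]
    positives = sum(label for _, label in neighbours)
    return 1 if 2 * positives > len(neighbours) else 0


def balanced_accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Mean of sensitivity and specificity for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present in y_true")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return 0.5 * (tp / n_pos + tn / n_neg)


# Toy labelled set: (fingerprint, binary outcome). Entirely made up.
TRAINING: List[Tuple[Fingerprint, int]] = [
    (frozenset({"f1", "f2", "f3"}), 1),
    (frozenset({"f1", "f2"}), 1),
    (frozenset({"f4", "f5"}), 0),
    (frozenset({"f5", "f6"}), 0),
]

if __name__ == "__main__":
    untested = [frozenset({"f1", "f2", "f7"}), frozenset({"f5"})]
    for fp in untested:
        print(sorted(fp), "->", nn_predict(fp, TRAINING, k=3))
```

Balanced accuracy is a natural metric when positives and negatives are unevenly represented, because a model that labels every chemical negative scores only 50% rather than being rewarded for the majority class.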
About the journal:
Computational Toxicology is an international journal publishing computational approaches that assist in the toxicological evaluation and safety assessment of new and existing chemical substances.
- All effects relating to human health and environmental toxicity and fate
- Prediction of toxicity, metabolism, fate and physico-chemical properties
- The development of models from read-across, (Q)SARs, PBPK, QIVIVE, Multi-Scale Models
- Big Data in toxicology: integration, management, analysis
- Implementation of models through AOPs, IATA, TTC
- Regulatory acceptance of models: evaluation, verification and validation
- From metals, to small organic molecules to nanoparticles
- Pharmaceuticals, pesticides, foods, cosmetics, fine chemicals
- Bringing together the views of industry, regulators, academia, NGOs