Evaluating structure-based activity in a high-throughput assay for steroid biosynthesis

Impact factor: 3.1 · JCR Q2 · Toxicology
Miran J Foster, Grace Patlewicz, Imran Shah, Derik E. Haggard, Richard S. Judson, Katie Paul Friedman
Computational Toxicology, Volume 24, Article 100245. Published 2022-11-01. DOI: 10.1016/j.comtox.2022.100245. URL: https://www.sciencedirect.com/science/article/pii/S2468111322000330
Citations: 1

Abstract

Data from a high-throughput human adrenocortical carcinoma assay (HT-H295R) for steroid hormone biosynthesis are available for more than 2000 chemicals at a single concentration and for 654 chemicals in multi-concentration (mc) format. Previously, a metric describing the effect size of a chemical on the biosynthesis of 11 hormones, the maximum mean Mahalanobis distance (maxmMd), was derived from the mc data. However, mc HT-H295R assay data remain unavailable for many chemicals. This work leverages existing HT-H295R assay data by constructing structure–activity relationships to make predictions for data-poor chemicals, using three approaches: (1) identification of individual structural descriptors, known as ToxPrint chemotypes, associated with increased odds of affecting estrogen or androgen synthesis; (2) a random forest (RF) classifier using physicochemical property descriptors to predict HT-H295R maxmMd binary (positive or negative) outcomes; and (3) a local approach to predict maxmMd binary outcomes using nearest neighbors (NNs) based on two types of chemical fingerprints (chemotype or Morgan). Individual chemotypes demonstrated high specificity (85–98%) for modulators of estrogen and androgen synthesis but low sensitivity. The best RF model for maxmMd classification included 13 predicted physicochemical descriptors and yielded a balanced accuracy (BA) of 71%, with only modest improvement when hundreds of structural features were added. The two best NN models for binary maxmMd prediction demonstrated BAs of 85% and 81% using chemotype and Morgan fingerprints, respectively. In an external test set of 6302 chemicals lacking HT-H295R data, 1241 were identified as putative estrogen and androgen modulators. Combined results across the three classification models (the global RF model and the two local NN models) predict that 1033 of the 6302 chemicals would be more likely to affect HT-H295R bioactivity. Together, these in silico approaches can efficiently prioritize thousands of untested chemicals for screening to further evaluate their effects on steroid biosynthesis.
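
The local nearest-neighbor approach and the balanced accuracy metric mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state the similarity measure, the number of neighbors, or the voting rule, so the Tanimoto (Jaccard) similarity, k = 3, and the simple majority vote used here are assumptions, and the fingerprints are made-up bit sets rather than real chemotype or Morgan fingerprints.

```python
from typing import Dict, FrozenSet, List, Tuple

# A fingerprint is modeled as the set of indices of its "on" bits (assumption).
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity: shared bits divided by total distinct bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def nn_predict(query: Fingerprint,
               training: List[Tuple[Fingerprint, int]],
               k: int = 3) -> int:
    """Predict a binary outcome by majority vote of the k most similar chemicals."""
    scored = sorted(((tanimoto(query, fp), label) for fp, label in training),
                    key=lambda pair: pair[0], reverse=True)
    neighbors = scored[:k]
    positives = sum(label for _, label in neighbors)
    # Strict majority is required for a positive call; ties go to negative.
    return 1 if 2 * positives > len(neighbors) else 0


def classification_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Sensitivity, specificity, and balanced accuracy for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Balanced accuracy is the mean of sensitivity and specificity.
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Hypothetical training set: (fingerprint, maxmMd-style binary label).
TRAINING: List[Tuple[Fingerprint, int]] = [
    (frozenset({1, 2, 3}), 1),
    (frozenset({1, 2, 4}), 1),
    (frozenset({2, 3, 5}), 1),
    (frozenset({7, 8, 9}), 0),
    (frozenset({7, 8, 10}), 0),
    (frozenset({8, 9, 11}), 0),
]

if __name__ == "__main__":
    queries = [frozenset({1, 2, 3, 4}), frozenset({7, 8, 9, 10})]
    print([nn_predict(q, TRAINING) for q in queries])
    print(classification_metrics([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1]))
```

Because balanced accuracy averages sensitivity and specificity, a predictor with high specificity but low sensitivity, as reported for the individual chemotypes, is not rewarded for simply calling most chemicals negative.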

Source journal
Computational Toxicology
Subject category: Computer Science, Computer Science Applications
CiteScore: 5.50
Self-citation rate: 0.00%
Articles published: 53
Review time: 56 days
Journal description: Computational Toxicology is an international journal publishing computational approaches that assist in the toxicological evaluation and safety assessment of new and existing chemical substances. Its scope covers:
- All effects relating to human health and environmental toxicity and fate
- Prediction of toxicity, metabolism, fate and physico-chemical properties
- The development of models from read-across, (Q)SARs, PBPK, QIVIVE, Multi-Scale Models
- Big Data in toxicology: integration, management, analysis
- Implementation of models through AOPs, IATA, TTC
- Regulatory acceptance of models: evaluation, verification and validation
- From metals, to small organic molecules to nanoparticles
- Pharmaceuticals, pesticides, foods, cosmetics, fine chemicals
- Bringing together the views of industry, regulators, academia, NGOs