Can the Predictive Analytics Toolkit (PAT) handle a genomic data set?

Impact factor: 3.1; JCR quartile: Q2 (Toxicology)
Ted W. Simon, Louis A. (Tony) Cox, Richard A. Becker
Computational Toxicology, published 2022-11-01. DOI: 10.1016/j.comtox.2022.100241
Citations: 0

Abstract

The Predictive Analytics Toolkit (PAT) was developed to facilitate the use of new approach methodologies (NAMs) to predict health hazards and risks from chemicals. PAT is a user-friendly web application that integrates many R packages to enable development and testing of prediction models without any programming. We drew on the work of Ring et al. (2021; https://doi.org/10.1016/j.comtox.2021.100166), who used random forest models to predict in vivo transcriptomic responses in rat liver from in vitro Tox21 AC50 values for a set of 221 chemicals. Gene ontologies helped identify 735 biological pathways based on differential in vivo expression of specific gene sets. Ring et al. used 12 models that varied in their use of toxicokinetics to predict in vivo activity, with 5000 random forest iterations for each chemical/pathway combination; model performance was measured by the area under the receiver operating characteristic curve (AUC-ROC). The highest-ranking model (Model 10) used Tox21 AC50 nominal concentrations converted to media concentrations and in vivo doses converted to circulating plasma concentrations; the lowest-ranking model (Model 2) used nominal in vitro concentrations and administered in vivo dose levels. Using a subset of 10 pathways from the Ring et al. data, we used PAT to predict the AUC-ROC and to compare the best-performing (Model 10) and worst-performing (Model 2) models with only 100 random forest iterations. In the PAT results, Model 10 “won” in 60% of the comparisons, a value similar to that calculated for the identical set of comparisons using the supplemental data from Ring et al. (52.2%). Hence, PAT can provide a useful alternative to programming in R for prediction modeling and model performance evaluation, even for extensive genomic data sets.
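
The abstract relies on two quantities: the AUC-ROC of a single model, and the fraction of paired comparisons in which one model "wins" over another. The sketch below illustrates how each could be computed. It is written in Python rather than R, does not train a random forest, and is not the code used by PAT or by Ring et al. The function names `auc_roc` and `win_fraction` and all of the data are hypothetical, chosen only for illustration, and counting a "win" as a strictly higher AUC is an assumption about how ties would be handled.

```python
from typing import Sequence


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def win_fraction(auc_a: Sequence[float], auc_b: Sequence[float]) -> float:
    """Fraction of paired comparisons in which model A has the strictly higher AUC."""
    if len(auc_a) != len(auc_b) or not auc_a:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(a > b for a, b in zip(auc_a, auc_b)) / len(auc_a)


# Hypothetical data: 1 = pathway active in vivo, 0 = inactive,
# with made-up predicted scores from some classifier.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
print(auc_roc(labels, scores))

# Hypothetical per-comparison AUCs for two models (not values from the paper).
model_10 = [0.71, 0.64, 0.58, 0.80]
model_2 = [0.66, 0.69, 0.52, 0.74]
print(win_fraction(model_10, model_2))
```

The pairwise form of the AUC is quadratic in the number of observations, which is acceptable for a small illustration; a rank-based formula or a library routine would be the usual choice for larger data.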

Source journal
Computational Toxicology (Computer Science: Computer Science Applications)
CiteScore: 5.50
Self-citation rate: 0.00%
Articles published: 53
Review time: 56 days
About the journal: Computational Toxicology is an international journal publishing computational approaches that assist in the toxicological evaluation and safety assessment of new and existing chemical substances.
- All effects relating to human health and environmental toxicity and fate
- Prediction of toxicity, metabolism, fate and physico-chemical properties
- The development of models from read-across, (Q)SARs, PBPK, QIVIVE, Multi-Scale Models
- Big Data in toxicology: integration, management, analysis
- Implementation of models through AOPs, IATA, TTC
- Regulatory acceptance of models: evaluation, verification and validation
- From metals, to small organic molecules, to nanoparticles
- Pharmaceuticals, pesticides, foods, cosmetics, fine chemicals
- Bringing together the views of industry, regulators, academia, NGOs