An exploration of the use of hybrid fingerprints in Generalized Read-Across and their impact on predictive performance for selected in vivo toxicity outcomes

Impact factor: 3.1; JCR Q2 (Toxicology)
Aubrey Leary, Imran Shah, Grace Patlewicz
Computational Toxicology, volume 34, Article 100349. Published 2025-04-22.
DOI: 10.1016/j.comtox.2025.100349
https://www.sciencedirect.com/science/article/pii/S246811132500009X
Citations: 0

Abstract

Read-across is a cost-efficient means of generating information for hazard assessment. Approaches such as Generalized Read-Across (GenRA) facilitate objective and reproducible read-across for untested substances. GenRA is a web application, and its prediction engine is also available as a Python package (genra-py). Recent updates permit source analogues to be identified using ‘hybrid’ fingerprints, i.e. analogues identified based on more than one type of similarity measure. Herein, the performance of hybrid fingerprints relative to Morgan chemical fingerprints was evaluated for a selection of acute and chronic in vivo toxicity outcomes. Grid search and cross-validation on a dataset of 5,830 chemicals with rodent acute oral toxicity (LD₅₀) values were used to tune the hybrid weight hyperparameter for up to four chemical fingerprints (Morgan, Torsion, ToxPrint and Analog Identification Methodology (AIM)). The optimal hybrid fingerprint derived (52.12% Morgan, 23.40% ToxPrint, 12.44% AIM, 12.04% Torsion) outperformed Morgan fingerprints across all 10 folds of a cross-validation procedure (mean test set coefficient of determination (R²): 0.557 for the hybrid vs. 0.517 for Morgan). The hybrid fingerprint was then used to make toxicity predictions for two other datasets: a set of 3,266 chemicals with oral chronic human equivalent benchmark dose values (mean test set R² 0.445 vs. 0.417 for Morgan) and a set of 9,443 chemicals with acute mammalian oral hazard classifications (mean balanced accuracy (BA) 0.577 vs. 0.553 for Morgan). Overall, performance improved when using the hybrid fingerprint tuned for the acute toxicity dataset. Using the custom hybrid option in GenRA results in improved read-across predictions relative to current defaults.
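As a rough illustration of what a weighted hybrid fingerprint could look like in practice, the sketch below combines per-fingerprint similarities using the weights reported in the abstract and makes a similarity-weighted nearest-neighbour prediction. The abstract does not say how GenRA combines the fingerprints, so the weighted sum of Jaccard similarities and the prediction rule are assumptions, and the function names (`jaccard`, `hybrid_similarity`, `read_across`) are hypothetical and are not part of the genra-py API.

```python
import numpy as np

# Hybrid weights reported in the abstract (they sum to 1).
HYBRID_WEIGHTS = {"morgan": 0.5212, "toxprint": 0.2340, "aim": 0.1244, "torsion": 0.1204}


def jaccard(a, b):
    """Jaccard (Tanimoto) similarity between two binary fingerprint vectors."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0  # two empty fingerprints: treat as dissimilar
    return float(np.logical_and(a, b).sum() / union)


def hybrid_similarity(target, source, weights=HYBRID_WEIGHTS):
    """Weighted sum of per-fingerprint similarities (assumed combination rule)."""
    return sum(w * jaccard(target[name], source[name]) for name, w in weights.items())


def read_across(target, sources, values, k=3, weights=HYBRID_WEIGHTS):
    """Similarity-weighted mean of the k most similar source analogues (assumed rule)."""
    sims = np.array([hybrid_similarity(target, s, weights) for s in sources])
    vals = np.asarray(values, dtype=float)
    top = np.argsort(sims)[::-1][:k]  # indices of the k most similar sources
    if sims[top].sum() == 0:
        return float(vals[top].mean())  # no similarity signal: plain mean
    return float(np.average(vals[top], weights=sims[top]))


# Toy usage: each chemical is a dict of binary vectors, one per fingerprint type.
target = {"morgan": [1, 1, 0, 0], "toxprint": [1, 0], "aim": [0, 1], "torsion": [1, 1, 1]}
other = {"morgan": [0, 0, 1, 1], "toxprint": [0, 1], "aim": [1, 0], "torsion": [0, 0, 0]}
prediction = read_across(target, [target, other], [2.0, 5.0], k=2)
```

Tuning the weights, as described in the abstract, would then amount to repeating this prediction over a grid of candidate weight vectors inside a cross-validation loop and keeping the vector with the best mean test score.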
Source journal
Computational Toxicology (Computer Science: Computer Science Applications)
CiteScore: 5.50
Self-citation rate: 0.00%
Articles published: 53
Review time: 56 days
Journal description: Computational Toxicology is an international journal publishing computational approaches that assist in the toxicological evaluation and safety assessment of new and existing chemical substances.
-All effects relating to human health and environmental toxicity and fate
-Prediction of toxicity, metabolism, fate and physico-chemical properties
-The development of models from read-across, (Q)SARs, PBPK, QIVIVE, Multi-Scale Models
-Big Data in toxicology: integration, management, analysis
-Implementation of models through AOPs, IATA, TTC
-Regulatory acceptance of models: evaluation, verification and validation
-From metals, to small organic molecules to nanoparticles
-Pharmaceuticals, pesticides, foods, cosmetics, fine chemicals
-Bringing together the views of industry, regulators, academia, NGOs