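An exploration of the use of hybrid fingerprints in Generalized Read-Across and their impact on predictive performance for selected in vivo toxicity outcomes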

IF 3.1, JCR Q2 (Toxicology)

Computational Toxicology, Volume 34, Article 100349. Pub Date: 2025-04-22. DOI: 10.1016/j.comtox.2025.100349

Aubrey Leary, Imran Shah, Grace Patlewicz


Citations: 0

Abstract



Read-across is a cost-efficient means of generating information for hazard assessment. Approaches such as Generalized Read-Across (GenRA) facilitate objective and reproducible read-across for untested substances. GenRA is a web application, and its prediction engine is also available as a Python package (genra-py). Recent updates permit source analogues to be identified using ‘hybrid’ fingerprints, i.e., analogues identified based on more than one type of similarity measure. Herein, the performance of hybrid fingerprints relative to Morgan chemical fingerprints was evaluated for a selection of acute and chronic in vivo toxicity outcomes. Grid search and cross-validation on a dataset of 5,830 chemicals with rodent acute oral toxicity (LD₅₀) values were used to tune the hybrid weight hyperparameter for up to four chemical fingerprints (Morgan, Torsion, ToxPrint and Analog Identification Methodology (AIM)). The optimal hybrid fingerprint derived (52.12% Morgan, 23.40% ToxPrint, 12.44% AIM, 12.04% Torsion) outperformed Morgan fingerprints across all 10 folds of a cross-validation procedure (mean test set coefficient of determination (R²) 0.517 (Morgan) vs. 0.557 (hybrid)). The hybrid fingerprint was then used to make toxicity predictions for two other datasets: a set of 3,266 chemicals with oral chronic human equivalent benchmark dose values (mean test set R² 0.445 vs. 0.417 for Morgan) and a set of 9,443 chemicals with acute mammalian oral hazard classifications (mean balanced accuracy (BA) 0.577 vs. 0.553 for Morgan). Overall, performance improved when using the hybrid fingerprint tuned for the acute toxicity dataset. Using the custom hybrid option in GenRA results in improved read-across predictions relative to current defaults.
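To make the idea of a hybrid fingerprint concrete, the following is a minimal Python sketch, not the genra-py implementation. It assumes that the hybrid similarity is a weighted sum of per-fingerprint Jaccard similarities and that a prediction is the similarity-weighted mean of the k nearest source analogues; neither detail is stated in the abstract. The weights are the ones reported above. The names `hybrid_similarity`, `read_across`, and `weight_grid`, the default `k=10`, and the grid spacing are hypothetical and chosen only for illustration.

```python
from itertools import product
from typing import Dict, Iterator, List, Optional, Sequence, Set, Tuple

# Fingerprint type -> set of "on" bit indices for one chemical.
Fingerprints = Dict[str, Set[int]]

# Hybrid weights reported in the abstract (they sum to 1.0).
HYBRID_WEIGHTS = {"morgan": 0.5212, "toxprint": 0.2340,
                  "aim": 0.1244, "torsion": 0.1204}


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Jaccard similarity between two sets of on-bits (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def hybrid_similarity(x: Fingerprints, y: Fingerprints,
                      weights: Dict[str, float]) -> float:
    """Assumed form of the hybrid: weighted sum of per-fingerprint similarities."""
    return sum(w * jaccard(x[name], y[name]) for name, w in weights.items())


def read_across(target: Fingerprints,
                sources: List[Tuple[Fingerprints, float]],
                weights: Dict[str, float],
                k: int = 10) -> Optional[float]:
    """Similarity-weighted mean of the k most similar source analogues."""
    scored = sorted(
        ((hybrid_similarity(target, fp, weights), value) for fp, value in sources),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in scored)
    if total == 0:
        return None  # no analogue shares any feature with the target
    return sum(sim * value for sim, value in scored) / total


def weight_grid(names: Sequence[str], steps: int = 4) -> Iterator[Dict[str, float]]:
    """Candidate weightings for a grid search: multiples of 1/steps summing to 1."""
    for combo in product(range(steps + 1), repeat=len(names)):
        if sum(combo) == steps:
            yield {name: c / steps for name, c in zip(names, combo)}
```

In a tuning loop, each dictionary yielded by `weight_grid` would be passed to `read_across` inside a cross-validation split and scored (for example by R² for continuous endpoints such as LD₅₀, or balanced accuracy for hazard classes), keeping the weighting with the best mean test score. The grid resolution actually used in the study is not given in the abstract.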


Source journal

Computational Toxicology (Computer Science: Computer Science Applications)

CiteScore: 5.50
Self-citation rate: 0.00%
Articles published: (no value given)
Review time: 56 days

About the journal: Computational Toxicology is an international journal publishing computational approaches that assist in the toxicological evaluation and safety assessment of new and existing chemical substances.
- All effects relating to human health and environmental toxicity and fate
- Prediction of toxicity, metabolism, fate and physico-chemical properties
- The development of models from read-across, (Q)SARs, PBPK, QIVIVE, Multi-Scale Models
- Big Data in toxicology: integration, management, analysis
- Implementation of models through AOPs, IATA, TTC
- Regulatory acceptance of models: evaluation, verification and validation
- From metals to small organic molecules to nanoparticles
- Pharmaceuticals, pesticides, foods, cosmetics, fine chemicals
- Bringing together the views of industry, regulators, academia, NGOs