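Development of a CSRML version of the Analog identification Methodology (AIM) fragments and their evaluation within the Generalised Read-Across (GenRA) approach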

Impact factor: 3.1 | JCR: Q2 (Toxicology)
Matthew Adams, Hannah Hidle, Daniel Chang, Ann M. Richard, Antony J. Williams, Imran Shah, Grace Patlewicz
Journal: Computational Toxicology
Publication date: 2023-02-01
Publication type: Journal Article
DOI: 10.1016/j.comtox.2022.100256
Publisher page: https://www.sciencedirect.com/science/article/pii/S2468111322000445
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9888031/pdf/
Citations: 1

Abstract

Development of a CSRML version of the Analog identification Methodology (AIM) fragments and their evaluation within the Generalised Read-Across (GenRA) approach

The Analog Identification Methodology (AIM) was developed over 20 years ago to identify analogues to support read-across at the US Environmental Protection Agency (EPA). However, the current public version of the standalone tool, released in 2012, is no longer usable on Windows operating systems supported by Microsoft. Additionally, the structural logic for analogue selection is based on older, customised Simplified Molecular-Input Line-Entry System (SMILES)-type features that are incompatible with modern cheminformatics tools. Given these limitations, a case study was undertaken to explore a more transparent, extensible method of implementing the AIM fragments using Chemical Subgraphs and Reactions Mark-up Language (CSRML). A CSRML file was developed to codify the original AIM fragments, and the extent to which the AIM fragments were faithfully replicated was assessed using the AIM Database. The overall mean performance of CSRML-AIM across all fragments in terms of sensitivity, specificity, and Jaccard similarity was 89.5%, 99.9%, and 82.2%, respectively. A comparison of the AIM fragments with public ToxPrints, using a large set of ∼25,000 substances of regulatory interest to EPA, found them to be dissimilar, with an average maximum Jaccard score of 0.24 for AIM and 0.29 for ToxPrint fingerprints. Both fragment sets were then used as inputs to the automated read-across approach, Generalised Read-Across (GenRA), to evaluate the quality of fit in predicting rat acute oral toxicity LD50 values using the coefficient of determination (R²) and root mean squared error (RMSE). The performance of AIM fragments was R² = 0.434 and RMSE = 0.663, whereas that of ToxPrints was R² = 0.477 and RMSE = 0.638. Bootstrap resampling with 100 iterations found the mean and 95% confidence interval of R² to be 0.349 [0.319, 0.379] for AIM fragments and 0.377 [0.338, 0.412] for ToxPrints. Although AIM and ToxPrints performed similarly in predicting LD50, they differed in their performance at a local level, revealing that their features can offer complementary insights.
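The metrics named in the abstract can be illustrated with a short, standard-library Python sketch. It is not the authors' code: the function names (`fragment_agreement`, `r_squared`, `rmse`, `bootstrap_r2`) are hypothetical, and the bootstrap here simply resamples observed/predicted pairs, which is an assumption; the abstract does not say what was resampled or how the interval was derived.

```python
import math
import random
from typing import Sequence


def fragment_agreement(reference: Sequence[int], candidate: Sequence[int]) -> dict:
    """Compare two binary hit vectors for one fragment over the same chemicals.

    `reference` could hold the original AIM assignments and `candidate`
    the CSRML-based assignments (1 = fragment present, 0 = absent).
    """
    if len(reference) != len(candidate):
        raise ValueError("vectors must cover the same chemicals")
    pairs = list(zip(reference, candidate))
    tp = sum(1 for r, c in pairs if r and c)
    fn = sum(1 for r, c in pairs if r and not c)
    fp = sum(1 for r, c in pairs if c and not r)
    tn = len(pairs) - tp - fn - fp

    def ratio(num: int, den: int) -> float:
        return num / den if den else math.nan

    return {
        "sensitivity": ratio(tp, tp + fn),   # hits recovered
        "specificity": ratio(tn, tn + fp),   # non-hits preserved
        "jaccard": ratio(tp, tp + fp + fn),  # overlap of the two hit sets
    }


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1 - ss_res / ss_tot


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def bootstrap_r2(observed, predicted, n_iter=100, seed=0):
    """Mean and crude 2.5/97.5 percentile bounds of R² over resampled pairs.

    Assumption: pairs are resampled with replacement; degenerate
    resamples (all observed values identical) are redrawn.
    """
    if len(set(observed)) < 2:
        raise ValueError("need at least two distinct observed values")
    rng = random.Random(seed)
    n = len(observed)
    scores = []
    while len(scores) < n_iter:
        idx = [rng.randrange(n) for _ in range(n)]
        obs = [observed[i] for i in idx]
        if len(set(obs)) < 2:
            continue
        pred = [predicted[i] for i in idx]
        scores.append(r_squared(obs, pred))
    scores.sort()
    lower = scores[int(0.025 * (n_iter - 1))]
    upper = scores[int(0.975 * (n_iter - 1))]
    return sum(scores) / n_iter, lower, upper
```

Calling `fragment_agreement` once per fragment and averaging the three values would mirror the "overall mean performance across all fragments" reported above. `bootstrap_r2(observed_ld50, predicted_ld50, n_iter=100)` returns a `(mean, lower, upper)` tuple in the same shape as the bracketed intervals in the abstract.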

Source journal
Computational Toxicology
Subject area: Computer Science - Computer Science Applications
CiteScore: 5.50
Self-citation rate: 0.00%
Articles published: 53
Review time: 56 days
Journal description: Computational Toxicology is an international journal publishing computational approaches that assist in the toxicological evaluation and safety assessment of new and existing chemical substances. Its scope covers:
- All effects relating to human health and environmental toxicity and fate
- Prediction of toxicity, metabolism, fate and physico-chemical properties
- The development of models from read-across, (Q)SARs, PBPK, QIVIVE, Multi-Scale Models
- Big Data in toxicology: integration, management, analysis
- Implementation of models through AOPs, IATA, TTC
- Regulatory acceptance of models: evaluation, verification and validation
- From metals, to small organic molecules to nanoparticles
- Pharmaceuticals, pesticides, foods, cosmetics, fine chemicals
- Bringing together the views of industry, regulators, academia, NGOs