PretoxTM: a text mining system for extracting treatment-related findings from preclinical toxicology reports

IF 5.7 2区化学 Q1 CHEMISTRY, MULTIDISCIPLINARY

Journal of Cheminformatics Pub Date : 2025-02-03 DOI:10.1186/s13321-024-00925-x

Javier Corvi, Nicolás Díaz-Roussel, José M. Fernández, Francesco Ronzano, Emilio Centeno, Pablo Accuosto, Celine Ibrahim, Shoji Asakura, Frank Bringezu, Mirjam Fröhlicher, Annika Kreuchwig, Yoko Nogami, Jeong Rih, Raul Rodriguez-Esteban, Nicolas Sajot, Joerg Wichard, Heng-Yi Michael Wu, Philip Drew, Thomas Steger-Hartmann, Alfonso Valencia, Laura I. Furlong, Salvador Capella-Gutierrez

{"title":"PretoxTM: a text mining system for extracting treatment-related findings from preclinical toxicology reports","authors":"Javier Corvi, Nicolás Díaz-Roussel, José M. Fernández, Francesco Ronzano, Emilio Centeno, Pablo Accuosto, Celine Ibrahim, Shoji Asakura, Frank Bringezu, Mirjam Fröhlicher, Annika Kreuchwig, Yoko Nogami, Jeong Rih, Raul Rodriguez-Esteban, Nicolas Sajot, Joerg Wichard, Heng-Yi Michael Wu, Philip Drew, Thomas Steger-Hartmann, Alfonso Valencia, Laura I. Furlong, Salvador Capella-Gutierrez","doi":"10.1186/s13321-024-00925-x","DOIUrl":null,"url":null,"abstract":"<div><p>Over the last few decades the pharmaceutical industry has generated a vast corpus of knowledge on the safety and efficacy of drugs. Much of this information is contained in toxicology reports, which summarise the results of animal studies designed to analyse the effects of the tested compound, including unintended pharmacological and toxic effects, known as treatment-related findings. Despite the potential of this knowledge, the fact that most of this relevant information is only available as unstructured text with variable degrees of digitisation has hampered its systematic access, use and exploitation. Text mining technologies have the ability to automatically extract, analyse and aggregate such information, providing valuable new insights into the drug discovery and development process. In the context of the eTRANSAFE project, we present PretoxTM (Preclinical Toxicology Text Mining), the first system specifically designed to detect, extract, organise and visualise treatment-related findings from toxicology reports. The PretoxTM tool comprises three main components: PretoxTM Corpus, PretoxTM Pipeline and PretoxTM Web App. The PretoxTM Corpus is a gold standard corpus of preclinical treatment-related findings annotated by toxicology experts. This corpus was used to develop, train and validate the PretoxTM Pipeline, which extracts treatment-related findings from preclinical study reports. The extracted information is then presented for expert visualisation and validation in the PretoxTM Web App.</p><p><b>Scientific Contribution</b></p><p>While text mining solutions have been widely used in the clinical domain to identify adverse drug reactions from various sources, no similar systems exist for identifying adverse events in animal models during preclinical testing. PretoxTM fills this gap by efficiently extracting treatment-related findings from preclinical toxicology reports. This provides a valuable resource for toxicology research, enhancing the efficiency of safety evaluations, saving time, and leading to more effective decision-making in the drug development process.</p></div>","PeriodicalId":617,"journal":{"name":"Journal of Cheminformatics","volume":"17 1","pages":""},"PeriodicalIF":5.7000,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-024-00925-x","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Cheminformatics","FirstCategoryId":"92","ListUrlMain":"https://link.springer.com/article/10.1186/s13321-024-00925-x","RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"CHEMISTRY, MULTIDISCIPLINARY","Score":null,"Total":0}

引用次数: 0

Abstract

Over the last few decades the pharmaceutical industry has generated a vast corpus of knowledge on the safety and efficacy of drugs. Much of this information is contained in toxicology reports, which summarise the results of animal studies designed to analyse the effects of the tested compound, including unintended pharmacological and toxic effects, known as treatment-related findings. Despite the potential of this knowledge, the fact that most of this relevant information is only available as unstructured text with variable degrees of digitisation has hampered its systematic access, use and exploitation. Text mining technologies have the ability to automatically extract, analyse and aggregate such information, providing valuable new insights into the drug discovery and development process. In the context of the eTRANSAFE project, we present PretoxTM (Preclinical Toxicology Text Mining), the first system specifically designed to detect, extract, organise and visualise treatment-related findings from toxicology reports. The PretoxTM tool comprises three main components: PretoxTM Corpus, PretoxTM Pipeline and PretoxTM Web App. The PretoxTM Corpus is a gold standard corpus of preclinical treatment-related findings annotated by toxicology experts. This corpus was used to develop, train and validate the PretoxTM Pipeline, which extracts treatment-related findings from preclinical study reports. The extracted information is then presented for expert visualisation and validation in the PretoxTM Web App.

Scientific Contribution

While text mining solutions have been widely used in the clinical domain to identify adverse drug reactions from various sources, no similar systems exist for identifying adverse events in animal models during preclinical testing. PretoxTM fills this gap by efficiently extracting treatment-related findings from preclinical toxicology reports. This provides a valuable resource for toxicology research, enhancing the efficiency of safety evaluations, saving time, and leading to more effective decision-making in the drug development process.

查看原文本刊更多论文

PretoxTM：从临床前毒理学报告中提取治疗相关发现的文本挖掘系统

在过去的几十年里，制药行业已经产生了大量关于药物安全性和有效性的知识。这些信息大多包含在毒理学报告中，这些报告总结了动物研究的结果，旨在分析所测试化合物的影响，包括意想不到的药理和毒性作用，即与治疗相关的发现。尽管这些知识具有潜力，但大多数相关信息只能作为非结构化文本以不同程度的数字化提供，这一事实阻碍了其系统访问、使用和开发。文本挖掘技术能够自动提取、分析和汇总这些信息，为药物发现和开发过程提供有价值的新见解。在eTRANSAFE项目的背景下，我们提出了PretoxTM（临床前毒理学文本挖掘），这是第一个专门用于从毒理学报告中检测、提取、组织和可视化治疗相关发现的系统。PretoxTM工具包括三个主要组成部分：PretoxTM语料库、PretoxTM管道和PretoxTM Web App。PretoxTM语料库是由毒理学专家注释的临床前治疗相关发现的金标准语料库。该语料库用于开发、培训和验证PretoxTM管道，该管道从临床前研究报告中提取与治疗相关的发现。提取的信息然后在PretoxTM Web应用程序中呈现给专家可视化和验证。科学贡献虽然文本挖掘解决方案已广泛用于临床领域，以识别来自各种来源的药物不良反应，但在临床前测试期间尚无类似的系统用于识别动物模型中的不良事件。PretoxTM通过有效地从临床前毒理学报告中提取与治疗相关的发现来填补这一空白。这为毒理学研究提供了宝贵的资源，提高了安全性评估的效率，节省了时间，并导致药物开发过程中更有效的决策。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Journal of Cheminformatics CHEMISTRY, MULTIDISCIPLINARY-COMPUTER SCIENCE, INFORMATION SYSTEMS

CiteScore

14.10

自引率

7.00%

发文量

审稿时长

3 months

期刊介绍： Journal of Cheminformatics is an open access journal publishing original peer-reviewed research in all aspects of cheminformatics and molecular modelling. Coverage includes, but is not limited to: chemical information systems, software and databases, and molecular modelling, chemical structure representations and their use in structure, substructure, and similarity searching of chemical substance and chemical reaction databases, computer and molecular graphics, computer-aided molecular design, expert systems, QSAR, and data mining techniques.