Artificial intelligence-based data extraction for next generation risk assessment: Is fine-tuning of a large language model worth the effort?

Impact factor 4.8; Zone 3 (Medicine); JCR Q1 (Pharmacology & Pharmacy)
Journal: Toxicology. Publication date: 2024-08-23. Publication type: Journal Article.
DOI: 10.1016/j.tox.2024.153933
Full text: https://www.sciencedirect.com/science/article/pii/S0300483X24002142

Abstract

To underpin scientific evaluations of chemical risks, agencies such as the European Food Safety Authority (EFSA) rely heavily on the outcomes of systematic reviews, which currently require extensive manual effort. One specific challenge is the meaningful use of the vast amounts of valuable data from new approach methodologies (NAMs), which are mostly reported in an unstructured way in the scientific literature. The EFSA-initiated project ‘AI4NAMS’ explored the potential of large language models (LLMs). Models from the GPT (Generative Pre-trained Transformer) family were used for searching, extracting, and integrating data from scientific publications for NAM-based risk assessment. A case study on bisphenol A (BPA), a substance of very high concern due to its adverse effects on human health, focused on the structured extraction of information on test systems measuring biological activities of BPA. Fine-tuning of a GPT-3 model (Curie base model) for extraction tasks was tested, and the performance of the fine-tuned model was compared with that of a ready-to-use model (text-davinci-002). To update the findings from the AI4NAMS project and to check for technical progress, the fine-tuning exercise was repeated, with a newer ready-to-use model (text-davinci-003) serving as the comparison. In both cases, the fine-tuned Curie model was found to be superior to the ready-to-use model. Performance also improved clearly from text-davinci-002 to the newer text-davinci-003. Our findings demonstrate how fine-tuning and the swift general technical development improve model performance, and they contribute to the growing number of investigations on the use of AI in scientific and regulatory tasks.
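
To make the fine-tuning step more concrete, the following is a minimal Python sketch of how annotated passages might be turned into prompt/completion pairs in JSON Lines form, the layout used by the legacy GPT-3 fine-tuning workflow. The field names (test_system, endpoint), the separator, the stop marker, and the example passage are all assumptions made for illustration; the abstract does not describe the authors' actual training data or output schema.

```python
import json

# Hypothetical conventions, not taken from the paper: a fixed separator that
# ends every prompt and a stop marker that ends every completion.
PROMPT_SEPARATOR = "\n\n###\n\n"
STOP_MARKER = " END"


def make_record(passage: str, fields: dict) -> dict:
    """Turn one annotated passage into a prompt/completion training pair."""
    completion = json.dumps(fields, sort_keys=True)
    return {
        "prompt": passage.strip() + PROMPT_SEPARATOR,
        "completion": " " + completion + STOP_MARKER,
    }


def to_jsonl(records: list) -> str:
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


def parse_completion(text: str) -> dict:
    """Recover the structured fields from a completion string."""
    body = text.strip()
    marker = STOP_MARKER.strip()
    if body.endswith(marker):
        body = body[: -len(marker)]
    return json.loads(body)


# Invented example passage and labels, for illustration only.
example_fields = {"test_system": "MCF-7 cells", "endpoint": "proliferation"}
example = make_record(
    "MCF-7 cells were exposed to BPA and proliferation was measured.",
    example_fields,
)
training_file_text = to_jsonl([example])
```

Emitting the target fields as a small JSON object keeps the model output machine-readable, so the same parse_completion helper could be applied to outputs from either a fine-tuned or a ready-to-use model before comparing them against manual annotations.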

Source journal: Toxicology (Medicine, Toxicology)
CiteScore: 7.80
Self-citation rate: 4.40%
Articles published: 222
Review time: 23 days
About the journal: Toxicology is an international, peer-reviewed journal that publishes only the highest quality original scientific research and critical reviews describing hypothesis-based investigations into mechanisms of toxicity associated with exposures to xenobiotic chemicals, particularly as they relate to human health. In this respect, "mechanisms" is defined on both the macro (e.g. physiological, biological, kinetic, species, sex, etc.) and molecular (genomic, transcriptomic, metabolic, etc.) scales. Emphasis is placed on findings that identify novel hazards and that can be extrapolated to exposures and mechanisms that are relevant to estimating human risk. Toxicology also publishes brief communications, personal commentaries and opinion articles, as well as concise expert reviews on contemporary topics. All research and review articles published in Toxicology are subject to rigorous peer review. Authors are asked to contact the Editor-in-Chief prior to submitting review articles or commentaries for consideration for publication in Toxicology.