Semantic Search of FDA Guidance Documents Using Generative AI.

IF 1.9 | CAS Zone 4 (Medicine) | JCR Q4, MEDICAL INFORMATICS
Scott Proestel, Linda J B Jeng, Christopher Smith, Matthew Deady, Omar Amer, Mohamed Ahmed, Sarah Rodgers
{"title":"Semantic Search of FDA Guidance Documents Using Generative AI.","authors":"Scott Proestel, Linda J B Jeng, Christopher Smith, Matthew Deady, Omar Amer, Mohamed Ahmed, Sarah Rodgers","doi":"10.1007/s43441-025-00798-8","DOIUrl":null,"url":null,"abstract":"<p><strong>Introduction: </strong>Generative artificial intelligence (AI) has the potential to transform and accelerate how information is accessed during the regulation of human drug and biologic products.</p><p><strong>Objectives: </strong>Determine whether a generative AI-supported application with retrieval-augmented generation (RAG) architecture can be used to correctly answer questions about the information contained in FDA guidance documents.</p><p><strong>Methods: </strong>Five large language models (LLMs): Flan-UL2, GPT-3.5 Turbo, GPT-4 Turbo, Granite, and Llama 2, were evaluated in conjunction with the RAG application Golden Retriever to assess their ability to answer questions about the information contained in clinically oriented FDA guidance documents. Models were configured to precise mode with a low temperature parameter setting to generate precise, non-creative answers, ensuring reliable clinical regulatory review guidance for users.</p><p><strong>Results: </strong>During preliminary testing, GPT-4 Turbo was the highest performing LLM. It was therefore selected for additional evaluation where it generated a correct response with additional helpful information 33.9% of the time, a correct response 35.7% of the time, a response with some of the required correct information 17.0% of the time, and a response with any incorrect information 13.4% of the time. The RAG application was able to cite the correct source document 89.2% of the time.</p><p><strong>Conclusion: </strong>The ability of the generative AI application to identify the correct guidance document and answer questions could significantly reduce the time in finding the correct answer for questions about FDA guidance documents. However, as the information in FDA guidance documents may be relied on by sponsors and FDA staff to guide important drug development decisions, the use of incorrect information could have a significantly negative impact on the drug development process. Based on our results, the correct citation documents can be used to reduce the time in finding the correct document that contains the information, but further research into the refinement of generative AI will likely be required before this tool can be relied on to answer questions about information contained in FDA guidance documents. 
Rephrasing questions by including additional context information, reconfiguring the embedding and chunking parameters, and other prompt engineering techniques may improve the rate of fully correct and complete responses.</p>","PeriodicalId":23084,"journal":{"name":"Therapeutic innovation & regulatory science","volume":" ","pages":"1148-1159"},"PeriodicalIF":1.9000,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12446095/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Therapeutic innovation & regulatory science","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1007/s43441-025-00798-8","RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/6/14 0:00:00","PubModel":"Epub","JCR":"Q4","JCRName":"MEDICAL INFORMATICS","Score":null,"Total":0}
Citations: 0

Abstract

Introduction: Generative artificial intelligence (AI) has the potential to transform and accelerate how information is accessed during the regulation of human drug and biologic products.

Objectives: Determine whether a generative AI-supported application with retrieval-augmented generation (RAG) architecture can be used to correctly answer questions about the information contained in FDA guidance documents.

Methods: Five large language models (LLMs), Flan-UL2, GPT-3.5 Turbo, GPT-4 Turbo, Granite, and Llama 2, were evaluated in conjunction with the RAG application Golden Retriever to assess their ability to answer questions about the information contained in clinically oriented FDA guidance documents. Models were configured in precise mode with a low temperature setting to generate precise, non-creative answers, with the aim of providing reliable clinical regulatory review guidance for users.
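As an illustration only (the paper does not publish the Golden Retriever implementation), the following minimal sketch shows the general RAG pattern described here: embed the guidance-document chunks, retrieve the chunks closest to a question, and pass them to an LLM with a low temperature so the answer stays grounded in the retrieved text. The embed() and generate() calls are hypothetical placeholders for whichever embedding model and LLM endpoint (for example, GPT-4 Turbo) is used.

```python
# Minimal RAG sketch (illustrative only; not the Golden Retriever implementation).
# embed() and generate() are hypothetical placeholders for an embedding model
# and an LLM endpoint such as GPT-4 Turbo.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder: return a vector embedding for `text`."""
    raise NotImplementedError

def generate(prompt: str, temperature: float = 0.0) -> str:
    """Placeholder: call the LLM with a low temperature for precise, non-creative answers."""
    raise NotImplementedError

def answer_question(question: str, chunks: list[str], top_k: int = 3) -> str:
    # Embed the question and every guidance-document chunk.
    q_vec = embed(question)
    chunk_vecs = [embed(c) for c in chunks]

    # Rank chunks by cosine similarity to the question.
    def cosine(a: np.ndarray, b: np.ndarray) -> float:
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    ranked = sorted(range(len(chunks)),
                    key=lambda i: cosine(q_vec, chunk_vecs[i]),
                    reverse=True)

    # Build a grounded prompt from the top-k retrieved chunks and ask the LLM.
    context = "\n\n".join(chunks[i] for i in ranked[:top_k])
    prompt = (
        "Answer the question using only the FDA guidance excerpts below. "
        "Cite the source document.\n\n"
        f"Excerpts:\n{context}\n\nQuestion: {question}"
    )
    return generate(prompt, temperature=0.0)
```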

Results: During preliminary testing, GPT-4 Turbo was the highest-performing LLM. It was therefore selected for additional evaluation, in which it generated a correct response with additional helpful information 33.9% of the time, a correct response 35.7% of the time, a response with some of the required correct information 17.0% of the time, and a response with any incorrect information 13.4% of the time. The RAG application cited the correct source document 89.2% of the time.
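For context, the four response categories reported above are exhaustive and mutually exclusive (they sum to 100%), so fully correct responses, with or without additional helpful information, account for roughly 69.6% of answers. The short check below uses only the percentages reported in the paper.

```python
# Quick check of the reported GPT-4 Turbo response categories (percentages from the paper).
correct_plus_helpful = 33.9
correct = 35.7
partially_correct = 17.0
any_incorrect = 13.4

total = correct_plus_helpful + correct + partially_correct + any_incorrect
fully_correct = correct_plus_helpful + correct

print(f"Total across categories: {total:.1f}%")       # 100.0%
print(f"Fully correct responses: {fully_correct:.1f}%")  # 69.6%
```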

Conclusion: The ability of the generative AI application to identify the correct guidance document and answer questions could significantly reduce the time needed to find correct answers to questions about FDA guidance documents. However, because sponsors and FDA staff may rely on the information in FDA guidance documents to guide important drug development decisions, the use of incorrect information could have a significantly negative impact on the drug development process. Based on our results, the cited source documents can be used to reduce the time needed to find the document that contains the relevant information, but further refinement of generative AI will likely be required before this tool can be relied on to answer questions about the information contained in FDA guidance documents. Rephrasing questions to include additional context, reconfiguring the embedding and chunking parameters, and other prompt engineering techniques may improve the rate of fully correct and complete responses.
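The paper does not specify the embedding or chunking settings; as a hedged sketch of the kind of reconfiguration mentioned above, the snippet below shows overlapping fixed-size chunking and a question template that adds context, two common tuning knobs in RAG pipelines. The chunk_size, overlap, and template wording are illustrative assumptions, not values from the study.

```python
# Illustrative RAG tuning knobs (assumed values, not those used in the study).

def chunk_document(text: str, chunk_size: int = 800, overlap: int = 200) -> list[str]:
    """Split a guidance document into overlapping fixed-size chunks.

    Smaller chunks tend to sharpen retrieval; overlap reduces the chance that
    an answer is split across a chunk boundary.
    """
    chunks = []
    step = chunk_size - overlap
    for start in range(0, max(len(text) - overlap, 1), step):
        chunks.append(text[start:start + chunk_size])
    return chunks

def add_context(question: str, topic: str) -> str:
    """Rephrase a question with extra context, one of the prompt-engineering
    techniques the authors suggest may improve answer completeness."""
    return (
        f"In the context of FDA guidance on {topic}, {question} "
        "If the guidance does not address this, say so explicitly."
    )
```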


Source Journal
Therapeutic Innovation & Regulatory Science (MEDICAL INFORMATICS; PHARMACOLOGY & PHARMACY)
CiteScore: 3.40
Self-citation rate: 13.30%
Articles per year: 127
Journal description: Therapeutic Innovation & Regulatory Science (TIRS) is the official scientific journal of DIA that strives to advance medical product discovery, development, regulation, and use through the publication of peer-reviewed original and review articles, commentaries, and letters to the editor across the spectrum of converting biomedical science into practical solutions to advance human health. The focus areas of the journal are as follows: biostatistics; clinical trials; product development and innovation; global perspectives; policy; regulatory science; product safety; special populations.