Automated Mass Extraction of Over 680,000 PICOs from Clinical Study Abstracts Using Generative AI: A Proof-of-Concept Study.

IF 4.5 Q2 PHARMACOLOGY & PHARMACY

Pharmaceutical Medicine Pub Date : 2024-09-01 Epub Date: 2024-09-26 DOI:10.1007/s40290-024-00539-6

Tim Reason, Julia Langham, Andy Gimblett

{"title":"Automated Mass Extraction of Over 680,000 PICOs from Clinical Study Abstracts Using Generative AI: A Proof-of-Concept Study.","authors":"Tim Reason, Julia Langham, Andy Gimblett","doi":"10.1007/s40290-024-00539-6","DOIUrl":null,"url":null,"abstract":"Background: Generative artificial intelligence (GenAI) shows promise in automating key tasks involved in conducting systematic literature reviews (SLRs), including screening, bias assessment and data extraction. This potential automation is increasingly relevant as pharmaceutical developers face challenging requirements for timely and precise SLRs using the population, intervention, comparator and outcome (PICO) framework, such as those under the impending European Union (EU) Health Technology Assessment Regulation 2021/2282 (HTAR). This proof-of-concept study aimed to evaluate the feasibility, accuracy and efficiency of using GenAI for mass extraction of PICOs from PubMed abstracts.Methods: Abstracts were retrieved from PubMed using a search string targeting randomised controlled trials. A PubMed clinical study 'specific/narrow' filter was also applied. Retrieved abstracts were processed using the OpenAI Batch application programming interface (API), which allowed parallel processing and interaction with Generative Pre-trained Transformer 4 Omni (GPT-4o) via custom Python scripts. PICO elements were extracted using a zero-shot prompting strategy. Results were stored in CSV files and subsequently imported into a PostgreSQL database.Results: The PubMed search returned 682,667 abstracts. PICOs from all abstracts were extracted in < 3 h, with an average processing time of 200 s per 1000 abstracts. A total of 395,992,770 tokens were processed, with an average of 580 tokens per abstract. The total cost was $3390. On the basis of a random sample of 350 abstracts, human verification confirmed that GPT-4o accurately and comprehensively extracted 342 (98%) of all PICOs, with only outcome elements rarely missed.Conclusions: Using GenAI to extract PICOs from clinical study abstracts could fundamentally transform the way SLRs are conducted. By enabling pharmaceutical developers to anticipate PICO requirements, this approach allows for proactive preparation for the EU HTAR process, or other health technology assessments (HTAs), streamlining efficiency and reducing the burden of meeting these requirements.","PeriodicalId":19778,"journal":{"name":"Pharmaceutical Medicine","volume":" ","pages":"365-372"},"PeriodicalIF":4.5000,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11473607/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Pharmaceutical Medicine","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s40290-024-00539-6","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/9/26 0:00:00","PubModel":"Epub","JCR":"Q2","JCRName":"PHARMACOLOGY & PHARMACY","Score":null,"Total":0}

引用次数: 0

Abstract

Background: Generative artificial intelligence (GenAI) shows promise in automating key tasks involved in conducting systematic literature reviews (SLRs), including screening, bias assessment and data extraction. This potential automation is increasingly relevant as pharmaceutical developers face challenging requirements for timely and precise SLRs using the population, intervention, comparator and outcome (PICO) framework, such as those under the impending European Union (EU) Health Technology Assessment Regulation 2021/2282 (HTAR). This proof-of-concept study aimed to evaluate the feasibility, accuracy and efficiency of using GenAI for mass extraction of PICOs from PubMed abstracts.

Methods: Abstracts were retrieved from PubMed using a search string targeting randomised controlled trials. A PubMed clinical study 'specific/narrow' filter was also applied. Retrieved abstracts were processed using the OpenAI Batch application programming interface (API), which allowed parallel processing and interaction with Generative Pre-trained Transformer 4 Omni (GPT-4o) via custom Python scripts. PICO elements were extracted using a zero-shot prompting strategy. Results were stored in CSV files and subsequently imported into a PostgreSQL database.

Results: The PubMed search returned 682,667 abstracts. PICOs from all abstracts were extracted in < 3 h, with an average processing time of 200 s per 1000 abstracts. A total of 395,992,770 tokens were processed, with an average of 580 tokens per abstract. The total cost was $3390. On the basis of a random sample of 350 abstracts, human verification confirmed that GPT-4o accurately and comprehensively extracted 342 (98%) of all PICOs, with only outcome elements rarely missed.

Conclusions: Using GenAI to extract PICOs from clinical study abstracts could fundamentally transform the way SLRs are conducted. By enabling pharmaceutical developers to anticipate PICO requirements, this approach allows for proactive preparation for the EU HTAR process, or other health technology assessments (HTAs), streamlining efficiency and reducing the burden of meeting these requirements.

Abstract Image

查看原文本刊更多论文

使用生成式人工智能从临床研究摘要中大量自动提取 68 万多条 PICO：概念验证研究。

背景：生成式人工智能（GenAI）有望实现系统性文献综述（SLR）关键任务的自动化，包括筛选、偏倚评估和数据提取。这种潜在的自动化越来越具有现实意义，因为药品开发商面临着使用人群、干预措施、参照物和结果（PICO）框架进行及时、精确的系统文献综述的挑战性要求，例如即将实施的欧盟（EU）《健康技术评估条例 2021/2282 》（HTAR）所规定的要求。这项概念验证研究旨在评估使用 GenAI 从 PubMed 摘要中大量提取 PICO 的可行性、准确性和效率：方法：使用针对随机对照试验的搜索字符串从 PubMed 中检索摘要。还应用了 PubMed 临床研究 "特定/狭窄 "过滤器。检索到的摘要使用 OpenAI Batch 应用程序编程接口 (API) 进行处理，该接口允许并行处理，并通过自定义 Python 脚本与 Generative Pre-trained Transformer 4 Omni (GPT-4o) 进行交互。PICO 元素的提取采用了零点提示策略。结果存储在 CSV 文件中，随后导入 PostgreSQL 数据库：PubMed检索返回了682,667篇摘要。从所有摘要中提取 PICOs 的时间小于 3 小时，平均每 1000 份摘要的处理时间为 200 秒。共处理了 395,992,770 个标记，平均每个摘要 580 个标记。总成本为 3390 美元。在随机抽样 350 篇摘要的基础上，人工验证证实 GPT-4o 准确而全面地提取了所有 PICO 中的 342 个（98%），只有结果元素极少遗漏：使用 GenAI 从临床研究摘要中提取 PICO 可以从根本上改变 SLR 的开展方式。这种方法使药品开发人员能够预测 PICO 要求，从而为欧盟 HTAR 流程或其他健康技术评估 (HTA) 做好积极准备，提高效率并减轻满足这些要求的负担。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Pharmaceutical Medicine PHARMACOLOGY & PHARMACY-

CiteScore

5.10

自引率

4.00%

发文量

期刊介绍： Pharmaceutical Medicine is a specialist discipline concerned with medical aspects of the discovery, development, evaluation, registration, regulation, monitoring, marketing, distribution and pricing of medicines, drug-device and drug-diagnostic combinations. The Journal disseminates information to support the community of professionals working in these highly inter-related functions. Key areas include translational medicine, clinical trial design, pharmacovigilance, clinical toxicology, drug regulation, clinical pharmacology, biostatistics and pharmacoeconomics. The Journal includes:Overviews of contentious or emerging issues.Comprehensive narrative reviews that provide an authoritative source of information on topical issues.Systematic reviews that collate empirical evidence to answer a specific research question, using explicit, systematic methods as outlined by PRISMA statement.Original research articles reporting the results of well-designed studies with a strong link to wider areas of clinical research.Additional digital features (including animated abstracts, video abstracts, slide decks, audio slides, instructional videos, infographics, podcasts and animations) can be published with articles; these are designed to increase the visibility, readership and educational value of the journal’s content. In addition, articles published in Pharmaceutical Medicine may be accompanied by plain language summaries to assist readers who have some knowledge of, but not in-depth expertise in, the area to understand important medical advances.All manuscripts are subject to peer review by international experts. Letters to the Editor are welcomed and will be considered for publication.