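Feasibility of Artificial Intelligence Powered Adverse Event Analysis: Using a Large Language Model to Analyze Microwave Ablation Malfunction Data.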

IF 2.9 | Zone 3, Medicine | Q2 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING

Canadian Association of Radiologists Journal-Journal De L Association Canadienne Des Radiologistes Pub Date : 2025-02-01 Epub Date: 2024-08-21 DOI:10.1177/08465371241269436

Blair E Warren, Fahd Alkhalifah, Aida Ahrari, Adam Min, Aly Fawzy, Ganesan Annamalai, Arash Jaberi, Robert Beecroft, John R Kachura, Sebastian C Mafeld


Citations: 0

Abstract


Feasibility of Artificial Intelligence Powered Adverse Event Analysis: Using a Large Language Model to Analyze Microwave Ablation Malfunction Data.

Objectives: Determine whether a large language model (LLM, GPT-4) can label, consolidate, and analyze interventional radiology (IR) microwave ablation device safety event data and produce meaningful summaries similar to those of human readers.

Methods: Microwave ablation safety data from January 1, 2011 to October 31, 2023 were collected, and the type of failure was categorized by human readers. The data were then classified using GPT-4 with iterative prompt development. Iterative summarization of the reports was performed using GPT-4 to generate a final summary of the large text corpus.

Results: The data were split into training (n = 25), validation (n = 639), and test (n = 79) sets to reflect real-world deployment of an LLM for this task. GPT-4 demonstrated high accuracy on the multiclass classification of microwave ablation device data (accuracy [95% CI]: training 96.0% [79.7, 99.9], validation 86.4% [83.5, 89.0], test 87.3% [78.0, 93.8]). The text content was distilled through GPT-4 and iterative summarization prompts. The final summary reflected the clinically relevant insights that human interpretation drew from the microwave ablation data but contained inaccurate event class counts.

Conclusion: The LLM emulated the human analysis, suggesting that LLMs are feasible as a tool for clinicians to process large volumes of IR safety data. It accurately labelled microwave ablation device event data by type of malfunction through few-shot learning. Content distillation was used to analyze a large text corpus (>650 reports) and generate an insightful summary similar to the human interpretation.
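The accuracy figures in the abstract are binomial proportions with 95% confidence intervals. The abstract does not say which interval method was used; the sketch below assumes an exact (Clopper-Pearson) interval, which appears to give values close to the reported training interval. The correct-label counts (24 of 25, 552 of 639, 69 of 79) are not stated in the source; they are back-calculated here from the reported percentages and sample sizes, and the function names are illustrative only.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if k < 0:
        return 0.0
    if k >= n or p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1)
                    - math.lgamma(n - i + 1) + i * log_p + (n - i) * log_q)
        total += math.exp(log_term)
    return min(total, 1.0)


def solve_increasing(f, target, iters=100):
    """Bisection on [0, 1] for p with f(p) = target, where f increases in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for k successes in n trials."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else solve_increasing(
        lambda p: 1.0 - binom_cdf(k - 1, n, p), alpha / 2)
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else solve_increasing(
        lambda p: 1.0 - binom_cdf(k, n, p), 1.0 - alpha / 2)
    return lower, upper


# Assumed counts, inferred from the reported accuracies (not given in the source).
assumed_counts = {"training": (24, 25), "validation": (552, 639), "test": (69, 79)}

for name, (k, n) in assumed_counts.items():
    lo, hi = clopper_pearson(k, n)
    print(f"{name}: {k / n:.1%} [{lo:.1%}, {hi:.1%}]")
```

The width of the interval depends strongly on sample size: with only 25 training reports the interval spans roughly twenty percentage points, while the 639-report validation set gives a much tighter range around a lower point estimate.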


Source journal

Canadian Association of Radiologists Journal-Journal De L Association Canadienne Des Radiologistes (Medicine: Nuclear Medicine)

CiteScore

6.20

Self-citation rate

12.90%

Articles published

Review time

6-12 weeks

About the journal: The Canadian Association of Radiologists Journal is a peer-reviewed, Medline-indexed publication that presents a broad scientific review of radiology in Canada. The Journal covers such topics as abdominal imaging, cardiovascular radiology, computed tomography, continuing professional development, education and training, gastrointestinal radiology, health policy and practice, magnetic resonance imaging, musculoskeletal radiology, neuroradiology, nuclear medicine, pediatric radiology, radiology history, radiology practice guidelines and advisories, thoracic and cardiac imaging, trauma and emergency room imaging, ultrasonography, and vascular and interventional radiology. Article types considered for publication include original research articles, critically appraised topics, review articles, guest editorials, pictorial essays, technical notes, and letters to the Editor.