Agent-guided AI-powered interpretation and reporting of nerve conduction studies and EMG (INSPIRE)

IF 3.7 | Zone 3, Medicine | JCR Q1, Clinical Neurology

Clinical Neurophysiology Pub Date : 2025-06-15 DOI:10.1016/j.clinph.2025.2110792

Alon Gorenshtein, Moran Sorka, Mohamed Khateb, Dvir Aran, Shahar Shelly

Clinical Neurophysiology, Volume 177, Article 2110792. Journal Article. https://www.sciencedirect.com/science/article/pii/S1388245725006443


Abstract

Objective

We aimed to create a tool for electrophysiologists that enhances and standardizes the interpretation of neuromuscular electrodiagnostic tests (EDX) using state-of-the-art generative AI technology.

Methods

We developed three model frameworks for interpreting and reporting EDX: (1) Base-LLM (large language model), employing one-shot inference; (2) INSPIRE (Agent-Guided AI-Powered Interpretation and Reporting of Nerve Conduction Studies and EMG), a multi-agent AI framework; and (3) INSPIRE-Lite, a cost-efficient version of INSPIRE. INSPIRE uses three agents that integrate tools for reading reference tables and a long-context clinical neuromuscular textbook. Performance was evaluated using the AI-Generated EMG Report Score (AIGERS), a scoring system we developed.
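The multi-agent idea in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the three agents' roles, so the split into findings, impression, and report steps is an assumption, as are all function names and the single table entry, which is a made-up placeholder and not a clinical reference value. No language model is called; each agent is a plain function so that the control flow (agents run in sequence, one of them consulting a reference-table tool) stays visible.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

State = Dict[str, Any]

# Hypothetical lower limit for illustration only; NOT a clinical reference value.
REFERENCE_TABLE: Dict[str, float] = {"demo_conduction_velocity_m_s": 50.0}


def lookup_lower_limit(measurement: str) -> float:
    """Tool: fetch the illustrative lower limit for one measurement."""
    return REFERENCE_TABLE[measurement]


@dataclass
class Agent:
    name: str
    step: Callable[[State], State]


def findings_step(state: State) -> State:
    # Flag every measurement that falls below its table limit.
    flagged = [
        name
        for name, value in state["measurements"].items()
        if value < lookup_lower_limit(name)
    ]
    return {**state, "flagged": flagged}


def impression_step(state: State) -> State:
    # Any flagged measurement makes the study "abnormal" in this toy rule.
    label = "abnormal" if state["flagged"] else "normal"
    return {**state, "impression": label}


def report_step(state: State) -> State:
    # Turn the accumulated state into a one-line report.
    details = ", ".join(state["flagged"]) or "none"
    text = f"Impression: {state['impression']}. Flagged: {details}."
    return {**state, "report": text}


def run_pipeline(agents: List[Agent], state: State) -> State:
    # Each agent receives the state left by the previous one.
    for agent in agents:
        state = agent.step(state)
    return state


PIPELINE = [
    Agent("findings", findings_step),
    Agent("impression", impression_step),
    Agent("report", report_step),
]

result = run_pipeline(
    PIPELINE, {"measurements": {"demo_conduction_velocity_m_s": 42.0}}
)
print(result["report"])
```

A one-shot baseline in the sense of Base-LLM would correspond to a single function that maps the raw measurements directly to a report, without intermediate state or tool calls.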

Results

INSPIRE achieved an accuracy of 92.2% for detecting normal versus abnormal tests, significantly outperforming the Base-LLM model, which achieved 62.6% (p < 0.001). INSPIRE demonstrated significantly higher AIGERS scores overall and across the domains of finding, clinical diagnosis, and semantic concordance (p < 0.001). INSPIRE-Lite scored lower than INSPIRE in finding and clinical diagnosis (p = 0.001 and p = 0.004, respectively).
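For readers unfamiliar with the metric, the normal-versus-abnormal accuracy reported above is the fraction of studies whose predicted label matches the reference label. The sketch below shows that calculation in Python. The abstract gives no case counts, so the four labels are invented purely to exercise the function and do not reproduce the study's data; the function name is also an assumption.

```python
from typing import Sequence


def binary_accuracy(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of cases where the predicted label equals the reference label."""
    if len(predicted) != len(reference) or len(reference) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


# Invented toy labels, not data from the study.
toy_predicted = ["normal", "abnormal", "abnormal", "normal"]
toy_reference = ["normal", "abnormal", "normal", "normal"]
print(binary_accuracy(toy_predicted, toy_reference))
```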

Conclusion

Our model integrates variables such as patient medical history, current complaints, and EDX findings to manage and interpret EMG. It demonstrated superior performance while addressing hallucinations and data overload and aiding prioritization and standardization.

Significance

This model enables comprehensive analysis by integrating diverse clinical variables, enhancing the diagnostic accuracy and efficiency of EDX reports.


Source journal

Clinical Neurophysiology (Medicine: Clinical Neurology)

CiteScore: 8.70
Self-citation rate: 6.40%
Articles published: 932
Review time: 59 days

About the journal: As of January 1999, the journal Electroencephalography and Clinical Neurophysiology and its two sections, Electromyography and Motor Control and Evoked Potentials, amalgamated to become this journal, Clinical Neurophysiology. Clinical Neurophysiology is the official journal of the International Federation of Clinical Neurophysiology, the Brazilian Society of Clinical Neurophysiology, the Czech Society of Clinical Neurophysiology, the Italian Clinical Neurophysiology Society and the International Society of Intraoperative Neurophysiology. The journal is dedicated to fostering research and disseminating information on all aspects of both normal and abnormal functioning of the nervous system. The key aim of the publication is to disseminate scholarly reports on the pathophysiology underlying diseases of the central and peripheral nervous system of human patients. Clinical trials that use neurophysiological measures to document change are encouraged, as are manuscripts reporting data on integrated neuroimaging of central nervous function including, but not limited to, functional MRI, MEG, EEG, PET and other neuroimaging modalities.