Jung-Hyun Lee, Eunhee Choi, Robert McDougal, William W Lytton
{"title":"GPT-4 Performance for Neurologic Localization.","authors":"Jung-Hyun Lee, Eunhee Choi, Robert McDougal, William W Lytton","doi":"10.1212/CPJ.0000000000200293","DOIUrl":null,"url":null,"abstract":"<p><strong>Background and objectives: </strong>In health care, large language models such as Generative Pretrained Transformers (GPTs), trained on extensive text datasets, have potential applications in reducing health care disparities across regions and populations. Previous software developed for lesion localization has been limited in scope. This study aims to evaluate the capability of GPT-4 for lesion localization based on clinical presentation.</p><p><strong>Methods: </strong>GPT-4 was prompted using history and neurologic physical examination (H&P) from published cases of acute stroke followed by questions for clinical reasoning with answering for \"single or multiple lesions,\" \"side,\" and \"brain region\" using Zero-Shot Chain-of-Thought and Text Classification prompting. GPT-4 output on 3 separate trials for each of 46 cases was compared with imaging-based localization.</p><p><strong>Results: </strong>GPT-4 successfully processed raw text from H&P to generate accurate neuroanatomical localization and detailed clinical reasoning. Performance metrics across trial-based analysis for specificity, sensitivity, precision, and F1-score were 0.87, 0.74, 0.75, and 0.74, respectively, for side; 0.94, 0.85, 0.84, and 0.85, respectively, for brain region. Class labels within the brain region were similarly high for all regions except the cerebellum and were also similar when considering all 3 trials to examine metrics by case. Errors were due to extrinsic causes-inadequate information in the published cases, and intrinsic causes-failures of logic or inadequate knowledge base.</p><p><strong>Discussion: </strong>This study reveals capabilities of GPT-4 in the localization of acute stroke lesions, showing a potential future role as a clinical tool in neurology.</p>","PeriodicalId":19136,"journal":{"name":"Neurology. Clinical practice","volume":null,"pages":null},"PeriodicalIF":2.3000,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11003355/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neurology. Clinical practice","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1212/CPJ.0000000000200293","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/3/27 0:00:00","PubModel":"Epub","JCR":"Q3","JCRName":"CLINICAL NEUROLOGY","Score":null,"Total":0}
引用次数: 0
Abstract
Background and objectives: In health care, large language models such as Generative Pretrained Transformers (GPTs), trained on extensive text datasets, have potential applications in reducing health care disparities across regions and populations. Previous software developed for lesion localization has been limited in scope. This study aims to evaluate the capability of GPT-4 for lesion localization based on clinical presentation.
Methods: GPT-4 was prompted using history and neurologic physical examination (H&P) from published cases of acute stroke followed by questions for clinical reasoning with answering for "single or multiple lesions," "side," and "brain region" using Zero-Shot Chain-of-Thought and Text Classification prompting. GPT-4 output on 3 separate trials for each of 46 cases was compared with imaging-based localization.
Results: GPT-4 successfully processed raw text from H&P to generate accurate neuroanatomical localization and detailed clinical reasoning. Performance metrics across trial-based analysis for specificity, sensitivity, precision, and F1-score were 0.87, 0.74, 0.75, and 0.74, respectively, for side; 0.94, 0.85, 0.84, and 0.85, respectively, for brain region. Class labels within the brain region were similarly high for all regions except the cerebellum and were also similar when considering all 3 trials to examine metrics by case. Errors were due to extrinsic causes-inadequate information in the published cases, and intrinsic causes-failures of logic or inadequate knowledge base.
Discussion: This study reveals capabilities of GPT-4 in the localization of acute stroke lesions, showing a potential future role as a clinical tool in neurology.
期刊介绍:
Neurology® Genetics is an online open access journal publishing peer-reviewed reports in the field of neurogenetics. The journal publishes original articles in all areas of neurogenetics including rare and common genetic variations, genotype-phenotype correlations, outlier phenotypes as a result of mutations in known disease genes, and genetic variations with a putative link to diseases. Articles include studies reporting on genetic disease risk, pharmacogenomics, and results of gene-based clinical trials (viral, ASO, etc.). Genetically engineered model systems are not a primary focus of Neurology® Genetics, but studies using model systems for treatment trials, including well-powered studies reporting negative results, are welcome.