AI language model applications for early diagnosis of childhood epilepsy based on unstructured first-visit patient narratives: A cohort study.

Impact factor 2.7; Zone 4 (Medicine); JCR Q3, Clinical Neurology

Epileptic Disorders Pub Date : 2025-10-03 DOI:10.1002/epd2.70109

Jitse Loyens, Geertruida Slinger, Nynke Doornebal, Kees P J Braun, Eric van Diessen, Willem M Otte


Citations: 0

Abstract

Objective: Language is an indispensable source of information for diagnosing epilepsy, and its computational analysis is increasingly explored. This study assessed and compared the diagnostic value of different language model applications for extracting information from first-visit documentation. The aim was to identify language patterns that may carry useful clinical information not overtly considered by the clinician, in order to improve the early diagnosis of childhood epilepsy.

Methods: We analyzed 1561 patient letters from the first two seizure clinics. The dataset was divided into training and test sets to evaluate performance and generalizability. We employed an established Naïve Bayes model as a natural language processing technique and a sentence-embedding (large language) model based on the Bidirectional Encoder Representations from Transformers (BERT) architecture. Both models analyzed only the anamnesis text as recorded by the treating physician. Within the training sets, we identified predictive features consisting of keywords indicative of 'epilepsy' or 'no epilepsy.' Model outputs were compared with the clinician's final diagnosis (gold standard) after a two-year follow-up period. We computed accuracy, sensitivity, and specificity for both models.
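As a rough illustration of the keyword-based approach described in the Methods, the sketch below implements a small multinomial Naïve Bayes text classifier with add-one smoothing in plain Python. It is not the study's implementation: the class name `NaiveBayesText`, the tokenizer, the smoothing choice, and the four toy training notes are all assumptions made up for this example, and no patient data are involved.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesText:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.classes for w in self.word_counts[c]}
        return self

    def word_log_prob(self, word, label):
        """Smoothed log P(word | label)."""
        denom = sum(self.word_counts[label].values()) + len(self.vocab)
        return math.log((self.word_counts[label][word] + 1) / denom)

    def predict(self, text):
        """Return the class with the highest posterior log score."""
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label in self.classes:
            score = math.log(self.doc_counts[label] / total_docs)
            for word in tokenize(text):
                if word in self.vocab:  # skip words never seen in training
                    score += self.word_log_prob(word, label)
            scores[label] = score
        return max(scores, key=scores.get)

    def top_keywords(self, target, other, k=3):
        """Words whose smoothed likelihood most favours `target` over `other`."""
        def ratio(word):
            return self.word_log_prob(word, target) - self.word_log_prob(word, other)
        return sorted(self.vocab, key=lambda w: (-ratio(w), w))[:k]


# Hypothetical toy notes, invented for illustration only.
train_texts = [
    "sudden jerking of both arms, staring and unresponsiveness",
    "staring spells, lip smacking and confusion afterwards",
    "fainted after standing up quickly felt dizzy and pale",
    "dizzy and pale before fainting during blood draw",
]
train_labels = ["epilepsy", "epilepsy", "no epilepsy", "no epilepsy"]

model = NaiveBayesText().fit(train_texts, train_labels)
print(model.predict("staring and lip smacking"))
print(model.top_keywords("no epilepsy", "epilepsy", k=2))
```

The `top_keywords` helper ranks words by a smoothed log-likelihood ratio, which is one simple way to surface keywords indicative of a class; the study may have selected its predictive features differently.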

Results: The Naïve Bayes model achieved an accuracy of 0.73 (95% CI: 0.68-0.78), with a sensitivity of 0.79 (95% CI: 0.74-0.85) and a specificity of 0.62 (95% CI: 0.52-0.72). The sentence-embedding model demonstrated comparable performance with an accuracy of 0.74 (95% CI: 0.68-0.79), a sensitivity of 0.74 (95% CI: 0.68-0.80), and a specificity of 0.73 (95% CI: 0.61-0.84).
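For readers who want to see how figures of this kind are derived, the following sketch computes accuracy, sensitivity, and specificity from predicted and true labels, each with a 95% confidence interval. The abstract does not state which interval method was used, so the Wilson score interval here is an assumption, as are the function names and the ten toy labels, which do not reproduce the study's results.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def diagnostic_metrics(y_true, y_pred, positive="epilepsy"):
    """Return {metric: (estimate, (ci_low, ci_high))} for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    counts = {
        "accuracy": (tp + tn, tp + tn + fp + fn),
        "sensitivity": (tp, tp + fn),   # true positives among actual positives
        "specificity": (tn, tn + fp),   # true negatives among actual negatives
    }
    return {
        name: (hits / total, wilson_interval(hits, total))
        for name, (hits, total) in counts.items()
    }


# Hypothetical labels: 6 epilepsy cases (5 detected), 4 non-epilepsy (3 correct).
y_true = ["epilepsy"] * 6 + ["no epilepsy"] * 4
y_pred = ["epilepsy"] * 5 + ["no epilepsy"] * 4 + ["epilepsy"]

metrics = diagnostic_metrics(y_true, y_pred)
for name, (estimate, (low, high)) in metrics.items():
    print(f"{name}: {estimate:.2f} (95% CI: {low:.2f}-{high:.2f})")
```

Here the gold standard plays the role of `y_true`, matching the comparison against the clinician's final diagnosis described in the Methods.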

Significance: Both models performed reasonably well in diagnosing childhood epilepsy based solely on the first-visit anamnesis text. Notably, the more advanced sentence-embedding model showed no improvement over the computationally simpler Naïve Bayes model. Because Naïve Bayes treats text as an unordered collection of words, this suggests that modeling of anamnesis data does not depend strongly on word order for this particular classification task. Further refinement and exploration of language models and computational linguistic approaches are necessary to enhance diagnostic accuracy in clinical practice.
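To make concrete what sensitivity to word order means in this comparison, the sketch below contrasts word counts, the kind of input a keyword-based Naïve Bayes model uses, with bigram counts. The bigrams are only a simple stand-in for an order-aware representation and are not the BERT-based sentence embeddings used in the study; the two example phrases are invented.

```python
from collections import Counter


def unigram_counts(text):
    """Bag of words: how often each word occurs, ignoring position."""
    return Counter(text.lower().split())


def bigram_counts(text):
    """Counts of adjacent word pairs, which do change when words are reordered."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


# Hypothetical phrases with the same words in a different order.
note_a = "staring followed by jerking"
note_b = "jerking followed by staring"

same_unigrams = unigram_counts(note_a) == unigram_counts(note_b)
same_bigrams = bigram_counts(note_a) == bigram_counts(note_b)

print("identical word counts:", same_unigrams)
print("identical bigram counts:", same_bigrams)
```

A model that sees only word counts cannot tell the two phrases apart, so similar performance from an order-aware model indicates how much the ordering information contributes to the task.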


Source journal: Epileptic Disorders (Medicine, Clinical Neurology)

CiteScore: 4.10

Self-citation rate: 8.70%

Articles published: 138

Review time: 6-12 weeks

About the journal: Epileptic Disorders is the leading forum where all experts and medical students who wish to improve their understanding of epilepsy and related disorders can share practical experiences surrounding diagnosis and care, natural history, and management of seizures. Epileptic Disorders is the official E-journal of the International League Against Epilepsy for educational communication. As the journal celebrates its 20th anniversary, it will now be available only as an online version. Its mission is to create educational links between epileptologists and other health professionals in clinical practice and scientists or physicians in research-based institutions. This change is accompanied by an increase in the number of issues per year, from 4 to 6, to ensure regular diffusion of recently published material (high quality Review and Seminar in Epileptology papers; Original Research articles or Case reports of educational value; MultiMedia Teaching Material), to serve the global medical community that cares for those affected by epilepsy.