Near Real-Time Syndromic Surveillance of Emergency Department Triage Texts Using Natural Language Processing: Case Study in Febrile Convulsion Detection.
Sedigh Khademi, Christopher Palmer, Muhammad Javed, Gerardo Luis Dimaguila, Hazel Clothier, Jim Buttery, Jim Black
JMIR AI. 2024;3:e54449. Published 2024-08-30. doi: 10.2196/54449. https://doi.org/10.2196/54449. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11399745/pdf/
Abstract
Background: Collecting information on adverse events following immunization from as many sources as possible is critical for promptly identifying potential safety concerns and taking appropriate actions. Febrile convulsions are recognized as an important potential reaction to vaccination in children aged <6 years.
Objective: The primary aim of this study was to evaluate the performance of natural language processing techniques and machine learning (ML) models for the rapid detection of febrile convulsion presentations in emergency departments (EDs), especially with respect to the minimum training data required to obtain optimum model performance. In addition, we examined the deployment requirements for an ML model to perform real-time monitoring of ED triage notes.
Methods: We developed a pattern matching approach as a baseline and evaluated ML models for the classification of febrile convulsions in ED triage notes to determine both their training requirements and their effectiveness in detecting febrile convulsions. We measured their performance during training and then compared the deployed models' results on new incoming ED data.
Results: Although the best standard neural networks had acceptable performance and were low-resource models, transformer-based models outperformed them substantially, justifying their ongoing deployment.
Conclusions: Using natural language processing, particularly with the use of large language models, offers significant advantages in syndromic surveillance. Large language models make highly effective classifiers, and their text generation capacity can be used to enhance the quality and diversity of training data.
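For illustration only, a pattern matching baseline of the kind mentioned in the Methods might look like the following minimal Python sketch. The abstract does not give the study's actual patterns, so the regular expressions, the negation rule, and the function name `flag_note` are all assumptions introduced here, not the authors' method.

```python
import re

# Hypothetical term lists; the study's real patterns are not stated in the abstract.
SEIZURE = re.compile(
    r"\b(febrile\s+(convulsion|seizure)s?|convulsions?|seizures?|fitting)\b",
    re.IGNORECASE,
)
FEVER = re.compile(r"\b(febrile|fevers?|pyrexia|temp(erature)?)\b", re.IGNORECASE)
# Crude negation rule (an assumption): a negating word followed, within
# two words, by a seizure term.
NEGATION = re.compile(
    r"\b(no|nil|denies|without)\s+(\w+\s+){0,2}(convulsion|seizure|fitting)",
    re.IGNORECASE,
)


def flag_note(text: str) -> bool:
    """Return True if a triage note mentions both a seizure term and a
    fever term, and the seizure term is not obviously negated."""
    if NEGATION.search(text):
        return False
    return bool(SEIZURE.search(text) and FEVER.search(text))


if __name__ == "__main__":
    # Invented example notes, not real triage data.
    notes = [
        "3yo febrile convulsion at home, temp 39.2",
        "fever 2 days, nil seizure activity",
        "seizure, known epilepsy, afebrile",
    ]
    for note in notes:
        print(flag_note(note), "-", note)
```

A rule set like this is cheap to run but brittle against misspellings, abbreviations, and unusual negation phrasing, which is the usual motivation for comparing it against trained classifiers.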