Artificial Intelligence to Detect Voice Disorders: An AI-Supported Systematic Review of Accuracy Outcomes
Charles J Nudelman, Virginia Tardini, Pasquale Bottalico
Journal of Voice. Published 2025-10-04. DOI: 10.1016/j.jvoice.2025.09.021 (https://doi.org/10.1016/j.jvoice.2025.09.021)
Abstract
Background: The objective of the present systematic review is to identify which artificial intelligence (AI) approaches have been used to successfully detect voice disorders. The review examines studies involving patients with non-neurological voice disorders and controls, where AI was applied to detect voice disorders. The primary outcome of interest is the accuracy of these AI models. Additionally, this review demonstrates how the procedures of conducting a systematic review can be supported by AI.
Methods: Studies were eligible for inclusion if they implemented an AI approach to distinguish non-neurological voice disorders from healthy voice samples. A comprehensive search was conducted using the PubMed/MEDLINE, ScienceDirect, Web of Science, EBSCO, and Scopus databases. Risk of bias was assessed via the Quality Assessment Tool for Diagnostic Accuracy Studies. The occurrences of the most common AI techniques utilized in the literature are presented, and a summary of their abilities to accurately detect a voice disorder is reported.
Results: In total, 79 publications met the inclusion criteria. These studies included patient recordings from a variety of voice databases. The most common AI techniques implemented were Support Vector Machines (SVMs) (n = 28) and Convolutional Neural Networks (CNNs) (n = 22). The mean accuracy of the models in detecting voice disorders was 92% across all studies. Nine studies reported 100% accuracy, and 32 studies reported between 95% and 99%.
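To put the counts in the Results in proportion, the short sketch below converts them into shares of the 79 included studies. It uses only the figures quoted in the abstract; the names `TOTAL_STUDIES`, `COUNTS`, and `share` are illustrative and do not come from the review. It also assumes that the group reporting 100% accuracy and the group reporting 95% to 99% do not overlap, which the abstract implies but does not state outright.

```python
# Counts quoted from the Results paragraph of the abstract.
TOTAL_STUDIES = 79
COUNTS = {
    "SVM": 28,
    "CNN": 22,
    "accuracy 100%": 9,
    "accuracy 95-99%": 32,
}


def share(count: int, total: int = TOTAL_STUDIES) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Share of included studies for each reported count.
shares = {label: share(n) for label, n in COUNTS.items()}

# Assumption: the two accuracy groups are disjoint, so their counts can be summed.
high_accuracy_share = share(COUNTS["accuracy 100%"] + COUNTS["accuracy 95-99%"])

if __name__ == "__main__":
    for label, pct in shares.items():
        print(f"{label}: {pct}% of {TOTAL_STUDIES} studies")
    print(f"accuracy 95% or higher: {high_accuracy_share}% of {TOTAL_STUDIES} studies")
```

The technique shares need not sum to 100%: the abstract names only the two most common techniques, and a single study may have applied more than one.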
Discussion: Strengths of the evidence include high accuracies across diverse models and datasets. Limitations include a narrow range of datasets and a trend of hyperoptimization without sufficient external validation. Clinicians and researchers should recognize that while current AI models show promise, future studies should prioritize robust external validation and more representative datasets.
About the journal:
The Journal of Voice is widely regarded as the world's premier journal for voice medicine and research. This peer-reviewed publication is listed in Index Medicus and is indexed by the Institute for Scientific Information. The journal contains articles written by experts throughout the world on all topics in voice sciences, voice medicine and surgery, and speech-language pathologists' management of voice-related problems. The journal includes clinical articles, clinical research, and laboratory research. Members of the Foundation receive the journal as a benefit of membership.