Harnessing machine learning in diagnosing complex hoarseness cases
Authors: Ariel Roitman, Yiftach Edelstain, Chen Katzir, Hadas Ofir, Nimrod Peleg, Ilana Doweck, Yoav Yanir
Journal: American Journal of Otolaryngology, Volume 46, Issue 1, Article 104533 (Journal Article; JCR Q2, Otorhinolaryngology; impact factor 1.8)
Publication date: 2025-01-01
DOI: 10.1016/j.amjoto.2024.104533
URL: https://www.sciencedirect.com/science/article/pii/S0196070924003193
Citations: 0
Abstract
Purpose
Recognizing vocal fold pathology traditionally requires the expertise of laryngologists and advanced instruments, primarily for direct visualization. This study aims to augment this conventional paradigm with a parallel diagnostic procedure. Our objective is to use a machine-learning algorithm that discerns intricate patterns in patients' voice recordings to distinguish not only healthy from hoarse voices but also among various specific disorders.
Materials and methods
We applied transfer learning to the HuBERT model using samples from the Saarbruecken Voice Database. The study was conducted in two stages: a binary classifier first distinguishes healthy from hoarse voices, and a subsequent multi-class classifier then identifies specific voice disorders. Data from 2103 sessions, comprising over 25,000 components and representing diverse pathologies as well as healthy individuals, were analyzed. The models were trained, validated, and tested with a focus on robustness and diagnostic accuracy.
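The two-stage design described above can be sketched as a simple cascade. This is a minimal illustration, not the authors' implementation: the abstract does not say how the two classifiers are chained, so the sketch assumes the multi-class stage runs only on voices the binary stage flags as hoarse. The names `Diagnosis`, `two_stage_diagnose`, `toy_binary`, and `toy_multiclass` are hypothetical, and the toy functions merely stand in for the fine-tuned HuBERT models.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# Hypothetical stand-in: a "recording" is a sequence of floats, not real audio.
Recording = Sequence[float]


@dataclass
class Diagnosis:
    pathological: bool
    disorder: Optional[str]  # None when the voice is judged healthy


def two_stage_diagnose(
    recording: Recording,
    is_pathological: Callable[[Recording], bool],
    classify_disorder: Callable[[Recording], str],
) -> Diagnosis:
    """Run stage 1 (binary); call stage 2 (multi-class) only on hoarse voices.

    Assumption: the stages form a cascade. The abstract does not confirm this.
    """
    if not is_pathological(recording):
        return Diagnosis(pathological=False, disorder=None)
    return Diagnosis(pathological=True, disorder=classify_disorder(recording))


# Toy stand-ins for the trained models (assumptions, not the paper's classifiers).
def toy_binary(recording: Recording) -> bool:
    return sum(recording) / len(recording) > 0.5


def toy_multiclass(recording: Recording) -> str:
    return "Laryngeal Dystonia" if max(recording) > 0.9 else "other disorder"


if __name__ == "__main__":
    print(two_stage_diagnose([0.1, 0.2, 0.3], toy_binary, toy_multiclass))
    print(two_stage_diagnose([0.6, 0.95, 0.7], toy_binary, toy_multiclass))
```

Passing the two classifiers in as callables keeps the cascade logic independent of the underlying models, so either stage could be swapped out without touching the control flow.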
Results
The binary classifier achieved 82% accuracy in distinguishing healthy from pathological voices. The multi-class classifier, which aims to identify specific laryngeal disorders, achieved its highest accuracy (>93%) for Laryngeal Dystonia. This is noteworthy because Laryngeal Dystonia poses a persistent diagnostic challenge: the condition lacks a definitive diagnostic modality.
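For readers less familiar with these figures, the sketch below shows one common way to compute an overall accuracy and a per-disorder accuracy from true and predicted labels. It rests on an assumption: the abstract does not define how the per-disorder figure was calculated, so the per-class value here is the fraction of samples of each true class that were labeled correctly. The labels are invented for illustration and are not data from the study.

```python
from collections import Counter
from typing import Dict, Sequence


def overall_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of all samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Fraction of samples of each true class that were predicted correctly.

    Assumption: "accuracy for a disorder" is read as per-class recall.
    """
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


if __name__ == "__main__":
    # Invented toy labels, not results from the paper.
    y_true = ["healthy", "healthy", "dystonia", "dystonia", "polyp"]
    y_pred = ["healthy", "polyp", "dystonia", "dystonia", "polyp"]
    print(overall_accuracy(y_true, y_pred))
    print(per_class_accuracy(y_true, y_pred))
```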
Conclusions
Our findings demonstrate the feasibility of using machine-learning algorithms to process voice samples and categorize them into distinct pathologies. This approach has the potential to enhance patient triage, streamline diagnostics, and elevate overall patient care. Particularly valuable for challenging diagnoses such as Laryngeal Dystonia, this method underscores the transformative role of machine learning in optimizing healthcare practices.
About the journal:
Be fully informed about developments in otology, neurotology, audiology, rhinology, allergy, laryngology, speech science, bronchoesophagology, facial plastic surgery, and head and neck surgery. Featured sections include original contributions, grand rounds, current reviews, case reports and socioeconomics.