Harnessing machine learning in diagnosing complex hoarseness cases

IF 1.8 · Zone 4 (Medicine) · Q2 · OTORHINOLARYNGOLOGY

American Journal of Otolaryngology Pub Date : 2025-01-01 DOI:10.1016/j.amjoto.2024.104533

Ariel Roitman, Yiftach Edelstain, Chen Katzir, Hadas Ofir, Nimrod Peleg, Ilana Doweck, Yoav Yanir

Volume 46, Issue 1, Article 104533 · Journal Article · https://www.sciencedirect.com/science/article/pii/S0196070924003193

Citations: 0

Abstract

Purpose

Traditional recognition of vocal fold pathology typically requires the expertise of laryngologists and advanced instruments, and relies primarily on direct visualization. This study aims to augment this conventional paradigm by introducing a parallel diagnostic procedure. Our objective is to harness a machine-learning algorithm that discerns intricate patterns in patients' voice recordings, distinguishing not only between healthy and hoarse voices but also among various specific disorders.

Materials and methods

We employed a machine-learning algorithm that applies transfer learning to the HuBERT model using samples from the Saarbruecken Voice Database. The study was conducted in two stages: a binary classifier distinguishes healthy from hoarse voices, and a subsequent multi-class classifier identifies specific voice disorders. Data from 2103 sessions, comprising over 25,000 components and representing diverse pathologies as well as healthy individuals, were analyzed. The models were trained, validated, and tested with a focus on robustness and diagnostic accuracy.
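The two-stage design described above can be sketched as a simple cascade. This is a minimal illustration only, not the authors' implementation: the function `two_stage_diagnose`, the 0.5 threshold, the toy stand-in models, and the label "other disorder" are all assumptions introduced here. In the study, both stages would be classifiers fine-tuned from HuBERT rather than the placeholder rules below.

```python
from typing import Callable, Dict, Sequence

# A "recording" is modelled here as a plain sequence of audio samples (assumption).
Recording = Sequence[float]


def two_stage_diagnose(
    recording: Recording,
    binary_model: Callable[[Recording], float],
    multiclass_model: Callable[[Recording], Dict[str, float]],
    threshold: float = 0.5,  # hypothetical decision threshold
) -> str:
    """Run stage 1 (healthy vs. pathological), then stage 2 (which disorder)."""
    # Stage 1: the binary model returns an estimated probability of pathology.
    p_pathological = binary_model(recording)
    if p_pathological < threshold:
        return "healthy"
    # Stage 2: only voices flagged as pathological reach the multi-class model.
    disorder_probs = multiclass_model(recording)
    return max(disorder_probs, key=disorder_probs.get)


# Toy stand-ins for the two fine-tuned models (purely illustrative rules).
def toy_binary(recording: Recording) -> float:
    return 0.9 if max(recording) > 0.5 else 0.1


def toy_multiclass(recording: Recording) -> Dict[str, float]:
    return {"laryngeal dystonia": 0.7, "other disorder": 0.3}


if __name__ == "__main__":
    print(two_stage_diagnose([0.1, 0.2], toy_binary, toy_multiclass))
    print(two_stage_diagnose([0.1, 0.8], toy_binary, toy_multiclass))
```

Passing the models in as callables keeps the cascade logic independent of how each stage is trained, so either stage could be swapped without touching the other.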

Results

The binary classifier achieved 82% accuracy in distinguishing healthy from pathological voices. The multi-class classifier, which aims to identify specific laryngeal disorders, achieved its highest accuracy (>93%) for Laryngeal Dystonia. This result is noteworthy because Laryngeal Dystonia poses a persistent diagnostic challenge: the condition lacks a definitive diagnostic modality.
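The abstract does not state how the per-disorder accuracy was computed. As one possible reading, the sketch below contrasts overall accuracy with a one-vs-rest accuracy for a single class. The labels and predictions are made-up toy values, not study data, and the function names are hypothetical.

```python
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def one_vs_rest_accuracy(
    y_true: Sequence[str], y_pred: Sequence[str], label: str
) -> float:
    """Accuracy when `label` is treated as positive and all other classes as negative."""
    agree = sum((t == label) == (p == label) for t, p in zip(y_true, y_pred))
    return agree / len(y_true)


if __name__ == "__main__":
    # Toy labels only: "ld" stands for Laryngeal Dystonia; "a" and "b" are
    # placeholder disorder classes invented for this example.
    y_true = ["ld", "ld", "a", "b", "b"]
    y_pred = ["ld", "ld", "b", "b", "a"]
    print(accuracy(y_true, y_pred))                    # 3 of 5 correct
    print(one_vs_rest_accuracy(y_true, y_pred, "ld"))  # all 5 agree on ld vs. not-ld
```

In this toy case the one-vs-rest figure for a single class exceeds the overall accuracy, which is why a per-disorder number and an overall number are not directly comparable.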

Conclusions

Our findings demonstrate the feasibility of using machine-learning algorithms to process voice samples and categorize them into distinct pathologies. This approach has the potential to enhance patient triage, streamline diagnostics, and elevate overall patient care. It is particularly valuable for challenging diagnoses such as Laryngeal Dystonia, and it underscores the transformative role of machine learning in optimizing healthcare practices.


Source journal: American Journal of Otolaryngology (Medicine: Otorhinolaryngology)

CiteScore: 4.40

Self-citation rate: 4.00%

Articles published: 378

Review time: 41 days

Journal description: Be fully informed about developments in otology, neurotology, audiology, rhinology, allergy, laryngology, speech science, bronchoesophagology, facial plastic surgery, and head and neck surgery. Featured sections include original contributions, grand rounds, current reviews, case reports and socioeconomics.