From Support Vector Machines to Neural Networks: Advancing Automated Velopharyngeal Dysfunction Detection in Patients With Cleft Palate.

IF 1.6 · Zone 4 (Medicine) · JCR Q3 (Surgery)

Annals of Plastic Surgery. Pub Date: 2025-09-01. DOI: 10.1097/SAP.0000000000004460

Noah Alter, Claiborne Lucas, Ricardo Torres-Guzman, Andrew James, Amy Stone, Maria E Powell, Scott Corlew, Weixin Liu, Bowen Qu, Zhijun Yin, Andrea Hiller, Michael Golinko, Matthew E Pontell

Annals of Plastic Surgery, volume 95, 3S Suppl 1, pages S55-S59. Publication type: Journal Article.

Citations: 0

Abstract

Background: The generation of intelligible speech is the single most important outcome after cleft palate repair. The development of velopharyngeal dysfunction (VPD) compromises this outcome, and the burden of VPD remains largely unknown in low- and middle-income countries (LMICs). To scale up VPD care in these areas, we continue to explore the use of artificial intelligence (AI) and machine learning (ML) for automatic detection of VPD from speech samples alone.

Methods: An age-matched, single-institution cohort of 60 patients (30 control, 30 with VPD after cleft palate repair) generated approximately 8000 audio samples (4000 VPD and 4000 control). These samples were used to inform the development of a neural network-based, self-supervised deep learning ML model.

Results: ML model testing with augmented and unaugmented data sets revealed accuracies of 1.0, macro precisions of 1.0, macro recalls of 1.0, and F1 scores of 1.0.
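For readers unfamiliar with the reported metrics, the following is a minimal sketch of how accuracy, macro precision, macro recall, and F1 can be computed for a two-class VPD/control task. It is not the authors' evaluation code. The function name `macro_metrics`, the label strings, and the toy predictions are hypothetical, and the F1 here is assumed to be the macro average of per-class F1 scores, which the abstract does not specify.

```python
def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro averaging computes each metric per class and then takes the
    unweighted mean across classes.
    """
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n_classes = len(labels)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "macro_precision": sum(precisions) / n_classes,
        "macro_recall": sum(recalls) / n_classes,
        "macro_f1": sum(f1s) / n_classes,
    }


# Toy labels for illustration only; these are not data from the study.
y_true = ["vpd"] * 4 + ["control"] * 4
perfect = macro_metrics(y_true, y_true)
one_miss = macro_metrics(y_true, ["vpd"] * 3 + ["control"] * 5)
print(perfect)
print(one_miss)
```

All four metrics reach 1.0 only when every test sample is classified correctly, which is why a uniform score of 1.0 across augmented and unaugmented data sets invites the scrutiny described in the Discussion.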

Discussion: Although these results are promising and support the ability of ML models to detect VPD, perfect scores on every metric likely indicate that the ML models are also picking up confounding information in the data. Efforts are underway to address this problem while simultaneously employing disentanglement tactics to allow for multilingual speech analysis. The ability to clinically operationalize such a model could instantaneously enhance VPD care in LMICs for patients with cleft palate with few changes to existing healthcare infrastructure.
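The abstract does not identify the confounders or describe how samples were split between training and testing. One possibility, offered here only as an assumption, is that recordings from the same patient appear on both sides of the split, letting a model key on speaker or recording characteristics instead of VPD itself. The sketch below shows a patient-level split that rules out that particular leak. The function `split_by_patient`, the tuple layout, and the synthetic patient IDs are all hypothetical.

```python
import random


def split_by_patient(samples, test_fraction=0.2, seed=0):
    """Split samples so that no patient appears in both train and test.

    `samples` is a list of (patient_id, label, sample_id) tuples. Patients
    are held out per label so both classes are represented in the test set.
    Assumes each patient carries exactly one label.
    """
    patients_by_label = {}
    for patient_id, label, _ in samples:
        patients_by_label.setdefault(label, set()).add(patient_id)

    rng = random.Random(seed)
    test_patients = set()
    for label in sorted(patients_by_label):
        patients = sorted(patients_by_label[label])
        rng.shuffle(patients)
        n_test = max(1, round(len(patients) * test_fraction))
        test_patients.update(patients[:n_test])

    train = [s for s in samples if s[0] not in test_patients]
    test = [s for s in samples if s[0] in test_patients]
    return train, test


# Synthetic stand-in: 60 patients (30 control, 30 VPD), 10 samples each.
samples = [
    (f"P{i:02d}", "control" if i < 30 else "vpd", f"P{i:02d}_s{j}")
    for i in range(60)
    for j in range(10)
]
train, test = split_by_patient(samples)
train_ids = {s[0] for s in train}
test_ids = {s[0] for s in test}
print(len(train), len(test), train_ids & test_ids)
```

If scores stay at 1.0 under a split like this, speaker leakage is less likely to be the explanation, and other factors such as recording conditions would need to be examined.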


Source journal

Annals of Plastic Surgery (Medicine: Surgery)

CiteScore: 2.70

Self-citation rate: 13.30%

Articles published: 584

Review time: 6 months

Journal description: The only independent journal devoted to general plastic and reconstructive surgery, Annals of Plastic Surgery serves as a forum for current scientific and clinical advances in the field and a sounding board for ideas and perspectives on its future. The journal publishes peer-reviewed original articles, brief communications, case reports, and notes in all areas of interest to the practicing plastic surgeon. There are also historical and current reviews, descriptions of surgical technique, and lively editorials and letters to the editor.