Using Machine Learning for the Automated Segmentation and Detection of Swallows Obtained by Digital Cervical Auscultation in Preterm Neonates.

IF 3 | CAS Zone 3 (Medicine) | JCR Q1 OTORHINOLARYNGOLOGY

Dysphagia Pub Date : 2025-09-12 DOI:10.1007/s00455-025-10879-3

Stephen So, Timothy Tadj, Belinda Schwerin, Anne B Chang, Seiji Humphries, Thuy T Frakking


Citations: 0

Abstract

The clinical application of acoustic swallowing sound parameters collected from digital cervical auscultation (CA) is limited by the time-consuming manual segmentation required of trained experts. Automated identification of swallowing sounds from audio wave files using machine learning has achieved accuracies of 76-95% in children and adults, but no data exist for preterm neonates. This study aimed to determine whether automated machine learning with a transfer learning approach could accurately identify and segment swallows from swallowing sounds collected in preterm neonates. Thin fluid swallow sounds were collected from 78 preterm neonates (median gestational age at birth 34 weeks, range 25-36 weeks; 52.6% male) across 3 Australian special care nurseries. The base machine learning model was a deep convolutional neural network (DCNN) pre-trained for audio event classification. With raw swallow audio data as input, embedding vectors were generated from the base DCNN and used to train a feedforward neural network to determine whether a swallow was present within an audio segment. The model showed high overall accuracy (94%) in identifying preterm swallows. Model performance was better on bottle-feeding swallows (sensitivity 95%, specificity 96%) than on breastfeeding swallows (sensitivity 95%, specificity 92%). Interpretation: This novel study demonstrates the successful use of transfer learning to accurately identify and segment digital swallowing sounds in preterm neonates. Application of this model could support the development of a digital CA app that automatically classifies swallow sounds and improves the objectivity of CA in clinical practice within special care nurseries.
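The pipeline described in the abstract (embedding vectors from a pre-trained DCNN feeding a small feedforward classifier, evaluated by sensitivity and specificity) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the embeddings here are synthetic random vectors standing in for DCNN outputs, and the embedding size (128), hidden width (16), learning rate, epoch count, and all function names are hypothetical choices made for the example.

```python
import numpy as np


def make_synthetic_embeddings(n_per_class=200, dim=128, seed=0):
    """Synthetic stand-in for embeddings from a pre-trained audio DCNN.

    Assumption: real embeddings would come from the base network; these
    are random vectors with a class-dependent mean, NOT real swallow data.
    """
    rng = np.random.default_rng(seed)
    swallow = rng.normal(0.5, 1.0, size=(n_per_class, dim))
    non_swallow = rng.normal(-0.5, 1.0, size=(n_per_class, dim))
    X = np.vstack([swallow, non_swallow])
    y = np.concatenate([np.ones(n_per_class), np.zeros(n_per_class)])
    order = rng.permutation(len(y))
    return X[order], y[order]


def _sigmoid(z):
    # Clip logits so np.exp cannot overflow.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))


def train_feedforward(X, y, hidden=16, lr=0.05, epochs=300, seed=0):
    """Train a one-hidden-layer network with full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, size=(d, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, size=(hidden, 1))
    b2 = np.zeros(1)
    for _ in range(epochs):
        h = np.maximum(0.0, X @ W1 + b1)        # ReLU hidden layer
        p = _sigmoid(h @ W2 + b2).ravel()       # P(swallow present)
        g_out = ((p - y) / n)[:, None]          # grad of mean BCE w.r.t. logit
        g_h = (g_out @ W2.T) * (h > 0)          # backprop through ReLU
        W2 -= lr * (h.T @ g_out)
        b2 -= lr * g_out.sum(axis=0)
        W1 -= lr * (X.T @ g_h)
        b1 -= lr * g_h.sum(axis=0)
    return W1, b1, W2, b2


def predict(params, X, threshold=0.5):
    """Return 1 where the predicted swallow probability meets the threshold."""
    W1, b1, W2, b2 = params
    h = np.maximum(0.0, X @ W1 + b1)
    return (_sigmoid(h @ W2 + b2).ravel() >= threshold).astype(int)


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) for binary labels."""
    y_true = np.asarray(y_true).astype(int)
    y_pred = np.asarray(y_pred).astype(int)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / len(y_true)


if __name__ == "__main__":
    X, y = make_synthetic_embeddings()
    split = int(0.8 * len(y))                   # simple hold-out split
    params = train_feedforward(X[:split], y[:split])
    sens, spec, acc = sensitivity_specificity(y[split:], predict(params, X[split:]))
    print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

Sensitivity is the fraction of true swallow segments the classifier detects, and specificity is the fraction of non-swallow segments it correctly rejects, which is why the two feeding modes in the abstract can share the same sensitivity yet differ in specificity.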


Source journal

Dysphagia (Medicine: Otorhinolaryngology)

CiteScore: 4.90

Self-citation rate: 15.40%

Articles published: 149

Review time: 6-12 weeks

About the journal: Dysphagia aims to serve as a voice for the benefit of the patient. The journal is devoted exclusively to swallowing and its disorders, and its purpose is to provide a source of information to the flourishing dysphagia community. Over the past years, the field of dysphagia has grown rapidly, and the community of dysphagia researchers has been galvanized by the ambition to represent dysphagia patients. In addition to covering a myriad of disciplines in medicine and speech pathology, the journal covers topics including, but not limited to, bio-engineering, deglutition, esophageal motility, immunology, and neuro-gastroenterology. The journal aims to foster a growing need for further dysphagia investigation, to disseminate knowledge through research, and to stimulate communication among interested professionals. The journal publishes original papers, technical and instrumental notes, letters to the editor, and review articles.