Stephen So, Timothy Tadj, Belinda Schwerin, Anne B Chang, Seiji Humphries, Thuy T Frakking
Using Machine Learning for the Automated Segmentation and Detection of Swallows Obtained by Digital Cervical Auscultation in Preterm Neonates.
Dysphagia (journal article), published 2025-09-12. DOI: 10.1007/s00455-025-10879-3 (https://doi.org/10.1007/s00455-025-10879-3)
Abstract
The clinical application of acoustic swallowing sound parameters collected from digital cervical auscultation (CA) is limited by the time-consuming manual segmentation required from trained experts. Automated identification of swallowing sounds from audio wave files using machine learning has reported accuracies between 76% and 95% in children and adults. No data exist for preterm neonates. This study aimed to determine whether automated machine learning using a transfer learning approach could accurately identify and segment swallows from swallowing sounds collected in preterm neonates. Thin fluid swallow sounds were collected from 78 preterm neonates (median gestational age at birth 34 weeks, range 25-36 weeks; 52.6% male) across 3 Australian special care nurseries. The base machine learning model was a deep convolutional neural network (DCNN) pre-trained for audio event classification. With raw swallow audio data as input, embedding vectors were generated from the base DCNN and used to train a feedforward neural network to determine the presence of a swallow within an audio segment. The model showed high overall accuracy (94%) in identifying preterm swallows. Model performance was better on bottle-feeding swallows (sensitivity 95%, specificity 96%) than on breastfeeding swallows (sensitivity 95%, specificity 92%). Interpretation: Our novel study demonstrates the successful use of transfer learning to accurately identify and segment digital swallowing sounds in preterm neonates. Application of this model could support the development of a digital CA app to automatically classify swallow sounds and improve the objectivity of CA use in clinical practice within special care nurseries.
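The pipeline described in the abstract (frozen pre-trained network, embedding vectors, small feedforward classifier, sensitivity and specificity) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' implementation: the abstract does not name the pre-trained DCNN, so `make_frozen_embedder` is a hypothetical stand-in (a fixed random projection of the log-magnitude spectrum, not a real pre-trained model), and `EMBED_DIM`, `SEGMENT_LEN`, the hidden-layer size, and the synthetic audio are all illustrative choices.

```python
import numpy as np

EMBED_DIM = 64      # hypothetical embedding size (not from the paper)
SEGMENT_LEN = 1600  # hypothetical number of samples per audio segment


def make_frozen_embedder(seed=0):
    """Stand-in for a pre-trained DCNN. Its weights are fixed and never trained here.
    ASSUMPTION: a random projection of the log spectrum, not a real audio model."""
    rng = np.random.default_rng(seed)
    n_bins = SEGMENT_LEN // 2 + 1
    weights = rng.normal(0.0, 1.0 / np.sqrt(n_bins), size=(n_bins, EMBED_DIM))

    def embed(segments):
        spectrum = np.log1p(np.abs(np.fft.rfft(segments, axis=1)))
        return np.tanh(spectrum @ weights)

    return embed


def train_head(X, y, hidden=16, lr=0.1, epochs=300, seed=1):
    """Train a one-hidden-layer feedforward classifier on embeddings
    using full-batch gradient descent on binary cross-entropy."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, np.sqrt(2.0 / X.shape[1]), (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, np.sqrt(1.0 / hidden), (hidden, 1))
    b2 = np.zeros(1)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    for _ in range(epochs):
        h = np.maximum(0.0, X @ W1 + b1)            # ReLU hidden layer
        p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))    # sigmoid output
        grad_logit = (p - y) / len(X)               # gradient of mean loss w.r.t. logit
        grad_h = (grad_logit @ W2.T) * (h > 0)      # backpropagate through ReLU
        W2 -= lr * (h.T @ grad_logit)
        b2 -= lr * grad_logit.sum(axis=0)
        W1 -= lr * (X.T @ grad_h)
        b1 -= lr * grad_h.sum(axis=0)

    def predict(X_new):
        h = np.maximum(0.0, X_new @ W1 + b1)
        return (1.0 / (1.0 + np.exp(-(h @ W2 + b2)))).ravel() >= 0.5

    return predict


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity, specificity, and accuracy from binary labels and predictions.
    Requires at least one positive and one negative in y_true."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)
    tn = np.sum(~y_true & ~y_pred)
    fp = np.sum(~y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
    }


def synthetic_segments(n, seed=2):
    """Toy data: noise-only segments versus noise plus a short tone burst.
    ASSUMPTION: purely synthetic, not neonatal swallow audio."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n).astype(bool)
    segments = rng.normal(0.0, 0.1, size=(n, SEGMENT_LEN))
    burst = np.sin(2 * np.pi * 0.05 * np.arange(400))
    segments[labels, 600:1000] += burst
    return segments, labels


if __name__ == "__main__":
    embed = make_frozen_embedder()
    train_x, train_y = synthetic_segments(200, seed=2)
    test_x, test_y = synthetic_segments(100, seed=3)
    predict = train_head(embed(train_x), train_y)
    print(sensitivity_specificity(test_y, predict(embed(test_x))))
```

The design point is that only the small classifier is trained; the embedding function stays fixed, which is what makes transfer learning practical for a dataset of this size. In a real system the stand-in embedder would be replaced by an actual pre-trained audio network, and the metrics would be reported separately for each feeding mode, as the abstract does for bottle feeding and breastfeeding.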
About the journal:
Dysphagia aims to serve as a voice for the benefit of the patient. The journal is devoted exclusively to swallowing and its disorders. Its purpose is to provide a source of information to the flourishing dysphagia community. Over the past years, the field of dysphagia has grown rapidly, and the community of dysphagia researchers has been galvanized by the ambition to represent dysphagia patients. In addition to covering a myriad of disciplines in medicine and speech pathology, the journal covers topics including, but not limited to: bio-engineering, deglutition, esophageal motility, immunology, and neuro-gastroenterology. The journal aims to address the growing need for further dysphagia investigation, to disseminate knowledge through research, and to stimulate communication among interested professionals. The journal publishes original papers, technical and instrumental notes, letters to the editor, and review articles.