Evaluating Convolutional Neural Networks and Vision Transformers for Baby Cry Sound Analysis

Future Internet Pub Date : 2024-07-07 DOI:10.3390/fi16070242
Samir A. Younis, Dalia Sobhy, Noha S. Tawfik
{"title":"Evaluating Convolutional Neural Networks and Vision Transformers for Baby Cry Sound Analysis","authors":"Samir A. Younis, Dalia Sobhy, Noha S. Tawfik","doi":"10.3390/fi16070242","DOIUrl":null,"url":null,"abstract":"Crying is a newborn’s main way of communicating. Despite their apparent similarity, newborn cries are physically generated and have distinct characteristics. Experienced medical professionals, nurses, and parents are able to recognize these variations based on their prior interactions. Nonetheless, interpreting a baby’s cries can be challenging for carers, first-time parents, and inexperienced paediatricians. This paper uses advanced deep learning techniques to propose a novel approach for baby cry classification. This study aims to accurately classify different cry types associated with everyday infant needs, including hunger, discomfort, pain, tiredness, and the need for burping. The proposed model achieves an accuracy of 98.33%, surpassing the performance of existing studies in the field. IoT-enabled sensors are utilized to capture cry signals in real time, ensuring continuous and reliable monitoring of the infant’s acoustic environment. This integration of IoT technology with deep learning enhances the system’s responsiveness and accuracy. Our study highlights the significance of accurate cry classification in understanding and meeting the needs of infants and its potential impact on improving infant care practices. The methodology, including the dataset, preprocessing techniques, and architecture of the deep learning model, is described. The results demonstrate the performance of the proposed model, and the discussion analyzes the factors contributing to its high accuracy.","PeriodicalId":509567,"journal":{"name":"Future Internet","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Future Internet","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/fi16070242","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Crying is a newborn’s main way of communicating. Despite their apparent similarity, newborn cries are physically generated and have distinct characteristics. Experienced medical professionals, nurses, and parents are able to recognize these variations based on their prior interactions. Nonetheless, interpreting a baby’s cries can be challenging for carers, first-time parents, and inexperienced paediatricians. This paper uses advanced deep learning techniques to propose a novel approach for baby cry classification. This study aims to accurately classify different cry types associated with everyday infant needs, including hunger, discomfort, pain, tiredness, and the need for burping. The proposed model achieves an accuracy of 98.33%, surpassing the performance of existing studies in the field. IoT-enabled sensors are utilized to capture cry signals in real time, ensuring continuous and reliable monitoring of the infant’s acoustic environment. This integration of IoT technology with deep learning enhances the system’s responsiveness and accuracy. Our study highlights the significance of accurate cry classification in understanding and meeting the needs of infants and its potential impact on improving infant care practices. The methodology, including the dataset, preprocessing techniques, and architecture of the deep learning model, is described. The results demonstrate the performance of the proposed model, and the discussion analyzes the factors contributing to its high accuracy.
评估用于婴儿哭声分析的卷积神经网络和视觉变换器
啼哭是新生儿沟通的主要方式。尽管表面上看起来很相似,但新生儿的哭声是由身体产生的,并具有明显的特征。经验丰富的专业医务人员、护士和父母能够根据以往的互动经验识别这些差异。然而,对于护理人员、初次为人父母者和缺乏经验的儿科医生来说,解读婴儿的哭声可能具有挑战性。本文采用先进的深度学习技术,提出了一种新颖的婴儿哭声分类方法。本研究旨在准确分类与婴儿日常需求相关的不同哭声类型,包括饥饿、不适、疼痛、疲倦和打嗝需求。所提出的模型准确率达到 98.33%,超过了该领域现有研究的表现。利用物联网传感器实时捕捉哭声信号,确保对婴儿的声学环境进行持续、可靠的监测。物联网技术与深度学习的结合提高了系统的响应速度和准确性。我们的研究强调了准确的哭声分类对理解和满足婴儿需求的重要意义,及其对改善婴儿护理实践的潜在影响。研究方法包括数据集、预处理技术和深度学习模型的架构。结果展示了所提模型的性能,讨论分析了导致其高精度的因素。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信