Environmental sound recognition on embedded devices using deep learning: a review

IF 10.7 | Zone 2, Computer Science | JCR Q1 (Computer Science, Artificial Intelligence)
Pau Gairí, Tomàs Pallejà, Marcel Tresanchez
Journal: Artificial Intelligence Review, 58 (6)
Published: 2025-03-15
Publication type: Journal Article
DOI: 10.1007/s10462-025-11106-z
Article: https://link.springer.com/article/10.1007/s10462-025-11106-z
PDF: https://link.springer.com/content/pdf/10.1007/s10462-025-11106-z.pdf
Citations: 0

Abstract

Sound recognition has a wide range of applications beyond speech and music, including environmental monitoring, sound source classification, mechanical fault diagnosis, audio fingerprinting, and event detection. These applications often require real-time data processing, making them well suited to embedded systems. However, embedded devices face significant challenges because of their limited computational power, memory, and power budget. Despite these constraints, achieving high performance in environmental sound recognition typically requires complex algorithms. Deep learning models have demonstrated high accuracy on existing datasets, making them a popular choice for such tasks, but they are resource-intensive, which poses challenges for real-time edge applications. This paper presents a comprehensive review of the integration of deep learning models into embedded systems, examining state-of-the-art applications, key components, and the steps involved. It also explores strategies for optimising performance in resource-constrained environments by comparing implementation approaches such as knowledge distillation, pruning, and quantization, with some studies reporting a reduction in complexity of up to 97% compared with the unoptimised model. Overall, we conclude that, despite the availability of lightweight deep learning models, input features, and compression techniques, their integration into low-resource devices such as microcontrollers remains limited. Furthermore, more complex tasks, such as general sound classification, especially with expanded frequency bands and real-time operation, have yet to be implemented effectively on these devices. These findings highlight the need for a standardised research framework for evaluating these technologies on resource-constrained devices, and for further development to realise the wide range of potential applications.
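
As a rough illustration of two of the compression techniques named in the abstract, the sketch below applies magnitude pruning followed by symmetric 8-bit quantization to a list of random weights. It is not taken from the reviewed paper or from any study it surveys: the function names (`magnitude_prune`, `quantize_int8`, `dequantize`), the 50% sparsity level, and the per-tensor scaling scheme are all assumptions chosen for brevity, and the storage ratio it computes should not be read as the complexity reduction reported in the abstract.

```python
import random


def magnitude_prune(weights, sparsity):
    """Return a copy with the smallest-magnitude fraction `sparsity` set to 0."""
    k = int(sparsity * len(weights))
    # Indices of the k weights with the smallest absolute value.
    smallest = sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:k]
    pruned = list(weights)
    for i in smallest:
        pruned[i] = 0.0
    return pruned


def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map quantized integers back to approximate float values."""
    return [v * scale for v in q]


# Hypothetical "layer": 1024 weights drawn from a standard normal distribution.
random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(1024)]

pruned = magnitude_prune(weights, sparsity=0.5)
q, scale = quantize_int8(pruned)
restored = dequantize(q, scale)

zero_fraction = sum(1 for w in pruned if w == 0.0) / len(pruned)
max_error = max(abs(a - b) for a, b in zip(restored, pruned))
# Storage if each value is held as float32 (4 bytes) versus int8 (1 byte).
size_ratio = (len(q) * 1) / (len(weights) * 4)

print(f"zeroed weights: {zero_fraction:.0%}")
print(f"worst-case rounding error: {max_error:.5f} (scale = {scale:.5f})")
print(f"int8 storage relative to float32: {size_ratio:.0%}")
```

The rounding error of each weight is bounded by half the quantization step (`scale / 2`), which is why a single very large weight can degrade the precision of all the others under per-tensor scaling. Real deployments on microcontrollers would normally rely on a framework's own quantization and pruning tooling rather than hand-written routines like these.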

Source journal
Artificial Intelligence Review (Engineering and Technology; Computer Science: Artificial Intelligence)
CiteScore: 22.00
Self-citation rate: 3.30%
Articles published: 194
Review time: 5.3 months
Journal description: Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.