Energy-Efficient Audio Processing at the Edge for Biologging Applications

IF 1.6 Q3 ENGINEERING, ELECTRICAL & ELECTRONIC
Jonathan Miquel, L. Latorre, S. Chamaillé‐Jammes
{"title":"Energy-Efficient Audio Processing at the Edge for Biologging Applications","authors":"Jonathan Miquel, L. Latorre, S. Chamaillé‐Jammes","doi":"10.3390/jlpea13020030","DOIUrl":null,"url":null,"abstract":"Biologging refers to the use of animal-borne recording devices to study wildlife behavior. In the case of audio recording, such devices generate large amounts of data over several months, and thus require some level of processing automation for the raw data collected. Academics have widely adopted offline deep-learning-classification algorithms to extract meaningful information from large datasets, mainly using time-frequency signal representations such as spectrograms. Because of the high deployment costs of animal-borne devices, the autonomy/weight ratio remains by far the fundamental concern. Basically, power consumption is addressed using onboard mass storage (no wireless transmission), yet the energy cost associated with data storage activity is far from negligible. In this paper, we evaluate various strategies to reduce the amount of stored data, making the fair assumption that audio will be categorized using a deep-learning classifier at some point of the process. This assumption opens up several scenarios, from straightforward raw audio storage paired with further offline classification on one side, to a fully embedded AI engine on the other side, with embedded audio compression or feature extraction in between. This paper investigates three approaches focusing on data-dimension reduction: (i) traditional inline audio compression, namely ADPCM and MP3, (ii) full deep-learning classification at the edge, and (iii) embedded pre-processing that only computes and stores spectrograms for later offline classification. We characterized each approach in terms of total (sensor + CPU + mass-storage) edge power consumption (i.e., recorder autonomy) and classification accuracy. Our results demonstrate that ADPCM encoding brings 17.6% energy savings compared to the baseline system (i.e., uncompressed raw audio samples). Using such compressed data, a state-of-the-art spectrogram-based classification model still achieves 91.25% accuracy on open speech datasets. Performing inline data-preparation can significantly reduce the amount of stored data allowing for a 19.8% energy saving compared to the baseline system, while still achieving 89% accuracy during classification. These results show that while massive data reduction can be achieved through the use of inline computation of spectrograms, it translates to little benefit on device autonomy when compared to ADPCM encoding, with the added downside of losing original audio information.","PeriodicalId":38100,"journal":{"name":"Journal of Low Power Electronics and Applications","volume":" ","pages":""},"PeriodicalIF":1.6000,"publicationDate":"2023-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Low Power Electronics and Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/jlpea13020030","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
引用次数: 0

Abstract

Biologging refers to the use of animal-borne recording devices to study wildlife behavior. In the case of audio recording, such devices generate large amounts of data over several months, and thus require some level of processing automation for the raw data collected. Academics have widely adopted offline deep-learning-classification algorithms to extract meaningful information from large datasets, mainly using time-frequency signal representations such as spectrograms. Because of the high deployment costs of animal-borne devices, the autonomy/weight ratio remains by far the fundamental concern. Basically, power consumption is addressed using onboard mass storage (no wireless transmission), yet the energy cost associated with data storage activity is far from negligible. In this paper, we evaluate various strategies to reduce the amount of stored data, making the fair assumption that audio will be categorized using a deep-learning classifier at some point of the process. This assumption opens up several scenarios, from straightforward raw audio storage paired with further offline classification on one side, to a fully embedded AI engine on the other side, with embedded audio compression or feature extraction in between. This paper investigates three approaches focusing on data-dimension reduction: (i) traditional inline audio compression, namely ADPCM and MP3, (ii) full deep-learning classification at the edge, and (iii) embedded pre-processing that only computes and stores spectrograms for later offline classification. We characterized each approach in terms of total (sensor + CPU + mass-storage) edge power consumption (i.e., recorder autonomy) and classification accuracy. Our results demonstrate that ADPCM encoding brings 17.6% energy savings compared to the baseline system (i.e., uncompressed raw audio samples). Using such compressed data, a state-of-the-art spectrogram-based classification model still achieves 91.25% accuracy on open speech datasets. Performing inline data-preparation can significantly reduce the amount of stored data allowing for a 19.8% energy saving compared to the baseline system, while still achieving 89% accuracy during classification. These results show that while massive data reduction can be achieved through the use of inline computation of spectrograms, it translates to little benefit on device autonomy when compared to ADPCM encoding, with the added downside of losing original audio information.
生物记录应用的边缘高效音频处理
生物学是指使用动物传播的记录设备来研究野生动物的行为。在音频记录的情况下,这些设备在几个月内生成大量数据,因此需要对收集的原始数据进行一定程度的自动化处理。学术界广泛采用离线深度学习分类算法从大型数据集中提取有意义的信息,主要使用谱图等时频信号表示。由于动物携带设备的高部署成本,自主性/重量比仍然是迄今为止最基本的问题。基本上,功耗是通过板载大容量存储(没有无线传输)来解决的,但与数据存储活动相关的能源成本远非可以忽略不计。在本文中,我们评估了各种减少存储数据量的策略,并公平地假设音频将在该过程的某个时刻使用深度学习分类器进行分类。这个假设打开了几种场景,从直接的原始音频存储与进一步的离线分类,到完全嵌入的AI引擎,以及嵌入的音频压缩或特征提取。本文研究了三种专注于数据降维的方法:(i)传统的内联音频压缩,即ADPCM和MP3, (ii)边缘的完全深度学习分类,以及(iii)仅计算和存储频谱图以供以后离线分类的嵌入式预处理。我们根据总(传感器+ CPU +大容量存储)边缘功耗(即记录器自主性)和分类准确性来描述每种方法。我们的结果表明,与基线系统(即未压缩的原始音频样本)相比,ADPCM编码节省了17.6%的能源。使用这样的压缩数据,最先进的基于谱图的分类模型在开放语音数据集上仍然达到91.25%的准确率。与基线系统相比,执行内联数据准备可以显著减少存储的数据量,从而节省19.8%的能源,同时在分类过程中仍能达到89%的准确率。这些结果表明,虽然通过使用谱图的内联计算可以实现大量的数据减少,但与ADPCM编码相比,它在设备自主性方面几乎没有什么好处,而且还有丢失原始音频信息的缺点。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Journal of Low Power Electronics and Applications
Journal of Low Power Electronics and Applications Engineering-Electrical and Electronic Engineering
CiteScore
3.60
自引率
14.30%
发文量
57
审稿时长
11 weeks
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信