Challenges with Audio Classification using Image Based Approaches for Health Measurement Applications

2020 IEEE International Symposium on Medical Measurements and Applications (MeMeA). Pub Date: 2020-06-01. DOI: 10.1109/MeMeA49120.2020.9137254

Madison Cohen-McFarlane, R. Goubran, Bruce Wallace

Citations: 4

Abstract

Image classification has been highly successful in recent years, mainly because of the vast array of databases available. The lack of comparable audio databases is a problem when creating a deep neural network classifier for the measurement and monitoring of health-related sounds. Such sounds (e.g., cough) can indicate worsening health conditions, particularly in the remote monitoring of older adults. Applying pre-existing deep neural network image classifiers to audio classification has been proposed as a potential solution. This paper describes some of the issues associated with using audio spectrograms to retrain the AlexNet image classifier for remote patient monitoring. The classifier's spatial invariance assumption is investigated further through two classification tasks based on spectrograms computed from notes on a classical piano at four different noise levels: (1) octave classification and (2) note classification. As expected, with clean data the AlexNet classifier performs better on octave classification (98%) than on note classification (83%). When evaluated on noisy audio, the performance of the note classifier decreases more than that of the octave classifier.
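As a rough illustration of the kind of input the abstract describes, the sketch below builds a tone, adds white noise at a chosen signal-to-noise ratio, and computes a magnitude-squared spectrogram. It is not the authors' pipeline. The pure sine tones (standing in for piano recordings), the SNR values, the sample rate, the frame length, and all function names are assumptions made for this example; the abstract does not state the noise levels or spectrogram settings used.

```python
import numpy as np


def tone(freq_hz, duration_s=0.5, sample_rate=8000):
    """Pure sine tone; a hypothetical stand-in for a recorded piano note."""
    t = np.arange(int(duration_s * sample_rate)) / sample_rate
    return np.sin(2 * np.pi * freq_hz * t)


def add_noise(signal, snr_db, rng):
    """Add white Gaussian noise so the result has roughly the requested SNR."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def spectrogram(signal, frame_len=512, hop=256):
    """Power spectrogram from Hann-windowed frames, shape (freq bins, frames)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)).T ** 2


def peak_frequency(spec, sample_rate=8000, frame_len=512):
    """Centre frequency (Hz) of the bin with the most energy, averaged over time."""
    return np.argmax(spec.mean(axis=1)) * sample_rate / frame_len


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Assumed SNR values; the paper's four noise levels are not given in the abstract.
    for snr_db in (None, 20.0, 10.0, 0.0):
        for name, freq in (("A4", 440.0), ("A5", 880.0)):
            x = tone(freq)
            if snr_db is not None:
                x = add_noise(x, snr_db, rng)
            print(name, snr_db, peak_frequency(spectrogram(x)))
```

In this toy setup, the same note one octave higher yields a similar pattern displaced along the frequency axis. This vertical position is the kind of information that a classifier assuming spatial invariance might not preserve.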
