Convolutional Dense Neural Network Based Spirometry Variable FVC Prediction Using Sustained Phonations

2021 IEEE 31st International Workshop on Machine Learning for Signal Processing (MLSP). Pub Date: 2021-10-25. DOI: 10.1109/mlsp52302.2021.9596159

Shivani Yadav, D. Gope, U. Krishnaswamy, P. Ghosh



Abstract

Spirometry is a lung function test used to diagnose and monitor lung diseases such as asthma, pneumonia, and chronic obstructive pulmonary disease. It measures forced vital capacity (FVC), forced expiratory volume in 1 second (FEV1), and their ratio to determine lung health. Spirometry is time-consuming and strenuous, and it requires proper training. Alternative voice-based methods for diagnosing and monitoring lung health are promising because they are faster, easier to perform, and require minimal training. Non-speech sounds, namely cough and wheeze, have been used to predict spirometry variables, but the role of speech sounds that occur in natural speaking has not been explored for a similar task. In this work, the spirometry variable FVC is predicted from sustained phonations of vowel sounds using a convolutional dense neural network (CDNN), with the mel-spectrogram as the input feature. An experiment with 160 subjects indicates that, among all the sustained vowel phonations considered in this work, /i/ is the best sound and /u:/ the worst for the prediction task, with average mean absolute errors of 0.67 L (± 0.07 L) and 0.70 L (± 0.13 L), respectively.
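For readers unfamiliar with the error metric quoted in the abstract, the following is a minimal sketch of how a mean absolute error and a "±" spread could be computed. It assumes that the "±" figure is a standard deviation across evaluation folds, which the abstract does not state. The function names and the toy FVC values are hypothetical and are not taken from the paper.

```python
from statistics import mean, stdev


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted FVC (litres)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def summarize_folds(folds):
    """Return (mean, sample standard deviation) of the per-fold MAE values.

    `folds` is a list of (measured, predicted) pairs, one pair per fold.
    Treating the spread as a per-fold standard deviation is an assumption.
    """
    maes = [mean_absolute_error(measured, predicted) for measured, predicted in folds]
    return mean(maes), stdev(maes)


# Made-up FVC values in litres, for illustration only (not data from the paper).
toy_folds = [
    ([3.0, 4.0], [3.5, 3.0]),  # absolute errors 0.5 and 1.0
    ([2.0, 5.0], [2.5, 4.5]),  # absolute errors 0.5 and 0.5
]
avg_mae, spread = summarize_folds(toy_folds)
print(f"MAE = {avg_mae:.2f} L (± {spread:.2f} L)")
```

A lower MAE means the predicted FVC is, on average, closer to the spirometer reading, which is the sense in which the abstract ranks /i/ above /u:/.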
