Shumit Saha, Keerthana Viswanathan, Anamika Saha, Azadeh Yadollahi
Speech Communication, vol. 165, Article 103144. Published 2024-09-28. DOI: 10.1016/j.specom.2024.103144. URL: https://www.sciencedirect.com/science/article/pii/S0167639324001158
Feasibility of acoustic features of vowel sounds in estimating the upper airway cross sectional area during wakefulness: A pilot study
Assessment of upper airway dimensions has shown great promise in understanding the pathogenesis of obstructive sleep apnea (OSA). However, the current screening system for OSA does not include an objective assessment of the upper airway. The upper airway can be assessed accurately by MRI or CT scans, but these are costly and not easily accessible. Acoustic pharyngometry or ultrasonography could be less expensive alternatives, but they require trained personnel, which limits their accessibility, especially when assessing the upper airway in a clinic environment or before surgery. In this study, we aimed to investigate the utility of vowel articulation in assessing the upper airway dimension during normal breathing. To that end, we measured the upper airway cross-sectional area (UA-XSA) by acoustic pharyngometry, then asked the participants to produce 5 vowels for 3 s and recorded them with a microphone. We extracted 710 acoustic features from all vowels, compared these features with UA-XSA, and developed regression models to estimate the UA-XSA. Our results showed that Mel frequency cepstral coefficients (MFCC) were the most dominant features of vowels: 7 of the 9 features in the main feature set were from MFCC. The multiple regression analysis showed that combining the acoustic features with the anthropometric features achieved an R² of 0.80 in estimating UA-XSA. An important advantage of acoustic analysis of vowel sounds is that it is simple and can be easily implemented in wearable devices or mobile applications. Such acoustic-based technologies can be accessible in different clinical settings, such as the intensive care unit, and can be used in remote areas. Thus, these results could be used to develop user-friendly applications that use acoustic features and demographic information to estimate the UA-XSA.
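To make the reported metric concrete, the following is a minimal sketch of a multiple linear regression and its R² score. It is not the authors' pipeline: the data are synthetic, the three standardized predictors (two stand-ins for acoustic features, one for an anthropometric feature), the coefficients, the noise level, and the sample size of 40 are all assumptions chosen for illustration, and the helper names `fit_linear`, `predict`, and `r_squared` are hypothetical.

```python
import random


def fit_linear(X, y):
    """Ordinary least squares with an intercept, solved via the normal equations."""
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    p = len(rows[0])
    # Augmented matrix [A^T A | A^T y]
    M = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(M[k][col]))
        M[col], M[pivot] = M[pivot], M[col]
        scale = M[col][col]
        M[col] = [v / scale for v in M[col]]
        for k in range(p):
            if k != col:
                f = M[k][col]
                M[k] = [a - f * b for a, b in zip(M[k], M[col])]
    return [M[i][p] for i in range(p)]  # [intercept, coef_1, ..., coef_n]


def predict(coef, X):
    """Apply the fitted intercept and coefficients to each feature row."""
    return [coef[0] + sum(c * x for c, x in zip(coef[1:], r)) for r in X]


def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in data (assumption): 40 hypothetical participants,
# three standardized predictors, and a made-up linear relation plus noise.
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(3)] for _ in range(40)]
y = [2.0 + 0.5 * a - 0.3 * b + 0.1 * c + random.gauss(0, 0.2) for a, b, c in X]

coef = fit_linear(X, y)
r2 = r_squared(y, predict(coef, X))
print(f"in-sample R^2 on synthetic data: {r2:.2f}")
```

Note that this R² is computed on the same data used for fitting; with many candidate features and few participants, a held-out or cross-validated estimate would be a more conservative choice.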
About the journal:
Speech Communication is an interdisciplinary journal whose primary objective is to fulfil the need for the rapid dissemination and thorough discussion of basic and applied research results.
The journal's primary objectives are:
• to present a forum for the advancement of human and human-machine speech communication science;
• to stimulate cross-fertilization between different fields of this domain;
• to contribute towards the rapid and wide diffusion of scientifically sound contributions in this domain.