Estimation of Speech Features Using a Wearable Inertial Sensor
Zuyu Du, Yaodan Xu, Xinsheng Yu, Sen Wang, Lin Xu
Journal of Voice (impact factor 2.5; JCR Q1, Audiology & Speech-Language Pathology), published 2024-10-10
DOI: https://doi.org/10.1016/j.jvoice.2024.09.012
Citation count: 0
Abstract
Speech features have been investigated as novel digital biomarkers for many psychiatric and neurocognitive diseases. Microphones are the most widely used devices for speech recording, but they inevitably suffer from disadvantages such as privacy leakage and environmental noise, which limit their clinical application, particularly for long-term ambulatory monitoring. The aim of the present study is therefore to explore the feasibility of extracting speech features from the acceleration recorded on the sternum. Ten healthy subjects volunteered for the study. Each subject performed two speech tasks: repeating one sentence 20 times and reading 20 different sentences. Each task was repeated eight times at different speech rates and loudness levels. Voice signals and speech-induced chest vibrations were recorded simultaneously by a microphone and an accelerometer placed on the sternum. Forty-two acoustic features and six time-related prosodic features were extracted from both signals using a standard toolbox and then compared by a linear fit and correlation analysis. Good agreement between the acceleration features and the microphone features is observed for all six time-related prosodic features in both tasks, but for only 19 and 17 acoustic features in tasks 1 and 2, respectively, most of them loudness- or pitch-related. Our results suggest that sternum acceleration tracks time-related speech prosody, loudness, and pitch very well, demonstrating the feasibility of deriving digital biomarkers from the acceleration signal for diseases strongly related to time-related prosodic and loudness features.
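The comparison step named in the abstract (a linear fit plus correlation analysis between a feature computed from the microphone signal and the same feature computed from the acceleration signal) can be illustrated with a minimal sketch. This is not the authors' code: the function name `linear_fit_and_correlation`, the use of ordinary least squares with the Pearson coefficient, and the toy feature values are all assumptions made for illustration, and the abstract does not state which agreement thresholds were used.

```python
import math


def linear_fit_and_correlation(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, plus Pearson r.

    x: feature values from one sensor (e.g. microphone), one per recording
    y: the same feature from the other sensor (e.g. accelerometer)
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squared deviations and cross-deviations
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined fit or correlation")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


if __name__ == "__main__":
    # Hypothetical values of one feature across five recordings (made-up data)
    mic_feature = [1.0, 2.0, 3.0, 4.0, 5.0]
    acc_feature = [2.1, 3.9, 6.2, 7.8, 10.1]

    slope, intercept, r = linear_fit_and_correlation(mic_feature, acc_feature)
    print(f"slope={slope:.3f} intercept={intercept:.3f} r={r:.3f}")
```

A correlation coefficient close to 1 with a stable slope would indicate that the acceleration-derived feature follows the microphone-derived one; repeating the call per feature and per task gives the kind of feature-by-feature comparison the abstract describes.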
About the journal:
The Journal of Voice is widely regarded as the world's premier journal for voice medicine and research. This peer-reviewed publication is listed in Index Medicus and is indexed by the Institute for Scientific Information. The journal contains articles written by experts throughout the world on all topics in voice sciences, voice medicine and surgery, and speech-language pathologists' management of voice-related problems. The journal includes clinical articles, clinical research, and laboratory research. Members of the Foundation receive the journal as a benefit of membership.