Kipras Pribuišis, Rytis Maskeliūnas, Nora Ulozaitė-Stanienė, Evaldas Padervinskis, Robertas Damaševičius, Tomas Blažauskas, Virgilijus Uloza
Journal of Voice. Published 2025-05-20. DOI: 10.1016/j.jvoice.2025.04.026 (https://doi.org/10.1016/j.jvoice.2025.04.026)
Assessment of the Performance of an AI-Driven SpeechEnhancer Algorithm for Speech Enhancement Following Laryngeal Oncosurgery.
Objective: The present study aimed to evaluate the effectiveness of speech synthesis by an AI-driven SpeechEnhancer algorithm following laryngeal oncosurgery.
Methods: Original and synthesized speech samples from 77 patients after laryngeal oncosurgery were evaluated. A panel of four experts conducted the auditory-perceptual speech evaluation using the IINFVo and the Similarity Mean Opinion Score (SMOS) scales. The acoustic analysis of the speech samples was performed using the Average Voicing Evidence (AVE), Proportion of Voiced Frames (PVF), Proportion of Voiced Speech Frames (PVS), and Acoustic Substitution Voicing Index (ASVI) measures.
Results: The synthesized speech samples outperformed the original speech in both acoustic and auditory-perceptual evaluation. The mean total IINFVo score was statistically significantly higher (P < 0.05) for the synthesized speech samples [IINFVo = 5.59 (SD = 0.83)] than for the original speech samples [IINFVo = 4.18 (SD = 1.11)]. The mean SMOS score of 2.42 (SD = 1.19) indicated a modest level of similarity between the synthesized and original speech samples. The acoustic AVE, PVF, and PVS parameters showed a statistically significant (P < 0.05) improvement in the synthesized speech samples. The quality of the synthesized speech [ASVI = 19.22 (SD = 7.44)] was statistically significantly higher (P = 0.001) than that of the original substitution voicing speech [ASVI = 9.39 (SD = 4.34)].
Conclusion: The AI-driven "SpeechEnhancer" algorithm is a promising tool for speech rehabilitation after laryngeal oncosurgery. It demonstrates the potential for use in clinical settings by healthcare professionals and patients following laryngeal carcinoma surgery.
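As a reading aid for the reported results, the following minimal sketch works out how far the synthesized-speech means moved relative to the original-speech means. It uses only the means and SDs quoted in the abstract; the `REPORTED` dictionary and the `mean_change` helper are illustrative names, not part of the study. It is plain arithmetic on summary values and not a re-analysis: the significance tests and P values are those reported by the authors.

```python
# Summary statistics as quoted in the abstract: (mean, SD).
# The dictionary layout and function name are illustrative assumptions.
REPORTED = {
    "IINFVo": {"original": (4.18, 1.11), "synthesized": (5.59, 0.83)},
    "ASVI": {"original": (9.39, 4.34), "synthesized": (19.22, 7.44)},
}


def mean_change(measure: str) -> tuple[float, float]:
    """Return (absolute difference, ratio) of synthesized vs. original mean."""
    orig_mean, _ = REPORTED[measure]["original"]
    synth_mean, _ = REPORTED[measure]["synthesized"]
    # Round to two decimals, matching the precision of the reported values.
    return round(synth_mean - orig_mean, 2), round(synth_mean / orig_mean, 2)


if __name__ == "__main__":
    for name in REPORTED:
        diff, ratio = mean_change(name)
        print(f"{name}: +{diff} ({ratio}x the original mean)")
```

The two measures are on different scales, so the differences are not comparable with each other; each only shows the size of the shift within its own measure.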
About the journal:
The Journal of Voice is widely regarded as the world's premier journal for voice medicine and research. This peer-reviewed publication is listed in Index Medicus and is indexed by the Institute for Scientific Information. The journal contains articles written by experts throughout the world on all topics in voice sciences, voice medicine and surgery, and speech-language pathologists' management of voice-related problems. The journal includes clinical articles, clinical research, and laboratory research. Members of the Foundation receive the journal as a benefit of membership.