{"title":"Individuality-Preserving Voice Reconstruction for Articulation Disorders Using Text-to-Speech Synthesis","authors":"Reina Ueda, T. Takiguchi, Y. Ariki","doi":"10.1145/2818346.2820770","DOIUrl":null,"url":null,"abstract":"This paper presents a speech synthesis method for people with articulation disorders. Because the movements of such speakers are limited by their athetoid symptoms, their prosody is often unstable and their speech rate differs from that of a physically unimpaired person, which causes their speech to be less intelligible and, consequently, makes communication with physically unimpaired persons difficult. In order to deal with these problems, this paper describes a Hidden Markov Model(HMM)-based text-to-speech synthesis approach that preserves the individuality of a person with an articulation disorder and aids them in their communication. In our method, a duration model of a physically unimpaired person is used for the HMM synthesis system and an F0 model in the system is trained using the F0 patterns of the physically unimpaired person, with the average F0 being converted to the target F0 in advance. In order to preserve the target speaker's individuality, a spectral model is built from target spectra. Through experimental evaluations, we have confirmed that the proposed method successfully synthesizes intelligible speech while maintaining the target speaker's individuality.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"23 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2818346.2820770","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4
Abstract
This paper presents a speech synthesis method for people with articulation disorders. Because the movements of such speakers are limited by their athetoid symptoms, their prosody is often unstable and their speech rate differs from that of a physically unimpaired person, which causes their speech to be less intelligible and, consequently, makes communication with physically unimpaired persons difficult. In order to deal with these problems, this paper describes a Hidden Markov Model(HMM)-based text-to-speech synthesis approach that preserves the individuality of a person with an articulation disorder and aids them in their communication. In our method, a duration model of a physically unimpaired person is used for the HMM synthesis system and an F0 model in the system is trained using the F0 patterns of the physically unimpaired person, with the average F0 being converted to the target F0 in advance. In order to preserve the target speaker's individuality, a spectral model is built from target spectra. Through experimental evaluations, we have confirmed that the proposed method successfully synthesizes intelligible speech while maintaining the target speaker's individuality.