Monophthong vocal tract shapes are sufficient for articulatory synthesis of German primary diphthongs
Simon Stone, Peter Birkholz
Speech Communication, Volume 157, Article 103041. Published 2024-02-01.
DOI: 10.1016/j.specom.2024.103041
https://www.sciencedirect.com/science/article/pii/S016763932400013X
Citations: 0
Abstract
German primary diphthongs are conventionally transcribed using the same symbols used for some monophthong vowels. However, if the corresponding vocal tract shapes are used for articulatory synthesis, the results often sound unnatural. Furthermore, there is no clear consensus in the literature on whether diphthongs have monophthong constituents and, if so, which ones. This study therefore analyzed a set of audio recordings from the reference speaker of the state-of-the-art articulatory synthesizer VocalTractLab to identify likely candidates for the monophthong constituents of the German primary diphthongs. We then evaluated these candidates in a listening experiment with naive listeners to determine a naturalness ranking of these candidates and specialized diphthong shapes. The results showed that the German primary diphthongs can indeed be synthesized with no significant loss in naturalness by replacing the specialized diphthong shapes for the initial and final segments with shapes also used for monophthong vowels.
About the journal:
Speech Communication is an interdisciplinary journal whose primary objective is to fulfil the need for the rapid dissemination and thorough discussion of basic and applied research results.
The journal's primary objectives are:
• to present a forum for the advancement of human and human-machine speech communication science;
• to stimulate cross-fertilization between different fields of this domain;
• to contribute towards the rapid and wide diffusion of scientifically sound contributions in this domain.