{"title":"Language independent and language adaptive large vocabulary speech recognition","authors":"Tanja Schultz, A. Waibel","doi":"10.21437/ICSLP.1998-751","DOIUrl":null,"url":null,"abstract":"This paper describes the design of a multilingual speech recognizer using an LVCSR dictation database which has been collected under the project GlobalPhone. This project at the University of Karlsruhe investigates LVCSR systems in 15 languages of the world, namely Arabic, Chinese, Croatian, English, French, German, Italian, Japanese, Korean, Portuguese, Russian, Spanish, Swedish, Tamil, and Turkish. Based on a global phoneme set we built different multilingual speech recognition systems for five of the 15 languages. Context dependent phoneme models are created data-driven by introducing questions about language and language groups to our polyphone clustering procedure. We apply the resulting multilingual models to unseen languages and present several recognition results in language independent and language adaptive setups.","PeriodicalId":117113,"journal":{"name":"5th International Conference on Spoken Language Processing (ICSLP 1998)","volume":"47 30 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1998-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"90","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"5th International Conference on Spoken Language Processing (ICSLP 1998)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.21437/ICSLP.1998-751","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 90
Abstract
This paper describes the design of a multilingual speech recognizer using an LVCSR dictation database which has been collected under the project GlobalPhone. This project at the University of Karlsruhe investigates LVCSR systems in 15 languages of the world, namely Arabic, Chinese, Croatian, English, French, German, Italian, Japanese, Korean, Portuguese, Russian, Spanish, Swedish, Tamil, and Turkish. Based on a global phoneme set we built different multilingual speech recognition systems for five of the 15 languages. Context dependent phoneme models are created data-driven by introducing questions about language and language groups to our polyphone clustering procedure. We apply the resulting multilingual models to unseen languages and present several recognition results in language independent and language adaptive setups.