Authors: Yohan Karunanayake, Uthayasanker Thayasivam, Surangika Ranathunga
Venue: 2019 International Conference on Asian Language Processing (IALP)
Publication date: 2019-11-01
DOI: 10.1109/IALP48816.2019.9037702 (https://doi.org/10.1109/IALP48816.2019.9037702)
Sinhala and Tamil Speech Intent Identification From English Phoneme Based ASR
Content-based speech classification has many use cases, including speech topic identification and spoken command recognition. Automatic Speech Recognition (ASR) sits underneath all of these applications to convert speech into text. However, building an ASR system for a language is a resource-intensive task. Although there are more than 6000 languages, these speech-related applications are limited to the most well-known languages such as English, because sufficient data is available only for those languages. Some past research has looked into classifying speech under data scarcity, but each of those methods has its own limitations. In this paper, we present an English phoneme-based speech intent classification methodology for Sinhala and Tamil. We use a pre-trained English ASR model to generate phoneme probability features and use them to identify the intents of utterances expressed in Sinhala and Tamil, for which only a rather small speech dataset is available. The experimental results show that the proposed method can achieve more than 80% accuracy with a limited 0.5-hour speech dataset in both languages.
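The pipeline described above (phoneme probability features from an English ASR model, followed by intent classification) can be illustrated with a minimal sketch. The abstract does not say how the features are pooled or which classifier is used, so everything here is an assumption: `fake_posteriors` is a hypothetical stand-in that produces synthetic frame-level phoneme probabilities in place of real ASR output, mean/max pooling is one plausible way to get a fixed-length vector, and the nearest-centroid classifier is a swapped-in placeholder, not the authors' model.

```python
import numpy as np


def fake_posteriors(rng, intent, n_phonemes=40, n_intents=4):
    """Hypothetical stand-in for ASR output: a (frames, phonemes) matrix of
    per-frame phoneme probabilities. Synthetic data only, not real speech."""
    n_frames = int(rng.integers(50, 120))  # utterances vary in length
    logits = rng.normal(size=(n_frames, n_phonemes))
    # Assumption for the toy data: each intent favours its own block of phonemes.
    block = n_phonemes // n_intents
    logits[:, intent * block:(intent + 1) * block] += 2.0
    # Softmax over phonemes so every frame sums to 1.
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)


def pool_posteriors(posteriors):
    """Collapse a variable-length (frames, phonemes) matrix into one
    fixed-length vector by concatenating the per-phoneme mean and max."""
    return np.concatenate([posteriors.mean(axis=0), posteriors.max(axis=0)])


def fit_centroids(features, labels):
    """Compute one mean feature vector per intent class."""
    classes = np.unique(labels)
    centroids = np.stack([features[labels == c].mean(axis=0) for c in classes])
    return classes, centroids


def predict(features, classes, centroids):
    """Assign each utterance to the intent with the nearest centroid."""
    dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]


def demo(seed=0, n_intents=4, per_intent=30):
    """Train and evaluate on synthetic utterances; returns test accuracy."""
    rng = np.random.default_rng(seed)
    labels = np.repeat(np.arange(n_intents), per_intent)
    feats = np.stack([
        pool_posteriors(fake_posteriors(rng, int(y), n_intents=n_intents))
        for y in labels
    ])
    order = rng.permutation(len(labels))
    split = int(0.8 * len(labels))  # 80/20 train/test split
    train, test = order[:split], order[split:]
    classes, centroids = fit_centroids(feats[train], labels[train])
    preds = predict(feats[test], classes, centroids)
    return float((preds == labels[test]).mean())


if __name__ == "__main__":
    print(f"toy accuracy: {demo():.2f}")
```

The accuracy printed by this sketch reflects only how separable the synthetic data is and says nothing about the reported results. The point of the design is that the classifier sees only language-independent phoneme probabilities, so a small labelled set in the target language could be enough to fit it.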