Joyshree Chakraborty, Priyankoo Sarmah, K. Samudravijaya
Published in: 2020 23rd Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA), 2020-11-05. DOI: 10.1109/O-COCOSDA50338.2020.9295008 (https://doi.org/10.1109/O-COCOSDA50338.2020.9295008)
Spoken Language Identification of Four Tibeto-Burman languages
Bodo, Dimasa, Rabha and Tiwa are languages of the Tibeto-Burman language family, spoken in north-east India and surrounding areas. Bodo is also one of the 22 official languages of the Government of India. Consequently, spoken language systems have been developed for Bodo. In contrast, similar systems for the other three languages are yet to be developed. Here, we present the details of an automatic Language Identification (LID) system that identifies the language of an input speech file without using phonetic information. The text-independent LID system was implemented using Gaussian mixture models (GMMs) with Mel-Frequency Cepstral Coefficients (MFCCs) as features. Three-fold cross-validation was used to assess the performance of the system. The accuracy of the LID system was highest when suprasegmental features were used in addition to segmental features. The best LID system, using a 62-dimensional feature vector consisting of 13 MFCCs and 49 shifted delta coefficients, yields 92.7% accuracy when the duration of the test data is 3 seconds.
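To make the feature layout and the decision rule concrete, here is a minimal sketch, not the authors' implementation. It assumes the widely used 7-1-3-7 shifted delta configuration (N=7 cepstra, delta spread d=1, block shift P=3, k=7 blocks), which gives 7 x 7 = 49 coefficients and is consistent with the 62 dimensions reported, although the abstract does not state these parameters. The function names, the edge-clamping rule, and the diagonal-covariance GMM representation are likewise illustrative assumptions.

```python
import math

def shifted_delta(cepstra, n=7, d=1, p=3, k=7):
    """Append k stacked delta blocks to every frame.

    cepstra: list of frames, each a list of floats (e.g. 13 MFCCs).
    The 7-1-3-7 defaults are an assumption, not taken from the paper.
    """
    last = len(cepstra) - 1

    def frame(t):
        # Clamp the index so edge frames reuse the nearest valid frame.
        return cepstra[min(max(t, 0), last)]

    features = []
    for t in range(len(cepstra)):
        sdc = []
        for i in range(k):
            ahead = frame(t + i * p + d)
            behind = frame(t + i * p - d)
            # Delta over the first n cepstral coefficients only.
            sdc.extend(a - b for a, b in zip(ahead[:n], behind[:n]))
        features.append(list(cepstra[t]) + sdc)
    return features

def log_gauss_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance."""
    return -0.5 * sum(
        math.log(2 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def gmm_loglik(frames, gmm):
    """Average per-frame log-likelihood; gmm is a list of (weight, mean, var)."""
    total = 0.0
    for x in frames:
        comps = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        top = max(comps)
        # Log-sum-exp over mixture components for numerical stability.
        total += top + math.log(sum(math.exp(c - top) for c in comps))
    return total / len(frames)

def identify(frames, models):
    """Return the language whose GMM scores the test frames highest."""
    return max(models, key=lambda lang: gmm_loglik(frames, models[lang]))
```

In use, one GMM per language would be trained on the 62-dimensional vectors from that language's training speech, and `identify` would be called on the frames of a test segment (3 seconds in the best reported condition). Training is omitted here; in practice it is usually done with expectation-maximisation from an existing toolkit.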