Identifying Low-Resource Languages in Speech Recordings through Deep Learning
Kleona Binjaku, Joan Janku, E. Meçe
2022 International Conference on Software, Telecommunications and Computer Networks (SoftCOM), 2022-09-22
DOI: https://doi.org/10.23919/softcom55329.2022.9911376
Citation count: 0
Abstract
This paper aims to build a system that identifies a low-resource language, such as Albanian, in speech recordings. The proposed system converts audio signals into spectrograms. We built two models that identify the spoken language from spectrogram images, one using an Artificial Neural Network (ANN) and one using a Convolutional Neural Network (CNN). We built the dataset of spoken Albanian audio signals manually. The results are reported for two languages, but the system also works when other languages are added. Both models showed a good ability to learn Albanian language patterns from spectrograms, achieving accuracies of 85% (ANN) and 94% (CNN), respectively. We also studied how the color and size of the spectrograms affect the performance of our models.
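The audio-to-spectrogram step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' preprocessing settings, so everything here is an assumption made for illustration: the frame length, hop size, Hann window, grayscale scaling, the function names `log_spectrogram` and `to_grayscale_image`, and the synthetic 1 kHz tone that stands in for a real recording.

```python
import numpy as np


def log_spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram, shape (frame_len // 2 + 1, n_frames).

    frame_len and hop are illustrative values, not the paper's settings.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (signal.size - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # One-sided FFT per frame, then convert magnitude to decibels.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return 20.0 * np.log10(magnitude + 1e-10).T


def to_grayscale_image(spec_db):
    """Scale a spectrogram to 8-bit grayscale pixels (0..255)."""
    lo, hi = spec_db.min(), spec_db.max()
    if hi > lo:
        scaled = (spec_db - lo) / (hi - lo)
    else:
        scaled = np.zeros_like(spec_db)
    return np.round(scaled * 255).astype(np.uint8)


# Synthetic stand-in for a speech recording: one second of a 1 kHz tone.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = log_spectrogram(tone)
image = to_grayscale_image(spec)
```

An array like `image` is the kind of input a spectrogram-based classifier could consume. Changing `frame_len` and `hop` changes the image size, and replacing the grayscale mapping with a color map changes its color, which are the two properties the abstract says were studied.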