RoSAC: A Speech Corpus for Transcribing Romanian Emergency Calls
Andrei-Marius Avram, M. Nichita, Razvan Bartusica, Madalin Mihai
2022 14th International Conference on Communications (COMM), published 2022-06-16
DOI: https://doi.org/10.1109/comm54429.2022.9817214
Citation count: 2
Abstract
Publicly available speech datasets for Romanian are still scarce and far from sufficient for obtaining state-of-the-art performance with modern deep neural networks. In response to this issue, while developing an internal automatic speech recognition system for the national emergency call center, we created the Romanian Speech Alert Corpus (RoSAC), a new Romanian speech corpus obtained by crowd-sourcing the reading of sentences within our institution. This paper describes the data acquisition process, several statistics about the resulting corpus, and the algorithm we employed for removing inadequate recordings.