Speech Recognition for Brazilian Portuguese using the Spoltech and OGI-22 Corpora
Patrick Silva, Nelson Neto, A. Klautau, Andre Gustavo Adami, I. Trancoso
Anais do XXVI Simpósio Brasileiro de Telecomunicações
DOI: 10.14209/sbrt.2008.42724 (https://doi.org/10.14209/sbrt.2008.42724)
Citation count: 6
Abstract
Speech processing is a data-driven technology that relies on public corpora and associated resources. In contrast to languages such as English, few resources exist for Brazilian Portuguese (BP). This work describes efforts toward narrowing this gap and presents systems for speech recognition in BP using two public corpora: Spoltech and OGI-22. The following resources are made available: ATK and HTK scripts, a pronunciation dictionary, and language and acoustic models. The work discusses the baseline results obtained with these resources.
Keywords: Speech recognition, Brazilian Portuguese, HMMs, pronunciation dictionary.