A study of automatic speech recognition in Portuguese by the Brazilian General Attorney of the Union
Rodrigo Fay Verqara, Paulo Henrique dos Santos, Guilherme Fay Verqara, Fábio L. L. Mendonça, C. E. L. Veiga, B. Praciano, Daniel Alves da Silva, Rafael Timóteo de Sousa Júnior
2022 IEEE International Conference on Data Mining Workshops (ICDMW), 2022-11-01
DOI: 10.1109/ICDMW58026.2022.00038 (https://doi.org/10.1109/ICDMW58026.2022.00038)
Citations: 1
Abstract
This article presents a study of an automatic speech recognition system in Portuguese applied to videos from the General Attorney of the Union of Brazil. Because the videos are confidential, proprietary software from large companies cannot be used for security reasons. Building an artificial intelligence model that performs automatic speech recognition in Portuguese in the judicial context, and making that model available for large-scale inference, is therefore critical to maintaining data security. For this purpose, a Brazilian Portuguese dataset was assembled by combining three existing datasets. The system used TDNN Jasper and QuartzNet architectures for network training and obtained promising preliminary results, with a word error rate (WER) of 56% without a language model.
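For readers unfamiliar with the metric, WER is conventionally defined as the word-level edit distance between the reference transcript and the system output, divided by the number of reference words: WER = (S + D + I) / N, where S, D, and I count substitutions, deletions, and insertions. The sketch below is a minimal illustration of that standard definition, not the authors' evaluation code; the function name `word_error_rate` and the example sentences are assumptions made here for illustration and do not come from the paper's dataset.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name); tokenization is a plain
    whitespace split, with no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far (minus the current one) and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one deletion ("foi") and one substitution
# ("ontem" -> "hoje") over five reference words.
example = word_error_rate("o réu foi citado ontem", "o réu citado hoje")
```

Under this definition, a WER of 56% means that the number of word-level edits needed to turn the system output into the reference is a little over half the number of reference words. Because insertions count as errors, WER can exceed 100%.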