Which French speech recognition system for assistant robots?

2022 2nd International Conference on Innovative Research in Applied Science, Engineering and Technology (IRASET) Pub Date : 2022-03-03 DOI:10.1109/IRASET52964.2022.9737976

Wiam Fadel, Imane Araf, T. Bouchentouf, Pierre-André Buvet, F. Bourzeix, Omar Bourja


Citations: 1

Abstract

Artificial intelligence-based speech recognition systems that can recognize French are already available. Still, comparing them to determine which one is effective for an assistant robot is time-consuming. This study aims to select the French-language speech recognition system with the lowest error in a real environment. In this paper, we review related work on how an Automatic Speech Recognition (ASR) system works, the models used by each of its components, several open-source French datasets, and the frequently used evaluation techniques. Next, we compare deep learning-based speech recognition APIs and pre-trained models for French on two different datasets using the Word Error Rate (WER) metric. The experimental results reveal that Google's Speech-to-Text API outperforms the other systems, namely the VOSK API, Wav2vec 2.0, QuartzNet, and SpeechBrain's Convolutional, Recurrent, and Fully-connected Networks (CRDNN) model.
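The abstract compares systems by Word Error Rate. As a minimal sketch of the standard definition, WER = (S + D + I) / N, where S, D, and I are the word substitutions, deletions, and insertions needed to turn the reference transcript into the system output, and N is the number of reference words. The function name `word_error_rate`, the lowercase-and-split normalization, and the French sample sentences are assumptions made for illustration; the abstract does not describe the authors' scoring code or text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # consumed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical robot command: one of five reference words is dropped.
wer = word_error_rate("allume la lumière du salon", "allume la lumière salon")
print(f"WER = {wer:.2f}")
```

Because insertions count as errors, WER can exceed 1.0 when the system output is much longer than the reference. Reported scores also depend on how punctuation, casing, and numerals are normalized before scoring.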
