Mohamed Lichouri, Khaled Lounnas, R. Djeradi, A. Djeradi
Title: Performance of End-to-End vs Pipeline Spoken Language Understanding Models on Multilingual Synthetic Voice
Published in: 2022 International Conference on Advanced Aspects of Software Engineering (ICAASE)
Publication date: 2022-09-17
DOI: 10.1109/ICAASE56196.2022.9931594
Citation count: 1
Abstract
This work compares two architectures for Spoken Language Understanding (SLU), evaluated on a synthesized speech corpus in three languages: Modern Standard Arabic (MSA), French, and English. The first architecture (E2E SLU) is a simple SLU system based on classical machine learning algorithms. The second architecture (Pipeline SLU) combines an automatic speech recognition (ASR) system with a text classification system: the textual output of the ASR is passed to a "Natural Language Understanding" (NLU) model. Evaluating both on the same corpus allows us to compare the predictions of the two systems. The results were encouraging: the Pipeline approach gave better results than the E2E approach.
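The structural difference between the two architectures can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`pipeline_slu`, `e2e_slu`, `accuracy`), the toy stand-in models, the label names, and the feature vectors are all hypothetical and chosen only to show how the data flows in each case.

```python
from typing import Callable, Sequence

Audio = Sequence[float]  # hypothetical stand-in for a waveform or feature vector


def pipeline_slu(audio: Audio,
                 asr: Callable[[Audio], str],
                 nlu: Callable[[str], str]) -> str:
    """Pipeline SLU: transcribe the audio, then classify the transcript."""
    transcript = asr(audio)
    return nlu(transcript)


def e2e_slu(audio: Audio, classifier: Callable[[Audio], str]) -> str:
    """E2E SLU: predict the label directly from the audio, with no transcript."""
    return classifier(audio)


def accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions that match the reference labels."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Toy stand-ins (assumptions for illustration only, not trained models).
def toy_asr(audio: Audio) -> str:
    return "turn on the light" if sum(audio) > 0 else "turn off the light"


def toy_nlu(text: str) -> str:
    return "light_on" if " on " in f" {text} " else "light_off"


def toy_e2e(audio: Audio) -> str:
    return "light_on" if audio[0] > 0 else "light_off"


if __name__ == "__main__":
    utterances = [[0.2, 0.1], [-0.3, -0.1]]  # made-up feature vectors
    labels = ["light_on", "light_off"]       # made-up reference labels
    pipe = [pipeline_slu(a, toy_asr, toy_nlu) for a in utterances]
    e2e = [e2e_slu(a, toy_e2e) for a in utterances]
    print("pipeline accuracy:", accuracy(pipe, labels))
    print("e2e accuracy:", accuracy(e2e, labels))
```

Because both functions return a label for the same input, their predictions can be scored against the same references. In a real experiment the toy callables would be replaced by trained ASR, NLU, and E2E models, and the scores here say nothing about the paper's reported results.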