The Effects of Model Capacity in Modelling Variability between Training and Testing Environments for Automatic Speech Recognition
Authors: Anwar Tantawy, D. O'Shaughnessy
Published in: 2022 IEEE Fifth International Conference on Artificial Intelligence and Knowledge Engineering (AIKE), 2022-09-01
DOI: https://doi.org/10.1109/AIKE55402.2022.00016
Automatic Speech Recognition (ASR) applications have grown substantially over the last decade, driven by new devices and home automation hardware that benefit from hands-free interaction, such as smart watches, earbuds, portable translators, and home assistants. ASR deployed in these applications inevitably suffers performance degradation in real-life scenarios. Most ASR systems assume that the operating environment resembles the training environment, which is often not the case, especially for new applications with limited data availability. This study shows experimentally how variations in the environment affect different ASR models, and how well different models can improve their performance when given training data similar to the testing environment. The experiments used training and testing datasets with varying levels of discrepancy between them. These tests can help researchers working on novel applications identify suitable models according to the variability they anticipate between the training data and the real-life application.