Integration of Different Optimization Approaches for the Classification of Different Arabic and English Voice Commands
Karim Dabbabi, Abdelkarim Mars
2023 IEEE International Conference on Advanced Systems and Emergent Technologies (IC_ASET), published 2023-04-29
DOI: 10.1109/IC_ASET58101.2023.10151158
Citations: 0
Abstract
Deep learning models are highly sensitive to several of their hyperparameters. Hyperparameter optimization approaches have therefore been proposed to accelerate convergence towards optimal configurations, to cope with the computation required by long training times, and thus to improve performance. This paper applies three such approaches to the optimization task: Bayesian Optimization (BO), Hyperband (HB), and the Tree-structured Parzen Estimator (TPE). Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN) models are used as classifiers, with Mel-frequency cepstral coefficients (MFCC) and Mel features as inputs. Experiments showed that the best results (Precision = 94.96%, Recall = 94.85%, F1 = 94.85%) were obtained by combining LSTM with MFCC (LSTM-MFCC) and BO on the English voice command database, outperforming the other combinations of features, classifiers, and optimization approaches. Moreover, there is evidence that BO converged faster than TPE and HB, and that it converged to better configurations.
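
To make the idea behind Hyperband more concrete, the following is a minimal sketch of successive halving, the subroutine that Hyperband runs repeatedly with different trade-offs between the number of configurations and the budget per configuration. It is an illustration only, not the authors' setup: the abstract does not describe the search space or the budgets used, so the toy objective `toy_validation_loss`, the single hyperparameter `lr`, and the values `min_budget=1` and `eta=3` are all hypothetical.

```python
import math
import random


def toy_validation_loss(config, budget):
    """Hypothetical stand-in for training a model for `budget` epochs.

    The loss approaches a configuration-dependent floor as the budget grows.
    This is NOT the paper's objective, only a deterministic toy.
    """
    # In this toy, the best learning rate is 1e-3 (distance in log10 space).
    floor = abs(math.log10(config["lr"]) + 3.0)
    return floor + 1.0 / budget


def successive_halving(configs, min_budget=1, eta=3):
    """Keep the best 1/eta of the configurations in each round and give the
    survivors eta times more budget, until a single configuration remains."""
    budget = min_budget
    survivors = list(configs)
    while len(survivors) > 1:
        # Evaluate every surviving configuration at the current budget.
        ranked = sorted(survivors, key=lambda c: toy_validation_loss(c, budget))
        keep = max(1, len(ranked) // eta)
        survivors = ranked[:keep]
        budget *= eta
    return survivors[0]


rng = random.Random(0)
# Sample 27 learning rates log-uniformly from [1e-5, 1e-1] (assumed range).
candidates = [{"lr": 10 ** rng.uniform(-5, -1)} for _ in range(27)]
best = successive_halving(candidates)
print(best)
```

With 27 candidates and `eta=3`, the rounds keep 9, then 3, then 1 configuration, so most of the budget goes to the configurations that looked best early on. BO and TPE differ in that they use the results of earlier trials to propose the next configuration instead of sampling all candidates up front.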