Speaker identification using convolutional neural network for clean and noisy speech samples
Ali Muayad Jalil, F. S. Hasan, H. Alabbasi
2019 First International Conference of Computer and Applied Sciences (CAS), 2019-12-01
DOI: 10.1109/CAS47993.2019.9075461 (https://doi.org/10.1109/CAS47993.2019.9075461)
Conventional speaker identification systems require carefully designed features to achieve high identification accuracy. With deep learning, these features are learned rather than designed by hand. Improvements in deep neural network algorithms and techniques have led to their growing use in speaker identification systems in place of conventional systems. In this paper, we use a convolutional neural network that takes a Mel-spectrogram as input to identify speakers. Experiments are conducted on the TIMIT dataset to evaluate the proposed CNN architecture and to compare it with state-of-the-art systems on clean and noisy speech samples.
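To illustrate the input representation named in the abstract, the following is a minimal sketch of a log Mel-spectrogram computed with only the Python standard library. The frame length, hop size, number of Mel bands, Hann window, and HTK-style Mel formula are all assumptions chosen for illustration; the abstract does not state the parameters the authors used, and a practical system would likely use an FFT library rather than the naive DFT shown here.

```python
import cmath
import math


def hz_to_mel(hz):
    # HTK-style Mel scale (an assumption; other variants exist).
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters as a list of n_mels rows with n_fft // 2 + 1 weights."""
    n_bins = n_fft // 2 + 1
    lo, hi = hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 edge frequencies, equally spaced on the Mel scale.
    edges = [mel_to_hz(lo + (hi - lo) * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(1, n_mels + 1):
        left, centre, right = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for f in bin_hz:
            rising = (f - left) / (centre - left)
            falling = (right - f) / (right - centre)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def power_spectrum(frame):
    """Naive O(n^2) DFT power for the non-negative frequency bins."""
    n = len(frame)
    out = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        out.append(abs(acc) ** 2)
    return out


def mel_spectrogram(signal, sample_rate, n_fft=64, hop=32, n_mels=8):
    """Return a list of frames, each a list of n_mels log-Mel energies."""
    # Periodic Hann window applied to every frame.
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * t / n_fft) for t in range(n_fft)]
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = [signal[start + t] * window[t] for t in range(n_fft)]
        power = power_spectrum(frame)
        mel = [sum(w * p for w, p in zip(row, power)) for row in bank]
        # Small offset avoids log(0) for empty bands.
        frames.append([math.log(e + 1e-10) for e in mel])
    return frames


# Usage: a synthetic 1 kHz tone sampled at 8 kHz stands in for a speech sample.
sr = 8000
tone = [math.sin(2.0 * math.pi * 1000.0 * t / sr) for t in range(256)]
spec = mel_spectrogram(tone, sr)
```

The resulting time-by-frequency grid is the kind of two-dimensional array a CNN can treat like an image; how the authors sized or normalized it is not specified in the abstract.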