Pathological Voice Classification Using Deep Learning
Authors: Shuvendu Roy, Md. Ijaj Sayim, M. Akhand
Published in: 2019 1st International Conference on Advances in Science, Engineering and Robotics Technology (ICASERT), pp. 1-6
Publication date: 2019-05-01
DOI: https://doi.org/10.1109/ICASERT.2019.8934514
Voice classification deals with sequential data. It is well known that this type of data is handled well by recurrent neural networks. In this work, we show that for longer sequences a convolutional neural network can give better accuracy, whereas a recurrent network suffers from the vanishing gradient problem even with a complex model such as Long Short-Term Memory (LSTM). To illustrate the method, we use the pathological voice detection task. Pathological voice is a disorder of the human voice caused by an internal defect in the throat, and it is very hard to detect. We experimented with low-dimensional features in order to compare the two models rather than to improve overall accuracy.