A CNN-Based Automated Stuttering Identification System
YashKiran Prabhu, Naeem Seliya
2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA), published 2022-12-01
DOI: https://doi.org/10.1109/ICMLA55696.2022.00247
Citations: 1
Abstract
Stuttering can affect quality of life, resulting in poor social, emotional, and mental health. It is diagnosed and managed by speech-language pathologists, who are scarce in developing countries. We propose a novel CNN-based Automated Stuttering Identification System (ASIS) that helps speech pathologists automatically diagnose, classify, and log fluency disorders (blocks, prolongations, sound repetitions, word repetitions, and interjections) and monitor patients' fluency progress over time. A baseline CNN model was created in TensorFlow/Keras and trained and tested on the SEP-28k dataset, an annotated stuttering database of 28,000 3-second clips. We built a separate model for each disfluency label and measured accuracy, precision, recall, and F1 measure. Each model was built five times, and each metric was averaged over the five runs. Three training-validation-test splits were used: 80-10-10, 70-20-10, and 60-20-20. The models performed very well on the public dataset, exceeding the accuracy and F1 measure of other classifiers. The proposed ASIS can help speech pathologists greatly improve the quality of life of people who stutter, especially in developing countries, and could thus make a significant difference for millions around the world.
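The evaluation protocol described in the abstract (three split ratios, four metrics per binary disfluency label, averages over five builds) can be sketched in plain Python. This is only an illustration under stated assumptions: the abstract does not say how clips were shuffled or assigned to partitions, and the function names `split_indices`, `binary_metrics`, and `average_runs` are hypothetical, not taken from the paper. No CNN is trained here; the sketch covers only the bookkeeping around a model.

```python
import random
from statistics import mean

# The three train/validation/test ratios named in the abstract.
SPLITS = {
    "80-10-10": (0.8, 0.1, 0.1),
    "70-20-10": (0.7, 0.2, 0.1),
    "60-20-20": (0.6, 0.2, 0.2),
}


def split_indices(n_items, ratios, seed=0):
    """Shuffle range(n_items) and cut it into train/validation/test lists.

    Assumption: a simple seeded random shuffle; the paper may have split
    the clips differently (for example, by podcast or by speaker).
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_train = round(n_items * ratios[0])
    n_val = round(n_items * ratios[1])
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains is the test set
    return train, val, test


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for one binary disfluency label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    # Guard against empty denominators when a class is never predicted.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def average_runs(runs):
    """Average each metric over repeated builds (five in the abstract)."""
    return {name: mean(run[name] for run in runs) for name in runs[0]}
```

In use, one would call `split_indices` once per ratio, train a per-label model on the training indices, collect one `binary_metrics` dictionary per build, and pass the five dictionaries to `average_runs`.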