Evaluative comparison of machine learning algorithms for stutter detection and classification
Ramitha V, Rhea Chainani, Saharsh Mehrotra, Sakshi Sah, Smita Mahajan
MethodsX, volume 13, Article 103050, published 2024-11-14. DOI: 10.1016/j.mex.2024.103050
https://www.sciencedirect.com/science/article/pii/S2215016124005016
Citations: 0
Abstract
Stuttering is a neurodevelopmental speech disorder that interrupts the flow of speech through involuntary pauses and sound repetitions. It has profound psychological impacts that affect social interactions and professional advancement. Automatically detecting stuttering events in speech recordings could help speech therapists and speech pathologists track the fluency of people who stutter (PWS). It could also help improve existing speech recognition systems for PWS. In this paper, the SEP-28k dataset is used to compare the performance of various machine learning models in classifying five dysfluency types: Prolongation, Interjection, Word Repetition, Sound Repetition, and Blocks.
• The study focuses on automatically detecting stuttering events in speech recordings to support speech therapists and improve speech recognition systems for people who stutter (PWS).
• The SEP-28k dataset is used to perform a comparative analysis of different machine learning models.
• The research examines the impact of key acoustic features on model accuracy while addressing challenges such as class imbalance.
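The abstract does not say how class imbalance was handled or which metric was used to compare models. As an illustration only, the sketch below shows one common approach: weighting each class inversely to its frequency, and scoring predictions with macro-averaged F1 so that rare dysfluency types count as much as frequent ones. The function names and the toy labels are hypothetical and are not taken from the paper or from SEP-28k.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    This is a widely used heuristic; it is an assumption here,
    not the weighting scheme reported by the authors.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * counts[c]) for c in counts}


def accuracy(y_true, y_pred):
    """Fraction of clips whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of per-class F1, so every class counts equally."""
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Made-up toy labels for three of the five dysfluency types (not real data).
classes = ["Interjection", "Block", "Prolongation"]
y_true = ["Interjection"] * 6 + ["Block"] * 2 + ["Prolongation"] * 2

# A degenerate "model" that always predicts the majority class.
y_majority = ["Interjection"] * 10

weights = balanced_class_weights(y_true)
acc = accuracy(y_true, y_majority)
f1 = macro_f1(y_true, y_majority, classes)
```

On these toy labels the majority-class predictor reaches an accuracy of 0.6 but a macro F1 of only 0.25, which is why accuracy alone can be misleading when one dysfluency type dominates the data.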