Recognition and Classification of Pauses in Stuttered Speech Using Acoustic Features

Fathima Afroz, S. Koolagudi
Published in: 2019 6th International Conference on Signal Processing and Integrated Networks (SPIN)
Publication date: 2019-03-07
DOI: 10.1109/SPIN.2019.8711569 (https://doi.org/10.1109/SPIN.2019.8711569)
Citations: 4

Abstract

Pauses play an essential role in speech. Normally they help the listener by creating time and space to decode and interpret the speaker's message. In stuttering, however, pauses disturb the normal flow of speech. The uncontrolled, frequent, and unplanned occurrence of pauses leads to a slow speaking rate, results in broken words, and increases the severity of stuttering. Hence pauses and stuttering are closely related. Pauses are considered one of the important patterns in the diagnosis and treatment of stuttering. In this work, an attempt has been made to identify inaudible (silent or unfilled) pauses in stuttered speech. Attributes such as the duration, frequency, position, and distribution of pauses during speech tasks are measured and quantified. The UCLASS stuttered speech corpus is used for the analysis. An automatic blind segmentation approach is adopted to segment the speech signal into voiced and unvoiced regions using a dynamic threshold based on energy and zero crossing rate (ZCR). Fourth formant frequencies are analysed to identify intra-morphic (unfilled) pauses present within voiced regions. The durations of intra-morphic pauses are analysed for stuttered and normal speech. It is observed that normal intra-morphic pauses last 150 ms to 250 ms, inter-morphic pauses are <= 250 ms, and short pauses last 50 ms to 150 ms. In stuttered speech, by contrast, short intra-morphic pauses range from 10 ms to 50 ms and long pauses range from 250 ms to 1 or 2 seconds. Segmentation of the intra-morphic pauses is observed to achieve an accuracy of 98%. The results are compared with and validated against a manual method.
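
The energy and ZCR segmentation and the duration ranges described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the frame length, the rule that sets the energy threshold to a fraction of the mean frame energy, the fixed ZCR ceiling, and all function names are assumptions made for illustration, and the fourth-formant analysis is not covered. The duration ranges quoted in the abstract share their end points (50, 150, and 250 ms), so the tie-breaking below is also an assumption.

```python
import math

def frame_features(signal, frame_len):
    """Return (short-time energy, zero crossing rate) for each non-overlapping frame."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        crossings = sum((a >= 0) != (b >= 0) for a, b in zip(frame, frame[1:]))
        feats.append((energy, crossings / (frame_len - 1)))
    return feats

def find_pauses(signal, sample_rate, frame_ms=10, energy_ratio=0.1, zcr_max=0.25):
    """Return (start_ms, duration_ms) for each run of frames not labelled voiced.

    Assumed rule: a frame is voiced when its energy exceeds a threshold derived
    from the recording itself (energy_ratio * mean frame energy) and its ZCR is
    at most zcr_max. Both parameters are hypothetical, not taken from the paper.
    """
    frame_len = sample_rate * frame_ms // 1000
    feats = frame_features(signal, frame_len)
    if not feats:
        return []
    energy_floor = energy_ratio * sum(e for e, _ in feats) / len(feats)
    voiced = [e > energy_floor and z <= zcr_max for e, z in feats]
    pauses, run_start = [], None
    # A trailing True acts as a sentinel so a pause at the end is still closed.
    for i, is_voiced in enumerate(voiced + [True]):
        if not is_voiced and run_start is None:
            run_start = i
        elif is_voiced and run_start is not None:
            pauses.append((run_start * frame_ms, (i - run_start) * frame_ms))
            run_start = None
    return pauses

def classify_pause(duration_ms):
    """Map a duration to the ranges quoted in the abstract (boundary handling assumed)."""
    if 10 <= duration_ms < 50:
        return "short (stuttered range)"
    if 50 <= duration_ms < 150:
        return "short (normal range)"
    if 150 <= duration_ms <= 250:
        return "normal intra-morphic"
    if 250 < duration_ms <= 2000:
        return "long (stuttered range)"
    return "outside reported ranges"

def tone(n_samples, freq=200, sample_rate=8000):
    """Synthetic stand-in for a voiced segment: a pure sine wave."""
    return [math.sin(2 * math.pi * freq * n / sample_rate) for n in range(n_samples)]

# 200 ms of tone, 300 ms of silence, 200 ms of tone at 8 kHz.
demo = tone(1600) + [0.0] * 2400 + tone(1600)
for start_ms, duration_ms in find_pauses(demo, 8000):
    print(start_ms, duration_ms, classify_pause(duration_ms))  # 200 300 long (stuttered range)
```

On real recordings the thresholds would need tuning against manually labelled pauses, as in the validation step the abstract mentions.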