Neural Network based Speech Assistance tool to enhance the fluency of adults who stutter

Sharan Narasimhan, R. Rao
Published in: 2019 IEEE International Conference on Distributed Computing, VLSI, Electrical Circuits and Robotics (DISCOVER), 2019-08-01
DOI: 10.1109/DISCOVER47552.2019.9008034 (https://doi.org/10.1109/DISCOVER47552.2019.9008034)
Citations: 2

Abstract

Millions of adults stutter (stammer). The authors propose a speech assistance tool that helps speakers who stutter achieve higher fluency and a slower rate of speech. Fluency is achieved by adhering to the proposed fluency enhancing technique (FET), which is inspired by fluency shaping methods and requires the speaker to use a rhythmic method called gentle onset on words, together with a slower rate of speech. In training mode, the tool trains an artificial neural network to distinguish the speaker's FET-based words from non-FET (normal) words. Audio features are represented using Mel-Frequency Cepstral Coefficients (MFCC), which capture the prosody of the spoken words. In real-life conversation mode, the speaker gets visual cues that help ensure adherence to the FET. The tool also performs disfluency analysis and gives the user feedback in terms of the FET words ratio, the disfluency score per hundred words, and the speech rate. It also logs disfluencies periodically to help the speaker track fluency over time. Dynamic time warping (DTW) analysis of the MFCC features shows a clear difference in prosody between FET and non-FET words. While using the proposed FET-based tool, the speaker's fluency increases and a slower speech rate is achieved. The tool can be used along with Cognitive Behavior Therapy to help rehabilitate adults who stutter.
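
The abstract names two computations without giving their details: a DTW comparison of MFCC sequences and per-session feedback figures. The sketch below illustrates both under stated assumptions and is not the authors' implementation. `dtw_distance` is the textbook dynamic time warping recurrence with Euclidean frame distance, applied to any list of feature vectors (MFCC frames would be one such input, and extracting them is out of scope here). `session_feedback` is a hypothetical helper whose formulas (FET words divided by total words, disfluencies per 100 words, words per minute) are assumed readings of the metrics the abstract lists, since the abstract does not define them.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of equal-width feature vectors.

    Textbook recurrence; the paper's exact DTW variant is not specified.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    # cost[i][j] is the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            frame_dist = math.dist(a[i - 1], b[j - 1])  # Euclidean distance
            cost[i][j] = frame_dist + min(
                cost[i - 1][j],      # a advances, b repeats a frame
                cost[i][j - 1],      # b advances, a repeats a frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def session_feedback(word_labels, disfluency_count, duration_seconds):
    """Hypothetical feedback summary; all three formulas are assumptions.

    word_labels: one label per spoken word, "FET" or anything else for normal.
    """
    total = len(word_labels)
    if total == 0 or duration_seconds <= 0:
        raise ValueError("need at least one word and a positive duration")
    fet_words = sum(1 for label in word_labels if label == "FET")
    return {
        "fet_word_ratio": fet_words / total,
        "disfluencies_per_100_words": 100.0 * disfluency_count / total,
        "words_per_minute": total / (duration_seconds / 60.0),
    }
```

Because DTW allows frames to repeat, a sequence and a time-stretched copy of it align at zero cost, which is why it suits comparing a slowly spoken word with a normally spoken one: any remaining cost reflects differences in the features themselves, not in duration alone.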