Feasibility Study of Parkinson's Speech Disorder Evaluation With Pre-Trained Deep Learning Model for Speech-to-Text Analysis.

Q3 Medicine
Korean Journal of Neurotrauma. Pub Date: 2024-09-23. eCollection Date: 2024-09-01. DOI: 10.13004/kjnt.2024.20.e30
Kwang Hyeon Kim, Byung-Jou Lee, Hae-Won Koo
Volume 20, issue 3, pages 168-179. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11450341/pdf/

Abstract

Objective: This study investigates the feasibility of using a pre-trained deep learning Wav2Vec (wave-to-vec) model for speech-to-text analysis in individuals with speech disorders arising from Parkinson's disease (PD).

Methods: A publicly available dataset of speech recordings from healthy controls (HC) and individuals diagnosed with PD was used. The dataset also includes Hoehn and Yahr (H&Y) staging, Movement Disorder Society Unified Parkinson's Disease Rating Scale (UPDRS) Part I and Part II scores, and gender information. Speech-to-text analysis was performed on the PD patient data with the Wav2Vec model. The tasks included word letter classification, word match probability assessment, and analysis of speech waveform characteristics as provided by the model's output.
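To make the letter classification step concrete, the following is a minimal sketch of greedy CTC decoding, the scheme commonly used to turn per-frame letter scores from Wav2Vec 2.0 speech recognition models into text. The abstract does not state which decoding scheme the authors used, so this is an assumption; the vocabulary `VOCAB`, the blank index, and the function name `greedy_ctc_decode` are hypothetical and chosen for illustration only.

```python
from typing import List, Sequence

# Hypothetical character vocabulary (an assumption, not taken from the paper):
# index 0 is the CTC blank symbol and "|" marks a word boundary.
VOCAB: List[str] = ["<blank>", "|", "a", "c", "t"]


def greedy_ctc_decode(
    frame_scores: Sequence[Sequence[float]],
    vocab: Sequence[str] = VOCAB,
    blank: int = 0,
) -> str:
    """Decode per-frame letter scores into text with greedy CTC rules."""
    # 1. Classify each audio frame as its highest-scoring letter.
    best = [max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores]

    # 2. Collapse consecutive repeats, then drop blank symbols.
    chars: List[str] = []
    previous = None
    for index in best:
        if index != previous and index != blank:
            chars.append(vocab[index])
        previous = index

    # 3. Turn word-boundary markers into spaces.
    return "".join(chars).replace("|", " ").strip()


def one_hot(index: int, size: int = len(VOCAB)) -> List[float]:
    """Toy frame score vector that puts all mass on one letter."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# Toy frames spelling "c c <blank> a t t | a".
frames = [one_hot(i) for i in (3, 3, 0, 2, 4, 4, 1, 2)]
decoded = greedy_ctc_decode(frames)
```

In this toy input the repeated frames collapse to a single letter each, while a blank between two identical letters keeps them distinct, which is why the blank symbol is needed.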

Results: In the dataset of 20 cases, individuals with PD had a mean H&Y score of 2.50±0.67, a mean UPDRS II-part 5 score of 0.70±1.00, and a mean UPDRS III-part 18 score of 0.80±0.98. The number of words in the text decoded by speech recognition averaged 299.10±16.79 in the HC group and 259.80±93.39 in the PD group. The degree of agreement was calculated for all syllables based on the speech process. The accuracy for the reading sentences was 0.31 and 0.10, respectively.
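The quantities reported here (decoded word counts, agreement with the reading text, and mean±SD summaries) could be computed along the following lines. This is a sketch under stated assumptions: the abstract does not define its accuracy measure, so `word_match_accuracy` uses an in-order word alignment as one plausible choice, and `mean_sd` uses the sample standard deviation. All function names and example sentences are hypothetical.

```python
import difflib
import statistics
from typing import Sequence, Tuple


def word_count(text: str) -> int:
    """Number of whitespace-separated words in a decoded transcript."""
    return len(text.split())


def word_match_accuracy(reference: str, decoded: str) -> float:
    """Fraction of reference words matched, in order, in the decoded text.

    Assumption: the paper's exact accuracy definition is not given in the
    abstract; this in-order alignment is only an illustration.
    """
    ref_words = reference.lower().split()
    hyp_words = decoded.lower().split()
    if not ref_words:
        return 0.0
    matcher = difflib.SequenceMatcher(None, ref_words, hyp_words)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref_words)


def mean_sd(values: Sequence[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation, as in a 'mean±SD' summary."""
    return statistics.mean(values), statistics.stdev(values)


# Toy example with made-up sentences (not data from the study).
accuracy = word_match_accuracy("the quick brown fox", "the quick fox")
mean, sd = mean_sd([1, 2, 3])
```

Applied per recording, `word_count` and `word_match_accuracy` give one value per speaker, and `mean_sd` then summarizes each group.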

Conclusion: This study aimed to demonstrate the effectiveness of Wav2Vec in enhancing speech-to-text analysis for patients with speech disorders. The findings could pave the way for clinical tools that improve diagnosis, evaluation, and communication support for this population.
