Stammering Speech Signal Segmentation and Classification Using Machine Learning

International Journal of Advanced Research in Science, Communication and Technology. Pub Date: 2024-05-24. DOI: 10.48175/ijarsct-18411

V. Naveen, Dr. S. Nagasundaram



Abstract



Stuttering, or stammering, is a speech disorder in which sounds, syllables, or words are repeated or prolonged, disrupting the normal flow of speech. It can make it hard to communicate with other people, which often affects a person's quality of life. An Automatic Speech Recognition (ASR) system converts an audio speech signal into the corresponding text. ASR systems now play a major role in controlling, or providing input to, a wide range of applications. Both ASR systems and machine translation applications suffer considerably from stuttering (speech dysfluency). Dysfluencies reduce the word recognition accuracy of an ASR system by increasing word insertion, substitution, and deletion rates. This work focuses on detecting and removing prolongations, silent pauses, and repetitions so that a proper text sequence can be generated for a given stuttered speech signal. Stuttered speech recognition consists of two stages: classification using an artificial neural network (ANN) and testing in an ASR system. The major phases of the classification system are resampling, segmentation, pre-emphasis, epoch extraction, and classification. The work was carried out on the UCLASS stuttering dataset using MATLAB, with the ANN yielding a 4% to 6% increase in accuracy.
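
The pre-emphasis, segmentation, and silent-pause steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: their work was done in MATLAB with an ANN classifier, and neither epoch extraction nor the ANN is shown here. The function names, the pre-emphasis coefficient of 0.97, the fixed-length non-overlapping frames, and the energy threshold are all assumptions chosen for illustration, and a simple energy rule stands in for the learned classifier.

```python
import math


def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass filter: y[n] = x[n] - alpha * x[n-1].

    alpha=0.97 is a common textbook value, assumed here.
    """
    if not signal:
        return []
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(signal[n] - alpha * signal[n - 1])
    return out


def frame_signal(signal, frame_len, hop):
    """Split the signal into fixed-length frames; a trailing partial frame is dropped."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def find_silent_pauses(energies, threshold, min_frames):
    """Return (start, end) frame ranges, end exclusive, of low-energy runs.

    A run counts as a pause only if it spans at least min_frames frames.
    """
    pauses = []
    start = None
    for i, energy in enumerate(energies):
        if energy < threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_frames:
                pauses.append((start, i))
            start = None
    if start is not None and len(energies) - start >= min_frames:
        pauses.append((start, len(energies)))
    return pauses


# Synthetic test signal (hypothetical): 0.1 s tone, 0.1 s silence, 0.1 s tone at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(800)]
silence = [0.0] * 800
signal = tone + silence + tone

emphasized = pre_emphasis(signal)
frames = frame_signal(emphasized, frame_len=80, hop=80)  # 10 ms frames, no overlap
energies = [frame_energy(f) for f in frames]
pauses = find_silent_pauses(energies, threshold=1e-3, min_frames=5)
print(pauses)  # [(10, 20)], i.e. frames covering 0.1 s to 0.2 s
```

On real recordings the threshold would have to be tuned to the noise floor, and detecting prolongations and repetitions needs features beyond frame energy, which is where a trained classifier such as the ANN described above comes in.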
