Machine Learning Classifiers for Speech Detection

2022 IEEE VLSI Device Circuit and System (VLSI DCS) Pub Date : 2022-02-26 DOI:10.1109/VLSIDCS53788.2022.9811452

Dasari Lakshmi Prasanna, S. Tripathi


Citations: 3

Abstract

Human-machine interaction is becoming ubiquitous as audio, natural language processing, and machine vision technologies evolve toward artificial intelligence (AI). AI-based speech detection can be used in voice-driven devices and systems, in automatic speech recognition for security purposes, and in detecting specific sounds, such as instrumental or animal sounds, in audio. This paper discusses several classifiers for detecting speech in an audio signal and extracting the data through modules. An input audio signal of 3 s duration and ~60 kb in size is given to the classifiers, and the performance metrics of the machine learning classifiers (MLCs) for extracting speech from the audio signal are compared. The stochastic gradient descent (SGD) classifier achieves the highest speech detection accuracy among the classifiers, 93%. Specificity, sensitivity, and F1 scores are also calculated for speech detection. The receiver operating characteristic (ROC) of each machine learning classifier is calculated and compared. MATLAB is used to calculate and analyze the performance metrics for detection from the audio signal.
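
The abstract names accuracy, sensitivity, specificity, and F1 score without defining them. The sketch below shows the standard definitions for a binary speech / non-speech detector, computed from confusion-matrix counts. It is written in Python for illustration and is not the authors' MATLAB code; the names `ConfusionCounts`, `count_outcomes`, and `binary_metrics` are hypothetical, and the labels in the example are made-up toy data, not results from the paper.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # speech frames correctly detected as speech
    fp: int  # non-speech frames wrongly detected as speech
    tn: int  # non-speech frames correctly rejected
    fn: int  # speech frames that were missed


def count_outcomes(y_true, y_pred):
    """Tally confusion-matrix counts for labels where 1 = speech, 0 = non-speech."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _safe_div(numerator, denominator):
    # Return 0.0 instead of raising when a class is absent from the data.
    return numerator / denominator if denominator else 0.0


def binary_metrics(c):
    """Compute the four metrics named in the abstract from confusion counts."""
    total = c.tp + c.tn + c.fp + c.fn
    sensitivity = _safe_div(c.tp, c.tp + c.fn)   # true positive rate (recall)
    specificity = _safe_div(c.tn, c.tn + c.fp)   # true negative rate
    precision = _safe_div(c.tp, c.tp + c.fp)
    return {
        "accuracy": _safe_div(c.tp + c.tn, total),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": _safe_div(2 * precision * sensitivity, precision + sensitivity),
    }


if __name__ == "__main__":
    # Toy labels for illustration only.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    print(binary_metrics(count_outcomes(y_true, y_pred)))
```

An ROC curve extends the same idea: sensitivity is plotted against 1 − specificity as the classifier's decision threshold is swept, so each point on the curve corresponds to one set of confusion counts like those above.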

