Machine Learning Classifiers for Speech Detection

Dasari Lakshmi Prasanna, S. Tripathi
Published in: 2022 IEEE VLSI Device Circuit and System (VLSI DCS)
Publication date: 2022-02-26
DOI: 10.1109/VLSIDCS53788.2022.9811452
URL: https://doi.org/10.1109/VLSIDCS53788.2022.9811452
Citations: 3

Abstract

Human-machine interaction is becoming pervasive as audio, natural language processing, and machine vision technologies evolve for artificial intelligence (AI). AI-based speech detection can be used in voice-driven devices and systems, in automatic speech recognition for security purposes, and in detecting specific sounds, such as instrumental or animal sounds, in audio. This paper discusses various classifiers for detecting speech in an audio signal and extracting the data through modules. An input audio signal of 3 s and ~60 kb in size is given to the classifiers, and the performance metrics of the machine learning classifiers (MLC) are compared for extracting speech from the audio signal. Stochastic gradient descent (SGD) achieves the best speech detection accuracy among the classifiers, at 93%. Specificity, sensitivity, and F1 scores were also calculated for speech detection. The receiver operating characteristic (ROC) of each machine learning classifier was calculated and compared. MATLAB is used to calculate and analyze the performance metrics for detection from the audio signal.
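The metrics named in the abstract (accuracy, sensitivity, specificity, F1 score, and the area under the ROC curve) can be illustrated with a minimal sketch. The authors report using MATLAB; the Python below is only an illustration, not their implementation. The function names (`confusion_counts`, `detection_metrics`, `roc_auc`), the label convention (1 = speech, 0 = non-speech), and the sample labels and scores are all assumptions made for this example and do not come from the paper.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), assuming 1 = speech and 0 = non-speech."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def _ratio(numerator: float, denominator: float) -> float:
    """Divide safely, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def detection_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, sensitivity, specificity, and F1 from hard predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = _ratio(tp, tp + fp)
    sensitivity = _ratio(tp, tp + fn)  # true positive rate (recall)
    specificity = _ratio(tn, tn + fp)  # true negative rate
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": _ratio(2 * precision * sensitivity, precision + sensitivity),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank statistic.

    Equals the fraction of (positive, negative) pairs in which the positive
    sample receives the higher score; tied pairs count as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and classifier scores, invented for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.6]

metrics = detection_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Comparing classifiers, as the paper describes, would amount to computing these values once per classifier on the same test clips. The pairwise AUC computation is quadratic in the number of samples, which is acceptable for a sketch but not for large evaluation sets.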