Machine Learning Classifiers for Speech Detection
Authors: Dasari Lakshmi Prasanna, S. Tripathi
Published in: 2022 IEEE VLSI Device Circuit and System (VLSI DCS)
Publication date: 2022-02-26
DOI: 10.1109/VLSIDCS53788.2022.9811452 (https://doi.org/10.1109/VLSIDCS53788.2022.9811452)
Citations: 3
Abstract
Human-machine interaction is everywhere as technologies for audio, natural language processing, and machine vision evolve for artificial intelligence (AI). AI-based speech detection can be used in voice-driven devices and systems, in automatic speech recognition for security purposes, and for detecting specific sounds, such as instrumental or animal sounds, in audio. This paper discusses various classifiers for detecting speech in an audio signal and extracting the data through modules. An input audio signal of 3 s duration and ~60 kb in size is given to the classifiers, and the performance metrics of the machine learning classifiers (MLC) are compared for extracting speech from the audio signal. Stochastic gradient descent (SGD) gives better speech-detection accuracy than the other classifiers, at 93%. Specificity, sensitivity, and F1 score were also calculated for speech detection. The receiver operating characteristic (ROC) of each machine learning classifier was calculated and compared. MATLAB is used to calculate and analyze the performance metrics for detection from the audio signal.
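The metrics named in the abstract (accuracy, sensitivity, specificity, F1 score, and ROC) can be illustrated with a minimal sketch. The paper reports using MATLAB; the Python below is only an illustration under stated assumptions, not the authors' implementation. The function names `binary_metrics`, `roc_points`, and `auc` are hypothetical, the label 1 is assumed to mean "speech", and the data in the demo is a made-up toy example rather than the paper's audio.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float  # true positive rate (recall)
    specificity: float  # true negative rate
    f1: float


def _ratio(num, den):
    """Safe division: returns 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(y_true, y_pred):
    """Compute metrics from 0/1 labels, assuming 1 means 'speech'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    precision = _ratio(tp, tp + fp)
    return BinaryMetrics(
        accuracy=_ratio(tp + tn, tp + tn + fp + fn),
        sensitivity=sensitivity,
        specificity=specificity,
        f1=_ratio(2 * precision * sensitivity, precision + sensitivity),
    )


def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs obtained by sweeping the decision threshold."""
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((_ratio(fp, neg), _ratio(tp, pos)))
    return points


def auc(points):
    """Area under the ROC curve using the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


if __name__ == "__main__":
    # Toy labels and classifier scores (invented for illustration only).
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.6]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]
    print(binary_metrics(y_true, y_pred))
    print("AUC:", auc(roc_points(y_true, scores)))
```

Comparing classifiers in this way amounts to running each one on the same labelled clips, then computing these quantities from its predictions and scores; the 0.5 threshold in the demo is an arbitrary choice for the sketch.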