Speech/Audio Signal Classification Using Spectral Flux Pattern Recognition

Sangkil Lee, Jieun Kim, Insung Lee
Published in: 2012 IEEE Workshop on Signal Processing Systems, 2012-10-17. DOI: 10.1109/SiPS.2012.36 (https://doi.org/10.1109/SiPS.2012.36)
Citations: 7

Abstract

This paper presents a method for improving speech/audio signal classification in the MPEG Unified Speech and Audio Coding (USAC) standard using spectral flux (SF) pattern recognition. A Gaussian mixture model (GMM) serves as the probability model for pattern recognition, and its parameters are estimated with the expectation-maximization (EM) algorithm. The proposed classification algorithm has two parts: the first extracts the optimal GMM parameters, and the second distinguishes speech from audio signals using SF pattern recognition. The proposed algorithm performs better than the classification scheme conventionally implemented in USAC.
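The two-part structure described in the abstract can be illustrated with a small sketch. This is not the authors' implementation and not the USAC classifier: the abstract gives no formulas, so the code assumes one common SF definition (Euclidean distance between unit-normalized magnitude spectra of consecutive frames), a one-dimensional two-component GMM per class, and a maximum-likelihood decision. All function names and the synthetic test signals are hypothetical.

```python
import cmath
import math
import random


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (adequate for short demo frames)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


def spectral_flux(signal, frame_len=64):
    """Assumed SF definition: Euclidean distance between unit-normalized
    magnitude spectra of consecutive non-overlapping frames."""
    prev, flux = None, []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        mag = magnitude_spectrum(signal[start:start + frame_len])
        norm = math.sqrt(sum(m * m for m in mag)) or 1.0
        mag = [m / norm for m in mag]
        if prev is not None:
            flux.append(math.sqrt(sum((a - b) ** 2 for a, b in zip(mag, prev))))
        prev = mag
    return flux


def gauss_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_em(data, k=2, iters=50, var_floor=1e-6):
    """Part 1: fit a 1-D GMM with EM. Returns a list of (weight, mean, variance)."""
    lo, hi, n = min(data), max(data), len(data)
    means = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]  # deterministic init
    mu = sum(data) / n
    variances = [max(sum((x - mu) ** 2 for x in data) / n, var_floor)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            p = [w * gauss_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means, and (floored) variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, var_floor)
    return list(zip(weights, means, variances))


def avg_log_likelihood(data, gmm):
    return sum(math.log(sum(w * gauss_pdf(x, m, v) for w, m, v in gmm) + 1e-300)
               for x in data) / len(data)


def classify(flux, speech_gmm, audio_gmm):
    """Part 2: pick the class whose GMM explains the SF pattern better."""
    if avg_log_likelihood(flux, speech_gmm) > avg_log_likelihood(flux, audio_gmm):
        return "speech"
    return "audio"


def toy_signal(rng, n_frames, steady, frame_len=64):
    """Synthetic stand-in, not real speech or music: one noisy tone per frame.
    steady=True holds the pitch; steady=False jumps to a new DFT bin each frame."""
    samples, bin_k = [], 5
    for _ in range(n_frames):
        if not steady:
            bin_k = rng.choice([b for b in range(2, 21) if b != bin_k])
        samples += [math.sin(2 * math.pi * bin_k * t / frame_len) + rng.gauss(0, 0.05)
                    for t in range(frame_len)]
    return samples


rng = random.Random(0)
# The rapidly changing toy signal plays the role of "speech", the steady one "audio".
speech_gmm = fit_gmm_em(spectral_flux(toy_signal(rng, 40, steady=False)))
audio_gmm = fit_gmm_em(spectral_flux(toy_signal(rng, 40, steady=True)))
for steady in (False, True):
    flux = spectral_flux(toy_signal(rng, 20, steady))
    print("steady" if steady else "hopping", "->", classify(flux, speech_gmm, audio_gmm))
```

The toy signals are chosen only so that the two classes have clearly different flux statistics. A practical system would train on labeled recordings, use an FFT instead of the naive DFT, and might model richer SF features than a single scalar per frame pair.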