使用进化语音/音乐辨别改进音频编码

J. E. M. Expósito, S. G. Galán, Nicolas Ruiz Reyes, P. V. Candeas
{"title":"使用进化语音/音乐辨别改进音频编码","authors":"J. E. M. Expósito, S. G. Galán, Nicolas Ruiz Reyes, P. V. Candeas","doi":"10.1109/FUZZY.2007.4295472","DOIUrl":null,"url":null,"abstract":"Automatic speech/music discrimination is an important tool used in many multimedia applications, becoming a research topic of interest in the last years. This paper presents our last works in the speech/music discrimination field, aiming to improve the coding efficiency of standard audio coders (i.e. MP3, AAC) when speech and music signals are involved. In order to discriminate between speech and music, a fuzzy rules-based expert system is incorporated into the decision-taking stage of traditional speech/music discrimination systems. The knowledge base of the fuzzy expert system has been obtained by means of a typical genetic learning algorithm (the Pittsburgh algorithm). The proposed speech/music discrimination scheme manages the operation of an intelligent audio coder, which selects a GSM coder for speech frames and an AAC coder for music ones, resulting in a lower bit rate regarding the case of using a standardized audio coder (AAC in this work). Further, the intelligent audio coder has been designed aiming to obtain a similar subjective audio quality than AAC. GSM operates at 13 kbits/s, while in the experiments the bit rate specification for AAC has been 32 kbits/s for one-channel audio signals.","PeriodicalId":236515,"journal":{"name":"2007 IEEE International Fuzzy Systems Conference","volume":"30 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2007-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"16","resultStr":"{\"title\":\"Audio Coding Improvement Using Evolutionary Speech/Music Discrimination\",\"authors\":\"J. E. M. Expósito, S. G. Galán, Nicolas Ruiz Reyes, P. V. Candeas\",\"doi\":\"10.1109/FUZZY.2007.4295472\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Automatic speech/music discrimination is an important tool used in many multimedia applications, becoming a research topic of interest in the last years. This paper presents our last works in the speech/music discrimination field, aiming to improve the coding efficiency of standard audio coders (i.e. MP3, AAC) when speech and music signals are involved. In order to discriminate between speech and music, a fuzzy rules-based expert system is incorporated into the decision-taking stage of traditional speech/music discrimination systems. The knowledge base of the fuzzy expert system has been obtained by means of a typical genetic learning algorithm (the Pittsburgh algorithm). The proposed speech/music discrimination scheme manages the operation of an intelligent audio coder, which selects a GSM coder for speech frames and an AAC coder for music ones, resulting in a lower bit rate regarding the case of using a standardized audio coder (AAC in this work). Further, the intelligent audio coder has been designed aiming to obtain a similar subjective audio quality than AAC. GSM operates at 13 kbits/s, while in the experiments the bit rate specification for AAC has been 32 kbits/s for one-channel audio signals.\",\"PeriodicalId\":236515,\"journal\":{\"name\":\"2007 IEEE International Fuzzy Systems Conference\",\"volume\":\"30 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2007-07-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"16\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2007 IEEE International Fuzzy Systems Conference\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/FUZZY.2007.4295472\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2007 IEEE International Fuzzy Systems Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/FUZZY.2007.4295472","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 16

摘要

语音/音乐自动识别是许多多媒体应用中使用的重要工具,是近年来研究的热点。本文介绍了我们在语音/音乐识别领域的最新研究成果,旨在提高标准音频编码器(即MP3, AAC)在涉及语音和音乐信号时的编码效率。为了区分语音和音乐,在传统语音/音乐识别系统的决策阶段引入了基于模糊规则的专家系统。利用一种典型的遗传学习算法(匹兹堡算法)获得了模糊专家系统的知识库。提出的语音/音乐区分方案管理智能音频编码器的操作,该方案为语音帧选择GSM编码器,为音乐帧选择AAC编码器,从而在使用标准化音频编码器(本工作中为AAC)的情况下降低比特率。此外,设计了智能音频编码器,旨在获得与AAC相似的主观音频质量。GSM的工作速率为13kbits /s,而在实验中,AAC的单通道音频信号的比特率规范为32kbits /s。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Audio Coding Improvement Using Evolutionary Speech/Music Discrimination
Automatic speech/music discrimination is an important tool used in many multimedia applications, becoming a research topic of interest in the last years. This paper presents our last works in the speech/music discrimination field, aiming to improve the coding efficiency of standard audio coders (i.e. MP3, AAC) when speech and music signals are involved. In order to discriminate between speech and music, a fuzzy rules-based expert system is incorporated into the decision-taking stage of traditional speech/music discrimination systems. The knowledge base of the fuzzy expert system has been obtained by means of a typical genetic learning algorithm (the Pittsburgh algorithm). The proposed speech/music discrimination scheme manages the operation of an intelligent audio coder, which selects a GSM coder for speech frames and an AAC coder for music ones, resulting in a lower bit rate regarding the case of using a standardized audio coder (AAC in this work). Further, the intelligent audio coder has been designed aiming to obtain a similar subjective audio quality than AAC. GSM operates at 13 kbits/s, while in the experiments the bit rate specification for AAC has been 32 kbits/s for one-channel audio signals.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信