Audio Coding Improvement Using Evolutionary Speech/Music Discrimination

2007 IEEE International Fuzzy Systems Conference. Pub Date: 2007-07-23. DOI: 10.1109/FUZZY.2007.4295472

J. E. M. Expósito, S. G. Galán, Nicolas Ruiz Reyes, P. V. Candeas

Citations: 16

Abstract

Automatic speech/music discrimination is an important tool in many multimedia applications and has become an active research topic in recent years. This paper presents our latest work on speech/music discrimination, which aims to improve the coding efficiency of standard audio coders (e.g., MP3, AAC) when both speech and music signals are involved. To discriminate between speech and music, a fuzzy rule-based expert system is incorporated into the decision-making stage of a traditional speech/music discrimination system. The knowledge base of the fuzzy expert system is obtained by means of a typical genetic learning algorithm (the Pittsburgh algorithm). The proposed speech/music discrimination scheme controls an intelligent audio coder that selects a GSM coder for speech frames and an AAC coder for music frames, resulting in a lower bit rate than when a single standardized audio coder (AAC in this work) is used throughout. The intelligent audio coder has also been designed to achieve subjective audio quality similar to that of AAC. GSM operates at 13 kbit/s, while in the experiments the AAC bit rate was set to 32 kbit/s for one-channel audio signals.
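The bit-rate saving described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: it takes per-frame "speech"/"music" labels as given (the fuzzy expert system that would produce them is not modeled), and it assumes equal-duration frames and no overhead for switching between coders. The function names and the label sequence are hypothetical; only the 13 kbit/s and 32 kbit/s figures come from the abstract.

```python
# Bit rates stated in the abstract, in kbit/s.
GSM_KBPS = 13.0
AAC_KBPS = 32.0


def select_coder(label: str) -> str:
    """Map a per-frame discriminator label to a coder name (hypothetical helper)."""
    if label == "speech":
        return "GSM"
    if label == "music":
        return "AAC"
    raise ValueError(f"unknown frame label: {label!r}")


def average_bit_rate(labels: list[str]) -> float:
    """Average bit rate in kbit/s, assuming equal-duration frames
    and no switching overhead (both are simplifying assumptions)."""
    if not labels:
        raise ValueError("need at least one frame")
    rates = {"GSM": GSM_KBPS, "AAC": AAC_KBPS}
    total = sum(rates[select_coder(label)] for label in labels)
    return total / len(labels)


# Hypothetical label sequence: half speech, half music.
labels = ["speech", "music", "speech", "music"]
print(average_bit_rate(labels))  # 22.5
```

Under these assumptions the average rate is a weighted mean of the two coder rates, so any nonzero share of speech frames brings it below the 32 kbit/s of AAC alone.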
