Segregation of Speech, Music and Instrumentals with LSF-RG features

Himadri Mukherjee, S. Obaidullah, K. Santosh, Teresa Gonçalves, S. Phadikar, K. Roy
Published in: 2018 12th International Conference on Software, Knowledge, Information Management & Applications (SKIMA), 2018-12-01
DOI: 10.1109/SKIMA.2018.8631533 (https://doi.org/10.1109/SKIMA.2018.8631533)
Citations: 1

Abstract

Music-based applications have evolved considerably over the past decade, and the development and optimization of audio-based search engines has long attracted researchers' interest. In real-world scenarios, audio comes from many kinds of sources, and each type demands different processing techniques. A system that segregates audio by type before searching can therefore help improve search engine performance. This paper proposes a system for segregating speech, music, and instrumental clips to that end. The system uses a newly proposed Line Spectral Pair based feature, Line Spectral Frequency-Ratio Grade (LSF-RG). It was tested with several classifiers on a database of 105,571 clips collected from the internet; the highest accuracy, 98.95%, was obtained with a multilayer perceptron.
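The abstract does not define how LSF-RG is derived, so the following is only a minimal sketch of the standard line spectral frequency (LSF) computation that a Line Spectral Pair based feature presumably starts from. It is not the authors' method. The function name `lpc_to_lsf`, its parameters, and the grid-scan-plus-bisection root search are assumptions chosen for illustration; the input is assumed to be the coefficients of a stable LPC polynomial A(z) = 1 + a1 z^-1 + ... + ap z^-p.

```python
import math


def lpc_to_lsf(a, grid=4096, iters=60):
    """Return line spectral frequencies (radians, ascending) for A(z).

    `a` is the LPC coefficient list [1, a1, ..., ap] (hypothetical input
    format). Roots closer than pi/grid to 0 or pi, or two roots falling in
    the same grid cell, are not resolved by this simple scan.
    """
    n = len(a)  # degree of the sum/difference polynomials is p + 1
    ext = list(a) + [0.0]
    rev = [0.0] + list(reversed(a))
    p_coef = [x + y for x, y in zip(ext, rev)]  # symmetric polynomial P(z)
    q_coef = [x - y for x, y in zip(ext, rev)]  # antisymmetric polynomial Q(z)

    def on_circle(coef, trig):
        # Real-valued function of w whose zeros are the unit-circle roots:
        # cosines for the symmetric case, sines for the antisymmetric one.
        return lambda w: sum(c * trig(w * (n / 2 - k)) for k, c in enumerate(coef))

    def zeros(f):
        found = []
        # Interior grid only, so the trivial roots at w = 0 and w = pi are skipped.
        ws = [math.pi * i / grid for i in range(1, grid)]
        for lo, hi in zip(ws, ws[1:]):
            if f(lo) * f(hi) < 0:  # sign change brackets a root
                for _ in range(iters):  # refine by bisection
                    mid = 0.5 * (lo + hi)
                    if f(lo) * f(mid) <= 0:
                        hi = mid
                    else:
                        lo = mid
                found.append(0.5 * (lo + hi))
        return found

    return sorted(zeros(on_circle(p_coef, math.cos)) + zeros(on_circle(q_coef, math.sin)))


# Example with a second-order polynomial (values chosen for illustration only).
example_lsf = lpc_to_lsf([1.0, -1.2, 0.5])
```

For this second-order example the two frequencies are acos(0.85) and acos(0.35), roughly 0.555 and 1.213 rad. In a full pipeline, per-frame values like these would be aggregated into a clip-level feature vector and passed to a classifier such as a multilayer perceptron; how LSF-RG performs that aggregation is not stated in the abstract.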