Segregation of Speech, Music and Instrumentals with LSF-RG features
Himadri Mukherjee, S. Obaidullah, K. Santosh, Teresa Gonçalves, S. Phadikar, K. Roy
2018 12th International Conference on Software, Knowledge, Information Management & Applications (SKIMA), published 2018-12-01
DOI: 10.1109/SKIMA.2018.8631533 (https://doi.org/10.1109/SKIMA.2018.8631533)
Citation count: 1
Abstract
Music-based applications have evolved over the past decade, and the development and optimization of audio-based search engines has attracted researchers' interest for some time. In real-world scenarios, audio comes from many different sources, and each type demands its own processing techniques. A system that segregates audio by type before searching can therefore improve search engine performance. This paper proposes a system that segregates speech, music and instrumental clips for that purpose. The system uses a newly proposed feature based on Line Spectral Pairs, named Line Spectral Frequency-Ratio Grade (LSF-RG). It was tested on a database of 105571 clips collected from the internet with several different classifiers; the highest accuracy, 98.95%, was obtained with a multilayer perceptron.
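The abstract names Line Spectral Frequencies as the basis of the feature but does not describe how the ratio-grade step is computed, so that step is not reproduced here. As background only, the following is a minimal sketch of the standard conversion from linear prediction (LPC) coefficients to line spectral frequencies. The function name `lpc_to_lsf`, the coefficient convention (leading coefficient 1), and the tolerance `eps` are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def lpc_to_lsf(a, eps=1e-6):
    """Convert LPC coefficients to line spectral frequencies (radians).

    `a` holds the coefficients of A(z) = 1 + a1*z^-1 + ... + ap*z^-p,
    with a[0] == 1. A(z) is assumed to be minimum phase, so the roots
    of the sum and difference polynomials lie on the unit circle.
    This is the textbook LSF conversion, not the paper's LSF-RG feature.
    """
    a = np.asarray(a, dtype=float)
    # Extend A(z) by one zero coefficient so it can be combined with
    # its own coefficient-reversed copy.
    a_ext = np.concatenate([a, [0.0]])
    p_poly = a_ext + a_ext[::-1]  # symmetric (sum) polynomial P(z)
    q_poly = a_ext - a_ext[::-1]  # antisymmetric (difference) polynomial Q(z)

    # The root angles of P and Q are the line spectral frequencies.
    angles = np.concatenate([
        np.angle(np.roots(p_poly)),
        np.angle(np.roots(q_poly)),
    ])
    # Keep one angle per conjugate pair and drop the trivial roots
    # at z = 1 (angle 0) and z = -1 (angle pi).
    lsf = angles[(angles > eps) & (angles < np.pi - eps)]
    return np.sort(lsf)

# Usage: a first-order predictor A(z) = 1 - 0.9*z^-1
first_order = lpc_to_lsf([1.0, -0.9])
```

For the first-order example, the sum polynomial is z^2 - 1.8z + 1, whose roots have angle arccos(0.9), so the single returned frequency equals that value. In a full pipeline these frequencies would be computed per audio frame and then passed to whatever grading and classification stages the paper defines.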