Lei Wang, Shen Huang, Shijin Wang, Jiaen Liang, Bo Xu
2008 Fourth International Conference on Natural Computation, pp. 580-583. Published 2008-10-18. DOI: 10.1109/ICNC.2008.815 (https://doi.org/10.1109/ICNC.2008.815)
Music Genre Classification Based on Multiple Classifier Fusion
Although music genre classification has advanced considerably in recent years, the need for more accurate systems remains unmet. In this paper, we propose a method that further reduces the classification error rate through multiple classifier fusion. First, MFCCs and four features from the MPEG-7 audio descriptors are extracted from every short-time frame. A group of frames is then gathered into a longer segment, over which the mean and variance of the frame-level features are calculated. The segment serves as the basic unit for both training and testing. Next, a random forest (RF) and a multilayer perceptron (MLP) neural network are applied to each segment independently. Finally, a weighted voting strategy fuses the outputs of the two classifiers on each segment, and the file-level decision is made by selecting the genre most frequently assigned across all segments. Experiments show that the approach is effective: the fused result achieves a 12.4% relative reduction in error rate compared with our baseline system.
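The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the segment length, the form of the classifier outputs, or the voting weights, so the sketch assumes non-overlapping segments, per-genre scores from each classifier, and a single scalar weight. The function names (segment_features, fuse_segment, classify_file) and the rf_weight parameter are hypothetical, and the RF and MLP classifiers themselves are not modeled.

```python
from collections import Counter
from statistics import fmean, pvariance


def segment_features(frames, frames_per_segment):
    """Group per-frame feature vectors into non-overlapping segments.

    Each segment vector holds the per-dimension means followed by the
    per-dimension variances. Trailing frames that do not fill a whole
    segment are dropped (an assumption of this sketch).
    """
    segments = []
    last_start = len(frames) - frames_per_segment
    for start in range(0, last_start + 1, frames_per_segment):
        block = frames[start:start + frames_per_segment]
        dims = list(zip(*block))  # transpose: one tuple per feature dimension
        means = [fmean(d) for d in dims]
        variances = [pvariance(d) for d in dims]
        segments.append(means + variances)
    return segments


def fuse_segment(rf_scores, mlp_scores, rf_weight=0.5):
    """Weighted vote over two per-genre score dicts; returns the top genre.

    rf_weight is a hypothetical scalar weight for the random forest;
    the MLP receives 1 - rf_weight. Ties break alphabetically.
    """
    genres = sorted(set(rf_scores) | set(mlp_scores))
    fused = {
        g: rf_weight * rf_scores.get(g, 0.0)
        + (1.0 - rf_weight) * mlp_scores.get(g, 0.0)
        for g in genres
    }
    return max(genres, key=fused.get)


def classify_file(rf_outputs, mlp_outputs, rf_weight=0.5):
    """Label every segment by weighted voting, then take the most frequent label."""
    labels = [
        fuse_segment(rf, mlp, rf_weight)
        for rf, mlp in zip(rf_outputs, mlp_outputs)
    ]
    return Counter(labels).most_common(1)[0][0]


if __name__ == "__main__":
    # Toy scores for three segments of one file (made-up numbers).
    rf = [{"rock": 0.9, "jazz": 0.1}, {"rock": 0.7, "jazz": 0.3}, {"rock": 0.2, "jazz": 0.8}]
    mlp = [{"rock": 0.8, "jazz": 0.2}, {"rock": 0.6, "jazz": 0.4}, {"rock": 0.1, "jazz": 0.9}]
    print(classify_file(rf, mlp))
```

In this sketch the fusion happens per segment before the file-level majority vote, matching the order given in the abstract. In practice the weight would presumably be tuned on held-out data, which the abstract does not describe.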