Musical preferences prediction by classification algorithm

Spring Simulation Multiconference. Pub Date: 2018-04-15. DOI: 10.22360/springsim.2018.cns.002

A. Barghi, Arman Ferdowsi, A. Abhari


Citations: 4

Abstract

In this paper, we use several supervised classification algorithms to predict a person's musical preference. From a psychological point of view, although personal emotion is an important influence on music selection, other significant factors such as age, sex, education and district might also affect our musical choices. We first collected our data using an observation method called stratified sampling: we collected 2000 cases grouped into strata (the district feature in our data), then applied simple random sampling within each stratum. We partitioned the original dataset into two subsets, using 60% to train our models and holding back 40% as a validation dataset. The dataset contains five features: four explanatory variables named sex, age, education and district, and one response (target) variable named music. The response variable has two levels, traditional and non-traditional, so we were dealing with binary classification. We call the dataset we created MPD. We calculated several statistical measures: accuracy, specificity, precision, sensitivity and F-measure. Finally, we examined four different algorithms using R, a mixture of nonlinear (cart, knn) and complex nonlinear (rf) methods. Random forest achieved the highest accuracy, 86.8%, and the cart algorithm obtained the highest F-measure, 44.7%. Because we did not include the person's emotion as an influential factor on musical choices, we expected that the learning algorithms would not reach very high performance. Our results support this expectation.
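The evaluation setup described in the abstract can be sketched in a few lines. The paper worked in R; the sketch below uses plain Python instead, and everything in it is illustrative: the function names `stratified_split` and `binary_metrics`, the toy rows, and the toy predictions are hypothetical and do not come from the MPD dataset. Two further assumptions are worth flagging: the abstract does not say whether the 60/40 partition was itself stratified by district (the sketch does so), and it does not say which class was treated as positive (the sketch picks "traditional").

```python
import random
from collections import defaultdict


def stratified_split(rows, stratum_key, train_frac=0.6, seed=0):
    """Shuffle rows within each stratum, then cut each stratum at train_frac."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for row in rows:
        by_stratum[row[stratum_key]].append(row)
    train, valid = [], []
    for stratum in sorted(by_stratum):
        members = list(by_stratum[stratum])
        rng.shuffle(members)  # simple random sampling inside the stratum
        cut = round(len(members) * train_frac)
        train.extend(members[:cut])
        valid.extend(members[cut:])
    return train, valid


def binary_metrics(actual, predicted, positive):
    """Confusion-matrix measures for a binary problem with one positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    sensitivity = ratio(tp, tp + fn)  # also called recall
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "precision": precision,
        "f_measure": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Hypothetical toy rows (NOT the MPD data): 4 districts with 50 cases each.
toy_rows = [
    {"district": d, "music": "traditional" if i % 5 == 0 else "non-traditional"}
    for d in range(4)
    for i in range(50)
]
train, valid = stratified_split(toy_rows, "district")
print(len(train), len(valid))  # 120 80

# Hypothetical predictions on an imbalanced validation set:
# 10 traditional cases (3 found, 7 missed), 90 non-traditional (2 false alarms).
actual = ["traditional"] * 10 + ["non-traditional"] * 90
predicted = (
    ["traditional"] * 3 + ["non-traditional"] * 7
    + ["traditional"] * 2 + ["non-traditional"] * 88
)
for name, value in binary_metrics(actual, predicted, "traditional").items():
    print(f"{name}: {value:.3f}")
```

In this toy case accuracy is 0.91 while the F-measure is about 0.40, which shows how a high accuracy and a low F-measure can coexist when the positive class is rare. Whether class imbalance explains the gap between the reported 86.8% accuracy and 44.7% F-measure cannot be determined from the abstract alone.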
