Authors: Bang-Dang Pham, M. Tran, Hoang-Long Pham
Venue: 2020 7th NAFOSTED Conference on Information and Computer Science (NICS)
Published: 2020-11-26
DOI: https://doi.org/10.1109/NICS51282.2020.9335886
Citations: 1
Hit Song Prediction based on Gradient Boosting Decision Tree
Record companies invest billions of dollars in new talent around the globe each year. Gaining insight into what actually makes a hit song would provide tremendous benefits for the music industry. In this research, we tackle this question by predicting the rank of hit songs over the next 6 months. Our dataset comes from the Hit Song Prediction problem of the ZALO AI CHALLENGE 2019 and includes not only the songs but also their metadata, such as composer, artist name, and release date. Because of that, while most previous work formulates hit song prediction as a regression or classification problem, we present in this paper how to apply the Gradient Boosting technique to treat it as a ranking problem. The best resulting model performs well at predicting whether a song is a top 10 dance hit or holds a lower listed position, with a Root Mean Square Error of 1.48815; this result outperforms most of the solutions in the competition (better than the 3rd-ranked solution out of 87 in total). Moreover, the model could be improved further by extracting chords, tones, and other information from each song to capture its highlights, and by using linguistic models to derive high-level features from the metadata.
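To make the two technical ingredients of the abstract concrete, the following is a minimal sketch of gradient boosting with decision stumps, scored by Root Mean Square Error. It is not the authors' model: the abstract gives neither the features nor the training setup, so the single feature (`prior_hits`), the toy ranks, and all function names here are hypothetical. It also uses plain squared-error boosting rather than the ranking formulation the paper describes.

```python
import math

def rmse(y_true, y_pred):
    """Root Mean Square Error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def fit_stump(xs, residuals):
    """Find the one-feature threshold split that minimizes squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the max so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left_value, right_value)

def fit_gbdt(xs, ys, n_rounds=50, learning_rate=0.1):
    """Boost stumps on residuals, i.e. the negative gradient of squared loss."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        preds = [p + learning_rate * (left_value if x <= threshold else right_value)
                 for x, p in zip(xs, preds)]
    return base, stumps, learning_rate

def predict(model, x):
    """Sum the base value and all stump outputs, clipped to ranks 1..10."""
    base, stumps, learning_rate = model
    score = base + sum(learning_rate * (left if x <= t else right)
                       for t, left, right in stumps)
    return min(10.0, max(1.0, score))

# Hypothetical toy data: one feature per song and its chart rank (1 = best).
prior_hits = [0, 1, 2, 3, 4, 5, 6, 7]
ranks = [10, 9, 9, 7, 5, 4, 2, 1]

model = fit_gbdt(prior_hits, ranks)
train_preds = [predict(model, x) for x in prior_hits]
baseline_rmse = rmse(ranks, [sum(ranks) / len(ranks)] * len(ranks))
train_rmse = rmse(ranks, train_preds)
print(f"baseline RMSE: {baseline_rmse:.3f}, boosted RMSE: {train_rmse:.3f}")
```

Each round fits a stump to the current residuals and adds a damped copy of it to the prediction, which is the core of gradient boosting under squared loss. A ranking formulation would swap in a pairwise or listwise objective, a detail the abstract does not specify.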