HSP-TL: A deep metric learning model with triplet loss for hit song prediction using lyrics and audio features
Authors: Petros Vavaroutsos, Pantelis Vikatos
Journal: Science Talks, volume 10, Article 100363
Published: 2024-05-15
DOI: 10.1016/j.sctalk.2024.100363
URL: https://www.sciencedirect.com/science/article/pii/S2772569324000719
Citation count: 0
Abstract
The music industry has a strong interest in predicting the future success of a song and its presence in popular rankings such as the Billboard charts. However, a song's popularity may be affected by factors such as music trends and social influences, which are independent of the audio signal. In this paper, we present HSP-TL, a deep learning model for identifying likely hit songs. Our work combines temporal information with features derived from audio and lyrics to estimate the success of a recording. We adopt a triplet loss function to minimize the distance between the representations of songs with similar popularity. In contrast to the current approach, we apply convolutional neural networks to 2-D low-level audio features, and we use pre-trained language models for text-based feature extraction. Our method is evaluated on the Hit Song Prediction Dataset, which we enrich with the lyrics of each song. Our results show that the inclusion of lyrics improves song uniqueness and reflects musical trends. The proposed model outperforms the current approach by up to 8%.
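The triplet loss mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation used in HSP-TL, so the code below assumes the common hinge form, max(0, d(a, p) - d(a, n) + margin), with Euclidean distance. The function names, the margin value, and the toy 2-D embeddings are all hypothetical and are not taken from the paper.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-form triplet loss (assumed form, not confirmed by the paper).

    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin`; otherwise it grows linearly with the gap.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


# Hypothetical embeddings: the anchor and positive stand for songs of similar
# popularity, the negative for a song of different popularity.
anchor = [0.0, 0.0]
positive = [3.0, 4.0]   # distance 5 from the anchor
negative = [6.0, 8.0]   # distance 10 from the anchor

# Positive is already closer by more than the margin: 5 - 10 + 1 < 0.
loss_satisfied = triplet_loss(anchor, positive, negative, margin=1.0)

# Roles swapped, so the constraint is violated: 10 - 5 + 1 = 6.
loss_violated = triplet_loss(anchor, negative, positive, margin=1.0)

print(loss_satisfied, loss_violated)
```

In training, a loss of this kind would be averaged over many triplets and minimized with respect to the parameters of the embedding network, pulling songs of similar popularity together and pushing dissimilar ones apart.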