R. Rajan, Joshua Antony, Riya Ann Joseph, Jijohn M. Thomas, Chandr Dhanush H, A. V
{"title":"基于声-文特征融合的音频-情绪分类","authors":"R. Rajan, Joshua Antony, Riya Ann Joseph, Jijohn M. Thomas, Chandr Dhanush H, A. V","doi":"10.1109/ICMSS53060.2021.9673592","DOIUrl":null,"url":null,"abstract":"Listeners browse songs based on artist or genre, but a significant amount of queries are based on emotions like happy, sad, calm etc. and therefore, automatic music mood classification is gaining importance. People search for songs based on the emotions they are feeling or the emotion they hope to feel. Audio-based techniques can achieve satisfying results, but part of the semantic information of songs resides exclusively in the lyrics. In this paper, we present a study on the fusion approach of music mood classification. As both audio and lyrical information is complimentary, creating a hybrid model to classify music based on mood provides enhanced accuracy. Where a single song might fall under two different categories based on audio or lyrical information, a hybrid model helps us achieve more accurate results by merging both the information. In this work, we extracted features using librosa from audio, used TF-IDF for text, and experimented with the Bi-LSTM network. The performance evaluation is done on corpus consists of 776 songs. The multimodal approach achieved average precision, recall and F1-score of 0.66, 0.65 and 0.65 respectively.","PeriodicalId":274597,"journal":{"name":"2021 Fourth International Conference on Microelectronics, Signals & Systems (ICMSS)","volume":"59 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Audio-Mood Classification Using Acoustic-Textual Feature Fusion\",\"authors\":\"R. Rajan, Joshua Antony, Riya Ann Joseph, Jijohn M. Thomas, Chandr Dhanush H, A. V\",\"doi\":\"10.1109/ICMSS53060.2021.9673592\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Listeners browse songs based on artist or genre, but a significant amount of queries are based on emotions like happy, sad, calm etc. and therefore, automatic music mood classification is gaining importance. People search for songs based on the emotions they are feeling or the emotion they hope to feel. Audio-based techniques can achieve satisfying results, but part of the semantic information of songs resides exclusively in the lyrics. In this paper, we present a study on the fusion approach of music mood classification. As both audio and lyrical information is complimentary, creating a hybrid model to classify music based on mood provides enhanced accuracy. Where a single song might fall under two different categories based on audio or lyrical information, a hybrid model helps us achieve more accurate results by merging both the information. In this work, we extracted features using librosa from audio, used TF-IDF for text, and experimented with the Bi-LSTM network. The performance evaluation is done on corpus consists of 776 songs. 
The multimodal approach achieved average precision, recall and F1-score of 0.66, 0.65 and 0.65 respectively.\",\"PeriodicalId\":274597,\"journal\":{\"name\":\"2021 Fourth International Conference on Microelectronics, Signals & Systems (ICMSS)\",\"volume\":\"59 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-11-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 Fourth International Conference on Microelectronics, Signals & Systems (ICMSS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICMSS53060.2021.9673592\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 Fourth International Conference on Microelectronics, Signals & Systems (ICMSS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMSS53060.2021.9673592","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Audio-Mood Classification Using Acoustic-Textual Feature Fusion
Listeners browse songs by artist or genre, but a significant share of queries are based on emotions such as happy, sad, or calm; automatic music mood classification is therefore gaining importance. People search for songs based on the emotion they are feeling or the emotion they hope to feel. Audio-based techniques can achieve satisfactory results, but part of the semantic information of a song resides exclusively in its lyrics. In this paper, we present a study on a fusion approach to music mood classification. Because audio and lyrical information are complementary, a hybrid model that classifies music by mood provides improved accuracy. Where a single song might fall under two different categories depending on whether audio or lyrical information is used, a hybrid model yields more accurate results by merging both sources of information. In this work, we extracted audio features using librosa, used TF-IDF for the lyrics, and experimented with a Bi-LSTM network. The performance evaluation is carried out on a corpus of 776 songs. The multimodal approach achieved an average precision, recall, and F1-score of 0.66, 0.65, and 0.65, respectively.
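The abstract names the building blocks (librosa for audio features, TF-IDF for lyrics, a Bi-LSTM classifier) but not how they are wired together. The sketch below is one plausible arrangement under stated assumptions: frame-level MFCCs (an assumed feature choice) feed the Bi-LSTM, a TF-IDF vector over the lyrics feeds a dense branch, and the two branches are concatenated before a softmax over an assumed set of four mood classes. Layer sizes and hyperparameters are illustrative, not the authors' configuration.

```python
# Minimal sketch of an acoustic-textual fusion pipeline like the one outlined
# in the abstract. MFCC features, 64-unit layers, and four mood classes are
# assumptions made for illustration; the paper's exact setup is not given here.
import numpy as np
import librosa
from sklearn.feature_extraction.text import TfidfVectorizer
from tensorflow.keras import layers, models


def mfcc_sequence(path, sr=22050, n_mfcc=20):
    """Frame-level MFCCs extracted with librosa, shaped (frames, n_mfcc)."""
    y, _ = librosa.load(path, sr=sr)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T


# TF-IDF over raw lyric strings; the vocabulary size is an arbitrary choice.
tfidf = TfidfVectorizer(max_features=2000)


def build_fusion_model(n_mfcc=20, n_text=2000, n_moods=4):
    """Bi-LSTM over the MFCC sequence, dense branch over the TF-IDF vector,
    concatenated before a softmax mood classifier."""
    audio_in = layers.Input(shape=(None, n_mfcc), name="mfcc_frames")
    audio_h = layers.Bidirectional(layers.LSTM(64))(audio_in)

    text_in = layers.Input(shape=(n_text,), name="lyrics_tfidf")
    text_h = layers.Dense(64, activation="relu")(text_in)

    fused = layers.concatenate([audio_h, text_h])
    out = layers.Dense(n_moods, activation="softmax")(fused)

    model = models.Model(inputs=[audio_in, text_in], outputs=out)
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

A model built this way would be trained on pairs of an MFCC sequence and a lyric vector (e.g. `tfidf.fit_transform(lyrics).toarray()`) with one-hot mood labels; padding or bucketing of the variable-length MFCC sequences would be needed for batched training.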