Audio features dedicated to the detection of arousal and valence in music recordings
Jacek Grekow
2017 IEEE International Conference on INnovations in Intelligent SysTems and Applications (INISTA)
Published: 2017-07-03
DOI: 10.1109/INISTA.2017.8001129 (https://doi.org/10.1109/INISTA.2017.8001129)
Citation count: 22
Abstract
The aim of this paper was to discover which combination of audio features gives the best performance in music emotion detection. In our approach, emotion recognition was treated as a regression problem, and a two-dimensional valence-arousal model was used to measure emotions in music. We used features extracted by Essentia and Marsyas, tools for audio analysis and audio-based music information retrieval. We examined the influence of different feature sets (low-level, rhythm, tonal, and their combination) on arousal and valence prediction. Using a combination of different types of features significantly improved the results compared with using just one group of features. We found and presented features particularly dedicated to the detection of arousal and valence separately, as well as features useful in both cases.
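The comparison described in the abstract (one regressor per feature group versus one on the combined set, evaluated separately for arousal and valence) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic rather than Essentia or Marsyas output, each group is reduced to a single hypothetical column, the dependence of the targets on the groups is invented, and ordinary least squares stands in for whatever regressor the paper actually used.

```python
import random

def fit_least_squares(X, y):
    """Ordinary least squares with an intercept, solved via the normal equations."""
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    k = len(rows[0])
    # Augmented matrix [A^T A | A^T y]
    M = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(M[i][c]))
        M[c], M[p] = M[p], M[c]
        for i in range(k):
            if i != c:
                f = M[i][c] / M[c][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[c])]
    return [M[i][k] / M[i][i] for i in range(k)]

def predict(w, X):
    return [w[0] + sum(wi * xi for wi, xi in zip(w[1:], r)) for r in X]

def r2_score(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def evaluate(X, y, cols, n_train=150):
    """Fit on the first n_train rows using only `cols`; return R^2 on the rest."""
    Xs = [[r[c] for c in cols] for r in X]
    w = fit_least_squares(Xs[:n_train], y[:n_train])
    return r2_score(y[n_train:], predict(w, Xs[n_train:]))

# Hypothetical feature groups: one synthetic column each.
GROUPS = {"low_level": [0], "rhythm": [1], "tonal": [2]}
FEATURE_SETS = dict(GROUPS, combined=[0, 1, 2])

rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]

# Invented relationships, NOT the paper's findings: they only ensure that
# each target depends on more than one feature group.
targets = {
    "arousal": [r[0] + r[1] + rng.gauss(0, 0.1) for r in X],
    "valence": [r[0] + r[2] + rng.gauss(0, 0.1) for r in X],
}

scores = {
    name: {fs: evaluate(X, y, cols) for fs, cols in FEATURE_SETS.items()}
    for name, y in targets.items()
}

for name, by_set in scores.items():
    for fs, r2 in by_set.items():
        print(f"{name:8s} {fs:10s} R^2 = {r2:.3f}")
```

On this synthetic data the combined set scores higher than any single group for both targets, which mirrors the abstract's claim only by construction. With real extracted features, cross-validation would be a safer evaluation than the single train/test split used here.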