Author: Jacek Grekow
Published in: 2017 IEEE International Conference on INnovations in Intelligent SysTems and Applications (INISTA), 2017-07-03
DOI: 10.1109/INISTA.2017.8001129 (https://doi.org/10.1109/INISTA.2017.8001129)
Audio features dedicated to the detection of arousal and valence in music recordings
The aim of this paper was to discover which combination of audio features gives the best performance in music emotion detection. In our approach, emotion recognition was treated as a regression problem, and a two-dimensional valence-arousal model was used to measure emotions in music. We used features extracted by Essentia and Marsyas, tools for audio analysis and audio-based music information retrieval. We examined the influence of different feature sets (low-level, rhythm, tonal, and their combination) on arousal and valence prediction. Using a combination of different types of features significantly improves the results compared with using just one group of features. We found and presented features particularly dedicated to the detection of arousal and valence separately, as well as features useful in both cases.
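The comparison described in the abstract, one regressor per feature group versus one on the combined set, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic random numbers rather than Essentia or Marsyas output, the column-to-group mapping in `GROUPS` is hypothetical, and ordinary least squares stands in for whatever regression algorithm the paper actually used.

```python
import random


def fit_linear(X, y):
    """Ordinary least squares with an intercept, solved via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [A^T A | A^T y].
    M = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(k):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [M[i][k] / M[i][i] for i in range(k)]


def predict(w, X):
    return [w[0] + sum(wi * xi for wi, xi in zip(w[1:], r)) for r in X]


def r2(y, yhat):
    """Coefficient of determination."""
    mean = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, yhat))
    ss_tot = sum((a - mean) ** 2 for a in y)
    return 1 - ss_res / ss_tot


# Hypothetical feature groups: column indices into each feature vector.
GROUPS = {"low_level": [0, 1], "rhythm": [2], "tonal": [3]}

# Synthetic stand-in data: 300 "recordings", 4 features each.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(300)]
# Synthetic arousal target that depends on every group, plus noise.
arousal = [0.6 * r[0] + 0.3 * r[1] + 0.8 * r[2] + 0.5 * r[3] + rng.gauss(0, 0.3)
           for r in X]


def evaluate(cols):
    """Train on the first 200 rows, report R^2 on the remaining 100."""
    def select(rows):
        return [[r[c] for c in cols] for r in rows]
    w = fit_linear(select(X[:200]), arousal[:200])
    return r2(arousal[200:], predict(w, select(X[200:])))


scores = {name: evaluate(cols) for name, cols in GROUPS.items()}
scores["combined"] = evaluate([0, 1, 2, 3])
for name, score in scores.items():
    print(f"{name}: R^2 = {score:.3f}")
```

Because the synthetic target is built from all three groups, the combined model scores highest by construction. The sketch shows only the shape of the experiment and is no evidence for the paper's finding. Valence would be handled the same way with a second target vector.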