EmoBGM: Estimating sound's emotion for creating slideshows with suitable BGM
Cedric Konan, H. Suwa, Yutaka Arakawa, K. Yasumoto
2017 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops), 13 March 2017. DOI: 10.1109/PERCOMW.2017.7917587
This paper presents a study on estimating the emotions conveyed by clips of background music (BGM) for use in an automatic slideshow creation system. The system we aim to develop automatically tags each given piece of background music with the main emotion it conveys, in order to recommend the most suitable music clip to slideshow creators based on the main emotions of the embedded photos. As a first step of our research, we developed a machine learning model to estimate the emotions conveyed in a music clip and achieved 88% classification accuracy using cross-validation. The second part of our work involved developing a web application that uses the Microsoft Emotion API to determine the emotions in photos, so that the system can find the best candidate music for each photo in the slideshow. Sixteen users rated the recommended background music for a set of photos on a 5-point Likert scale, and we obtained average ratings of 4.1, 3.6, and 3.0 for photo sets 1, 2, and 3 of our evaluation task, respectively.
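The abstract describes two technical steps: a cross-validated classifier that tags each music clip with its dominant emotion, and a matcher that pairs a photo's dominant emotion (e.g., the top-scoring category returned by an image-emotion service such as the Microsoft Emotion API) with a suitably tagged clip. The sketch below illustrates both steps under stated assumptions: the paper does not specify the audio features, classifier, or emotion categories, so the placeholder data, the category list, and the best_clip_for helper are all illustrative, not the authors' implementation.

```python
# Minimal sketch of the two steps outlined in the abstract.
# Features, labels, categories, and helper names are assumptions;
# the paper does not specify its actual feature set or classifier.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

EMOTIONS = ["happiness", "sadness", "neutral", "surprise"]  # assumed categories

# Step 1: estimate a music clip's emotion with a cross-validated classifier.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))                 # placeholder per-clip audio features
y = rng.integers(0, len(EMOTIONS), size=200)   # placeholder per-clip emotion labels
clf = RandomForestClassifier(random_state=0)
print("mean CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())

# Step 2: recommend the clip whose tag matches the photo's dominant emotion,
# where photo_scores would come from an image-emotion API response.
def best_clip_for(photo_scores: dict, clip_tags: dict) -> str:
    """photo_scores: emotion -> confidence; clip_tags: clip name -> emotion tag."""
    dominant = max(photo_scores, key=photo_scores.get)
    candidates = [clip for clip, tag in clip_tags.items() if tag == dominant]
    return candidates[0] if candidates else next(iter(clip_tags))

photo = {"happiness": 0.8, "sadness": 0.1, "neutral": 0.1}
clips = {"clip_a.mp3": "happiness", "clip_b.mp3": "sadness"}
print(best_clip_for(photo, clips))  # -> clip_a.mp3
```

The matching shown here is a simple dominant-emotion lookup; the paper's actual recommendation logic may weigh the full emotion distribution rather than only the top category.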