{"title":"美在这里:使用多模态特征和免费训练数据来评估视频的美学","authors":"Yanran Wang, Qi Dai, Rui Feng, Yu-Gang Jiang","doi":"10.1145/2502081.2508121","DOIUrl":null,"url":null,"abstract":"The aesthetics of videos can be used as a useful clue to improve user satisfaction in many applications such as search and recommendation. In this paper, we demonstrate a computational approach to automatically evaluate the aesthetics of videos, with particular emphasis on identifying beautiful scenes. Using a standard classification pipeline, we analyze the effectiveness of a comprehensive set of features, ranging from low-level visual features, mid-level semantic attributes, to style descriptors. In addition, since there is limited public training data with manual labels of video aesthetics, we explore freely available resources with a simple assumption that people tend to share more aesthetically appealing works than unappealing ones. Specifically, we use images from DPChallenge and videos from Flickr as positive training data and the Dutch documentary videos as negative data, where the latter contain mostly old materials of low visual quality. Our extensive evaluations show that combining multiple features is helpful, and very promising results can be obtained using the noisy but annotation-free training data. On the NHK Multimedia Challenge dataset, we attain a Spearman's rank correlation coefficient of 0.41.","PeriodicalId":20448,"journal":{"name":"Proceedings of the 21st ACM international conference on Multimedia","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2013-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"34","resultStr":"{\"title\":\"Beauty is here: evaluating aesthetics in videos using multimodal features and free training data\",\"authors\":\"Yanran Wang, Qi Dai, Rui Feng, Yu-Gang Jiang\",\"doi\":\"10.1145/2502081.2508121\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The aesthetics of videos can be used as a useful clue to improve user satisfaction in many applications such as search and recommendation. In this paper, we demonstrate a computational approach to automatically evaluate the aesthetics of videos, with particular emphasis on identifying beautiful scenes. Using a standard classification pipeline, we analyze the effectiveness of a comprehensive set of features, ranging from low-level visual features, mid-level semantic attributes, to style descriptors. In addition, since there is limited public training data with manual labels of video aesthetics, we explore freely available resources with a simple assumption that people tend to share more aesthetically appealing works than unappealing ones. Specifically, we use images from DPChallenge and videos from Flickr as positive training data and the Dutch documentary videos as negative data, where the latter contain mostly old materials of low visual quality. Our extensive evaluations show that combining multiple features is helpful, and very promising results can be obtained using the noisy but annotation-free training data. On the NHK Multimedia Challenge dataset, we attain a Spearman's rank correlation coefficient of 0.41.\",\"PeriodicalId\":20448,\"journal\":{\"name\":\"Proceedings of the 21st ACM international conference on Multimedia\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-10-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"34\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 21st ACM international conference on Multimedia\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2502081.2508121\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 21st ACM international conference on Multimedia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2502081.2508121","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Beauty is here: evaluating aesthetics in videos using multimodal features and free training data
The aesthetics of videos can be used as a useful clue to improve user satisfaction in many applications such as search and recommendation. In this paper, we demonstrate a computational approach to automatically evaluate the aesthetics of videos, with particular emphasis on identifying beautiful scenes. Using a standard classification pipeline, we analyze the effectiveness of a comprehensive set of features, ranging from low-level visual features, mid-level semantic attributes, to style descriptors. In addition, since there is limited public training data with manual labels of video aesthetics, we explore freely available resources with a simple assumption that people tend to share more aesthetically appealing works than unappealing ones. Specifically, we use images from DPChallenge and videos from Flickr as positive training data and the Dutch documentary videos as negative data, where the latter contain mostly old materials of low visual quality. Our extensive evaluations show that combining multiple features is helpful, and very promising results can be obtained using the noisy but annotation-free training data. On the NHK Multimedia Challenge dataset, we attain a Spearman's rank correlation coefficient of 0.41.