W. Bailer, C. Pike, Rik Bauwens, Reinhard Grandl, M. Matton, M. Thaler
{"title":"多传感器音乐会录音数据集,包括专业和用户生成的内容","authors":"W. Bailer, C. Pike, Rik Bauwens, Reinhard Grandl, M. Matton, M. Thaler","doi":"10.1145/2713168.2713191","DOIUrl":null,"url":null,"abstract":"We present a novel dataset for multi-view video and spatial audio. An ensemble of ten musicians from the BBC Philharmonic Orchestra performed in the orchestra's rehearsal studio in Salford, UK, on 25th March 2014. This presented a controlled environment in which to capture a dataset that could be used to simulate a large event, whilst allowing control over the conditions and performance. The dataset consists of hundreds of video and audio clips captured during 18 takes of performances, using a broad range of professional-and consumer-grade equipment, up to 4K video and high-end spatial microphones. In addition to the audiovisual essence, sensor metadata has been captured, and ground truth annotations, in particular for temporal synchronization and spatial alignment, have been created. A part of the dataset has also been prepared for adaptive content streaming. The dataset is released under a Creative Commons Attribution Non-Commercial Share Alike license and hosted on a specifically adapted content management platform.","PeriodicalId":202494,"journal":{"name":"Proceedings of the 6th ACM Multimedia Systems Conference","volume":"137 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":"{\"title\":\"Multi-sensor concert recording dataset including professional and user-generated content\",\"authors\":\"W. Bailer, C. Pike, Rik Bauwens, Reinhard Grandl, M. Matton, M. Thaler\",\"doi\":\"10.1145/2713168.2713191\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We present a novel dataset for multi-view video and spatial audio. An ensemble of ten musicians from the BBC Philharmonic Orchestra performed in the orchestra's rehearsal studio in Salford, UK, on 25th March 2014. This presented a controlled environment in which to capture a dataset that could be used to simulate a large event, whilst allowing control over the conditions and performance. The dataset consists of hundreds of video and audio clips captured during 18 takes of performances, using a broad range of professional-and consumer-grade equipment, up to 4K video and high-end spatial microphones. In addition to the audiovisual essence, sensor metadata has been captured, and ground truth annotations, in particular for temporal synchronization and spatial alignment, have been created. A part of the dataset has also been prepared for adaptive content streaming. The dataset is released under a Creative Commons Attribution Non-Commercial Share Alike license and hosted on a specifically adapted content management platform.\",\"PeriodicalId\":202494,\"journal\":{\"name\":\"Proceedings of the 6th ACM Multimedia Systems Conference\",\"volume\":\"137 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-03-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 6th ACM Multimedia Systems Conference\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2713168.2713191\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 6th ACM Multimedia Systems Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2713168.2713191","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Multi-sensor concert recording dataset including professional and user-generated content
We present a novel dataset for multi-view video and spatial audio. An ensemble of ten musicians from the BBC Philharmonic Orchestra performed in the orchestra's rehearsal studio in Salford, UK, on 25th March 2014. This presented a controlled environment in which to capture a dataset that could be used to simulate a large event, whilst allowing control over the conditions and performance. The dataset consists of hundreds of video and audio clips captured during 18 takes of performances, using a broad range of professional-and consumer-grade equipment, up to 4K video and high-end spatial microphones. In addition to the audiovisual essence, sensor metadata has been captured, and ground truth annotations, in particular for temporal synchronization and spatial alignment, have been created. A part of the dataset has also been prepared for adaptive content streaming. The dataset is released under a Creative Commons Attribution Non-Commercial Share Alike license and hosted on a specifically adapted content management platform.