Proceedings of the 14th Conference on ACM Multimedia Systems: Latest Publications

Machine-learning based VMAF prediction for HDR video content
Pub Date: 2023-06-07 · DOI: 10.1145/3587819.3593941
Christoph Müller, Stephan Steglich, Sandra Groß, Paul Kremer
Abstract: This paper presents a methodology for predicting VMAF video quality scores for high dynamic range (HDR) video content using machine learning. To train the ML model, we are collecting a dataset of HDR and converted SDR video clips, as well as their corresponding objective video quality scores, specifically the Video Multimethod Assessment Fusion (VMAF) values. A 3D convolutional neural network (3D-CNN) model is being trained on the collected dataset. Finally, a hands-on demonstrator is developed to showcase the newly predicted HDR-VMAF metric in comparison to VMAF and other metric values for SDR content, and to conduct further validation with user testing.
Citations: 0
Vegvisir: A testing framework for HTTP/3 media streaming
Pub Date: 2023-06-07 · DOI: 10.1145/3587819.3592550
Joris Herbots, M. Vandersanden, P. Quax, W. Lamotte
Abstract: Assessing media streaming performance traditionally requires the presence of reproducible network conditions and a heterogeneous dataset of media materials. Setting up such experiments is a complex challenge in itself, and it becomes even more complex with the new QUIC transport protocol, which has many tunable features yet is difficult to analyze due to its inherently encrypted nature. In this paper, we introduce Vegvisir, an open-source automated testing framework for orchestrating media streaming experiments over HTTP/3 that addresses these challenges. We describe how users can steer the behavior of Vegvisir through its configuration system, and we provide a high-level overview of its inner workings and its broad applicability by describing two use cases: one covering sizeable experiments spanning multiple days and another covering HAS evaluation scenarios.
Citations: 1
EVASR: Edge-Based Video Delivery with Salience-Aware Super-Resolution
Pub Date: 2023-06-07 · DOI: 10.1145/3587819.3590967
Na Li, Yao Liu
Abstract: With the rapid growth of video content consumption, it is important to deliver high-quality streaming video to users even under limited network bandwidth. In this paper, we propose EVASR, a system that performs edge-based video delivery to clients with salience-aware super-resolution. We select patches with higher saliency scores for super-resolution while applying simple yet efficient bicubic interpolation to the remaining patches in the same video frame. To use the computation resources available at the edge server efficiently, we introduce a new metric called "saliency visual quality" and formulate patch selection as an optimization problem to achieve the best performance when an edge server serves multiple users. We implement EVASR based on the FFmpeg framework and conduct extensive experiments for evaluation. Results show that EVASR outperforms baseline approaches in both resource efficiency and visual quality metrics, including PSNR, saliency visual quality (SVQ), and VMAF.
Citations: 0
A Dynamic 3D Point Cloud Dataset for Immersive Applications
Pub Date: 2023-06-07 · DOI: 10.1145/3587819.3592546
Yuan-Chun Sun, I-Chun Huang, Yuang Shi, Wei Tsang Ooi, Chun-Ying Huang, Cheng-Hsin Hsu
Abstract: Motion estimation in a 3D point cloud sequence is a fundamental operation with many applications, including compression, error concealment, and temporal upscaling. While there have been multiple research contributions toward estimating the motion vectors of points between frames, there has been no dynamic 3D point cloud dataset with motion ground truth to benchmark against. In this paper, we present an open dynamic 3D point cloud dataset to fill this gap. Our dataset consists of synthetically generated objects with pre-determined motion patterns, allowing us to generate the motion vectors for the points. It contains nine objects in three categories (shape, avatar, and textile) with different animation patterns, and we also provide semantic segmentation of each avatar object. The dataset can be used by researchers who need temporal information across frames; as an example, we present an evaluation of two motion estimation methods using it.
Citations: 1
SEPE Dataset: 8K Video Sequences and Images for Analysis and Development
Pub Date: 2023-06-07 · DOI: 10.1145/3587819.3592560
Tariq Al Shoura, Ali Mollaahmadi Dehaghi, Reza Razavi, B. Far, Mohammad Moshirpour
Abstract: This paper provides an overview of our open SEPE (Software Engineering Practice and Education) 8K dataset, which comprises 40 different 8K (8192 x 4320) video sequences and 40 different 8K (8192 x 5464) images. The video sequences were captured at 29.97 frames per second (FPS) and encoded with the AVC/H.264, HEVC/H.265, and AV1 codecs at resolutions from 8K down to 480p. The images, video sequences, encoded videos, and various other statistics on the media that make up the dataset are published and maintained in a GitHub repository for non-commercial use. In this paper, the dataset components are described and analyzed using various methods. The proposed dataset is, as far as we know, the first to publish true 8K natural sequences; it is therefore important for the next generation of multimedia applications such as video quality assessment, super-resolution, video coding, and video compression. GitHub: https://github.com/talshoura/SEPE-8K-Dataset
Citations: 0
"You AR' right in front of me": RGBD-based capture and rendering for remote training “你的AR就在我面前”:基于rgbd的远程训练捕获和渲染
Proceedings of the 14th Conference on ACM Multimedia Systems Pub Date : 2023-06-07 DOI: 10.1145/3587819.3593936
S. Gunkel, S. Dijkstra-Soudarissanane, O. Niamut
{"title":"\"You AR' right in front of me\": RGBD-based capture and rendering for remote training","authors":"S. Gunkel, S. Dijkstra-Soudarissanane, O. Niamut","doi":"10.1145/3587819.3593936","DOIUrl":"https://doi.org/10.1145/3587819.3593936","url":null,"abstract":"Immersive technologies such as virtual reality have enabled novel forms of education and training, where students can learn new skills in simulated environments. But some specialized training procedures, e.g. ESA-certified soldering, still involve real-world physical processes with physical lab equipment. Such training sessions require students to travel to teaching labs and may interrupt everyday commitments for a longer period of time. There is a desire to make such training procedures more accessible remotely while keeping any student-to-teacher interaction natural, personal, and engaging. This paper presents a prototype for a remote teaching use case by rendering 3D photorealistic representations into the Augmented Reality (AR) glasses of a student. The teacher is captured with a modular RGBD capture application integrated into a web-based immersive communication platform. The integration offers multiple real-time capture calibration and rendering configurations. Our modular platform allows for an easy evaluation of different technical constraints as well as easy testing of the use case itself. Such evaluation may include a direct comparison of different 3D point-cloud and mesh rendering techniques. Additionally, the overall system allows immersive interaction between the student and the teacher, including augmented text messages for non-intrusive notifications. Our platform offers an ideal testbed for both technical and user-centered immersive communication studies.","PeriodicalId":330983,"journal":{"name":"Proceedings of the 14th Conference on ACM Multimedia Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131255841","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Perceptual annotation of local distortions in videos: tools and datasets
Pub Date: 2023-06-07 · DOI: 10.1145/3587819.3592559
Andréas Pastor, P. Le Callet
Abstract: To assess the quality of multimedia content, create datasets, and train objective quality metrics, one needs to collect subjective opinions from annotators. Different subjective methodologies exist, from direct rating with single or double stimuli to indirect rating with pairwise comparisons. Triplet- and quadruplet-based comparisons are a form of indirect rating: from these preferences, the assessed stimuli can be placed on a perceptual scale (e.g., from low to high quality). The Maximum Likelihood Difference Scaling (MLDS) solver is one such algorithm, working with triplets and quadruplets; a participant is asked to compare intervals between pairs of stimuli, (a,b) and (c,d), where a, b, c, d are stimuli forming a quadruplet. One limitation, however, is that the perceptual scales retrieved from stimuli of different contents are usually not comparable. We previously proposed a solution that measures an inter-content scale across multiple contents. This paper presents an open-source Python implementation of the method and demonstrates its use on three datasets collected in an in-lab environment. We compare the accuracy and effectiveness of the method using pairwise, triplet, and quadruplet comparisons for intra-content annotations. The code is available at: https://github.com/andreaspastor/MLDS_inter_content_scaling
Citations: 0
Video Decoding Performance and Requirements for XR Applications
Pub Date: 2023-06-07 · DOI: 10.1145/3587819.3593940
Emmanouil Potetsianakis, E. Thomas
Abstract: Designing XR applications creates challenges in the performance and scaling of media decoding operations and in the composition and synchronization of the various assets. Going beyond the single-decoder paradigm of conventional video applications, XR applications tend to compose more and more visual streams, such as 2D video assets but also textures and 2D/3D graphics encoded in video streams. All this demands robust and predictable decoder management and dynamic buffer organization. However, the behaviour of multiple decoder instances running in parallel is not yet well understood on mobile platforms. To this end, we present VidBench, a parallel video decoding performance measurement tool for mobile Android devices. With VidBench, we quantify the challenges for applications using parallel video decoding pipelines through objective measurements, and we illustrate subjectively the current state of decoding multiple media streams and the visual artefacts that can result from unmanaged parallel video pipelines. Test results provide hints on the feasibility and potential performance gain of technologies such as MPEG-I Part 13, the Video Decoding Interface for immersive media (VDI), to alleviate these problems. We briefly present the main goals of VDI, standardised by SC29 WG3, the Moving Picture Experts Group (MPEG) Systems, which introduces functions and related constraints for optimizing such decoding instances, as well as the video decoding APIs on which VDI builds, such as the Khronos Vulkan Video extension.
Citations: 0
Color-aware Deep Temporal Backdrop Duplex Matting System
Pub Date: 2023-06-05 · DOI: 10.1145/3587819.3590973
Hendrik Hachmann, B. Rosenhahn
Abstract: Deep learning-based alpha matting has shown tremendous improvements in recent years, yet feature film production studios still rely on classical chroma keying, including costly post-production steps. This discrepancy can be explained by missing links necessary for production that are currently not adequately addressed by the alpha matting community, in particular foreground color estimation and color spill compensation. We propose a neural network-based temporal multi-backdrop production system that combines beneficial features of chroma keying and alpha matting. Given two consecutive frames with different background colors, our one-encoder-dual-decoder network predicts foreground colors and alpha values using a patch-based overlap-blend approach. The system can handle imprecise backdrops, dynamic cameras, and dynamic foregrounds, and places no restrictions on foreground colors. We compare our method to state-of-the-art algorithms using benchmark datasets and a video sequence captured with a demonstrator setup, and we verify that a dual-backdrop input is superior to the usually applied trimap-based approach. In addition, the proposed studio set is actor-friendly and produces high-quality, temporally consistent alpha and color estimates with superior color spill compensation.
Citations: 0
TotalDefMeme: A Multi-Attribute Meme dataset on Total Defence in Singapore
Pub Date: 2023-05-29 · DOI: 10.1145/3587819.3592545
Nirmalendu Prakash, Ming Shan Hee, R. Lee
Abstract: Total Defence is a defence policy that combines and extends the concepts of military defence and civil defence. While several countries have adopted total defence as their defence policy, very few studies have investigated its effectiveness. With the rapid proliferation of social media and digitalisation, many social studies have focused on investigating policy effectiveness through specially curated surveys and questionnaires, whether in digital media or traditional forms. However, such instruments may not truly reflect underlying sentiment toward the target policies or initiatives of interest: people are more likely to express their sentiment through communication mediums such as starting topic threads on forums or sharing memes on social media. Using Singapore as a case reference, this study addresses this research gap by proposing TotalDefMeme, a large-scale multi-modal and multi-attribute meme dataset that captures public sentiment toward Singapore's Total Defence policy. Besides supporting social informatics and public policy analysis of the Total Defence policy, TotalDefMeme can also support many downstream multi-modal machine learning tasks, such as aspect-based stance classification and multi-modal meme clustering. We perform baseline machine learning experiments on TotalDefMeme to evaluate its technical validity, and we present possible future interdisciplinary research directions and application scenarios that use the dataset as a baseline.
Citations: 3