Proceedings of the 14th Conference on ACM Multimedia Systems: Latest Papers

Factors Influencing Video Quality of Experience in Ecologically Valid Experiments: Measurements and a Theoretical Model
Proceedings of the 14th Conference on ACM Multimedia Systems Pub Date : 2023-06-07 DOI: 10.1145/3587819.3593027
Kamil Koniuch
{"title":"Factors Influencing Video Quality of Experience in Ecologically Valid Experiments: Measurements and a Theoretical Model","authors":"Kamil Koniuch","doi":"10.1145/3587819.3593027","DOIUrl":"https://doi.org/10.1145/3587819.3593027","url":null,"abstract":"Users' perception of multimedia quality and satisfaction with multimedia services are the subject of various studies in the field of Quality of Experience (QoE). In this respect, subjective studies of quality represent an important part of the multimedia optimization process. However, researchers who measure QoE have to face its multidimensional character and address the fact that quality perception is influenced by numerous factors. To address this issue, experiments measuring QoE often limit the scope of factors influencing subjective judgments by administering laboratory protocols. However, the generalizability of the results gathered with such protocols is limited. The proposed PhD dissertation aims to address this challenge. In order to increase the generalizability of QoE studies we started with an identification of factors influencing user multimedia experience in a natural context. We proposed a new theoretical model of video QoE based on both original research and a literature review. This new theoretical framework allowed us to propose new experimental designs introducing influencing factors one by one in an additive manner. Thanks to the model, we can also propose comparable experiments which could differ in ecological validity. 
The proposed theoretical framework can be adjusted to other multimedia in the future.","PeriodicalId":330983,"journal":{"name":"Proceedings of the 14th Conference on ACM Multimedia Systems","volume":"422 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122512197","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Self-Supervised Contrastive Learning for Robust Audio-Sheet Music Retrieval Systems
Proceedings of the 14th Conference on ACM Multimedia Systems Pub Date : 2023-06-07 DOI: 10.1145/3587819.3590968
Luis Carvalho, Tobias Washüttl, G. Widmer
{"title":"Self-Supervised Contrastive Learning for Robust Audio-Sheet Music Retrieval Systems","authors":"Luis Carvalho, Tobias Washüttl, G. Widmer","doi":"10.1145/3587819.3590968","DOIUrl":"https://doi.org/10.1145/3587819.3590968","url":null,"abstract":"Linking sheet music images to audio recordings remains a key problem for the development of efficient cross-modal music retrieval systems. One of the fundamental approaches toward this task is to learn a cross-modal embedding space via deep neural networks that is able to connect short snippets of audio and sheet music. However, the scarcity of annotated data from real musical content affects the capability of such methods to generalize to real retrieval scenarios. In this work, we investigate whether we can mitigate this limitation with self-supervised contrastive learning, by exposing a network to a large amount of real music data as a pre-training step, by contrasting randomly augmented views of snippets of both modalities, namely audio and sheet images. Through a number of experiments on synthetic and real piano data, we show that pretrained models are able to retrieve snippets with better precision in all scenarios and pre-training configurations. Encouraged by these results, we employ the snippet embeddings in the higher-level task of cross-modal piece identification and conduct more experiments on several retrieval configurations. In this task, we observe that the retrieval quality improves from 30% up to 100% when real music data is present. We then conclude by arguing for the potential of self-supervised contrastive learning for alleviating the annotated data scarcity in multi-modal music retrieval models. 
Code and trained models are accessible at https://github.com/luisfvc/ucasr.","PeriodicalId":330983,"journal":{"name":"Proceedings of the 14th Conference on ACM Multimedia Systems","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122561685","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
The ADΔER Framework: Tools for Event Video Representations
Proceedings of the 14th Conference on ACM Multimedia Systems Pub Date : 2023-06-07 DOI: 10.1145/3587819.3593028
Andrew C. Freeman
{"title":"The ADΔER Framework: Tools for Event Video Representations","authors":"Andrew C. Freeman","doi":"10.1145/3587819.3593028","DOIUrl":"https://doi.org/10.1145/3587819.3593028","url":null,"abstract":"The concept of \"video\" is synonymous with frame-sequence image representations. However, neuromorphic \"event\" cameras, which are rapidly gaining adoption for computer vision tasks, record frameless video. We believe that these different paradigms of video capture can each benefit from the lessons of the other. To usher in the next era of video systems and accommodate new event camera designs, we argue that we will need an asynchronous, source-agnostic processing pipeline. In this paper, we propose an end-to-end framework for frameless video, and we describe its modularity and amenability to compression and both existing and future applications.","PeriodicalId":330983,"journal":{"name":"Proceedings of the 14th Conference on ACM Multimedia Systems","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115685817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
QoE- and Energy-aware Content Consumption For HTTP Adaptive Streaming
Proceedings of the 14th Conference on ACM Multimedia Systems Pub Date : 2023-06-07 DOI: 10.1145/3587819.3593029
Daniele Lorenzi
{"title":"QoE- and Energy-aware Content Consumption For HTTP Adaptive Streaming","authors":"Daniele Lorenzi","doi":"10.1145/3587819.3593029","DOIUrl":"https://doi.org/10.1145/3587819.3593029","url":null,"abstract":"Video streaming services account for the majority of today's traffic on the Internet, and according to recent studies, this share is expected to continue growing. Given this broad utilization, research in video streaming is recently moving towards energy-aware approaches, which aim at reducing the energy consumption of the devices involved in the streaming process. On the other side, the perception of quality delivered to the user plays an important role, and the advent of HTTP Adaptive Streaming (HAS) changed the way quality is perceived. The focus is not any more exclusively on the Quality of Service (QoS) but rather oriented towards the Quality of Experience (QoE) of the user taking part in the streaming session. Therefore video streaming services need to develop Adaptive BitRate (ABR) techniques to deal with different network conditions on the client side or appropriate end-to-end strategies to provide high QoE to the users. The scope of this doctoral study is within the end-to-end environment with a focus on the end-users domain, referred to as the player environment, including video content consumption and interactivity. This thesis aims to investigate and develop different techniques to increase the delivered QoE to the users and minimize the energy consumption of the end devices in HAS context. 
We present four main research questions to target the related challenges in the domain of content consumption for HAS systems.","PeriodicalId":330983,"journal":{"name":"Proceedings of the 14th Conference on ACM Multimedia Systems","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121422035","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
A Dataset for User Visual Behaviour with Multi-View Video Content
Proceedings of the 14th Conference on ACM Multimedia Systems Pub Date : 2023-06-07 DOI: 10.1145/3587819.3592556
Tiago Soares da Costa, M. T. Andrade, Paula Viana, Nuno Castro Silva
{"title":"A Dataset for User Visual Behaviour with Multi-View Video Content","authors":"Tiago Soares da Costa, M. T. Andrade, Paula Viana, Nuno Castro Silva","doi":"10.1145/3587819.3592556","DOIUrl":"https://doi.org/10.1145/3587819.3592556","url":null,"abstract":"Immersive video applications impose unpractical bandwidth requirements for best-effort networks. With Multi-View (MV) streaming, these can be minimized by resorting to view prediction techniques. SmoothMV is a multi-view system that uses a non-intrusive head tracking mechanism to detect the viewer's interest and select appropriate views. By coupling Neural Networks (NNs) to anticipate the viewer's interest, a reduction of view-switching latency is likely to be obtained. The objective of this paper is twofold: 1) Present a solution for acquisition of gaze data from users when viewing MV content; 2) Describe a dataset, collected with a large-scale testbed, capable of being used to train NNs to predict the user's viewing interest. Tracking data from head movements was obtained from 45 participants using an Intel Realsense F200 camera, with 7 video playlists, each being viewed a minimum of 17 times. This dataset is publicly available to the research community and constitutes an important contribution to reducing the current scarcity of such data. 
Tools to obtain saliency/heat maps and generate complementary plots are also provided as an open-source software package.","PeriodicalId":330983,"journal":{"name":"Proceedings of the 14th Conference on ACM Multimedia Systems","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125075065","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Everybody Compose: Deep Beats To Music
Proceedings of the 14th Conference on ACM Multimedia Systems Pub Date : 2023-06-07 DOI: 10.1145/3587819.3592542
Conghao Shen, Violet Z. Yao, Yixin Liu
{"title":"Everybody Compose: Deep Beats To Music","authors":"Conghao Shen, Violet Z. Yao, Yixin Liu","doi":"10.1145/3587819.3592542","DOIUrl":"https://doi.org/10.1145/3587819.3592542","url":null,"abstract":"This project presents a deep learning approach to generate monophonic melodies based on input beats, allowing even amateurs to create their own music compositions. Three effective methods - LSTM with Full Attention, LSTM with Local Attention, and Transformer with Relative Position Representation - are proposed for this novel task, providing great variation, harmony, and structure in the generated music. This project allows anyone to compose their own music by tapping their keyboards or \"recoloring\" beat sequences from existing works.","PeriodicalId":330983,"journal":{"name":"Proceedings of the 14th Conference on ACM Multimedia Systems","volume":"154 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128165584","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
IDCIA: Immunocytochemistry Dataset for Cellular Image Analysis
Proceedings of the 14th Conference on ACM Multimedia Systems Pub Date : 2023-06-07 DOI: 10.1145/3587819.3592558
Abdurahman Ali Mohammed, Catherine Fonder, D. Sakaguchi, Wallapak Tavanapong, S. Mallapragada, A. Idris
{"title":"IDCIA: Immunocytochemistry Dataset for Cellular Image Analysis","authors":"Abdurahman Ali Mohammed, Catherine Fonder, D. Sakaguchi, Wallapak Tavanapong, S. Mallapragada, A. Idris","doi":"10.1145/3587819.3592558","DOIUrl":"https://doi.org/10.1145/3587819.3592558","url":null,"abstract":"We present a new annotated microscopic cellular image dataset to improve the effectiveness of machine learning methods for cellular image analysis. Cell counting is an important step in cell analysis. Typically, domain experts manually count cells in a microscopic image. Automated cell counting can potentially eliminate this tedious, time-consuming process. However, a good, labeled dataset is required for training an accurate machine learning model. Our dataset includes microscopic images of cells, and for each image, the cell count and the location of individual cells. The data were collected as part of an ongoing study investigating the potential of electrical stimulation to modulate stem cell differentiation and possible applications for neural repair. Compared to existing publicly available datasets, our dataset has more images of cells stained with more variety of antibodies (protein components of immune responses against invaders) typically used for cell analysis. The experimental results on this dataset indicate that none of the five existing models under this study are able to achieve sufficiently accurate count to replace the manual methods. 
The dataset is available at https://figshare.com/articles/dataset/Dataset/21970604.","PeriodicalId":330983,"journal":{"name":"Proceedings of the 14th Conference on ACM Multimedia Systems","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132210735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
VOLVQAD: An MPEG V-PCC Volumetric Video Quality Assessment Dataset
Proceedings of the 14th Conference on ACM Multimedia Systems Pub Date : 2023-06-07 DOI: 10.1145/3587819.3592543
Samuel Rhys Cox, May Lim, Wei Tsang Ooi
{"title":"VOLVQAD: An MPEG V-PCC Volumetric Video Quality Assessment Dataset","authors":"Samuel Rhys Cox, May Lim, Wei Tsang Ooi","doi":"10.1145/3587819.3592543","DOIUrl":"https://doi.org/10.1145/3587819.3592543","url":null,"abstract":"We present VOLVQAD, a volumetric video quality assessment dataset consisting 7,680 ratings on 376 video sequences from 120 participants. The volumetric video sequences are first encoded with MPEG V-PCC using 4 different avatar models and 16 quality variations, and then rendered into test videos for quality assessment using 2 different background colors and 16 different quality switching patterns. The dataset is useful for researchers who wish to understand the impact of volumetric video compression on subjective quality. Analysis of the collected data are also presented in this paper.","PeriodicalId":330983,"journal":{"name":"Proceedings of the 14th Conference on ACM Multimedia Systems","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121210017","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
VAST: A Decentralized Open-Source Publish/Subscribe Architecture
Proceedings of the 14th Conference on ACM Multimedia Systems Pub Date : 2023-06-07 DOI: 10.1145/3587819.3592554
Victory Opeolu, H. Engelbrecht, Shun-Yun Hu, C. Marais
{"title":"VAST: A Decentralized Open-Source Publish/Subscribe Architecture","authors":"Victory Opeolu, H. Engelbrecht, Shun-Yun Hu, C. Marais","doi":"10.1145/3587819.3592554","DOIUrl":"https://doi.org/10.1145/3587819.3592554","url":null,"abstract":"Publish/Subscribe (pub/sub) systems have been widely adopted in highly scalable environments. We see this especially with IoT/IIoT applications, an environment where low bandwidth and high latency is ideal. The projected growth of Iot/IIoT network nodes are in the billions in the next few years and as such, there is a need for network communication standards that can adapt to the evergrowing nature of this industry. While current pub/sub standards have produced positive results so far, they all adopt a \"topic\" based pub/sub approach. They do not leverage off modern devices having spatial information. Current open-source standards also focus heavily on centralized brokering of information. This makes the broker in this system a potential bottleneck as it means if that broker goes down, the entire network goes down. We have developed a new, unique and innovative open-source pub/sub standard called VAST that leverages spatial information of modern network devices to perform message communication. It uses a unique concept called Spatial Publish/Subscribe (SPS). It is built on a peer-to-peer network to enable high scalability. In addition to this, it provides a Voronoi Overlay to efficiently distribute the messages, ensuring that network brokers are not overloaded with requests and ensures the network self-organizes itself if one or more brokers break down. It also has a forwarding algorithm to eliminate redundancies in the network. We will demonstrate this concept with a simulator we developed. We will show how the simulator works and how to use it. We believe that with this simulator, we will help encourage researchers adopt this technology for their spatial applications. 
An example of such is Massively Multi-user Virtual Environments (MMVEs), where there is a need for a high number of spatial network nodes in virtual environments.","PeriodicalId":330983,"journal":{"name":"Proceedings of the 14th Conference on ACM Multimedia Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129099023","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
FleXR: A System Enabling Flexibly Distributed Extended Reality
Proceedings of the 14th Conference on ACM Multimedia Systems Pub Date : 2023-06-07 DOI: 10.1145/3587819.3590966
Jin Heo, Ketan Bhardwaj, Ada Gavrilovska
{"title":"FleXR: A System Enabling Flexibly Distributed Extended Reality","authors":"Jin Heo, Ketan Bhardwaj, Ada Gavrilovska","doi":"10.1145/3587819.3590966","DOIUrl":"https://doi.org/10.1145/3587819.3590966","url":null,"abstract":"Extended reality (XR) applications require computationally demanding functionalities with low end-to-end latency and high throughput. To enable XR on commodity devices, a number of distributed systems solutions enable offloading of XR workloads on remote servers. However, they make a priori decisions regarding the offloaded functionalities based on assumptions about operating factors, and their benefits are restricted to specific deployment contexts. To realize the benefits of offloading in various distributed environments, we present a distributed stream processing system, FleXR, which is specialized for real-time and interactive workloads and enables flexible distributions of XR functionalities. In building FleXR, we identified and resolved several issues of presenting XR functionalities as distributed pipelines. FleXR provides a framework for flexible distribution of XR pipelines while streamlining development and deployment phases. We evaluate FleXR with three XR use cases in four different distribution scenarios. 
In the results, the best-case distribution scenario shows up to 50% less end-to-end latency and 3.9x pipeline throughput compared to alternatives.","PeriodicalId":330983,"journal":{"name":"Proceedings of the 14th Conference on ACM Multimedia Systems","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127311708","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1