{"title":"Enabling Wide Adoption of Hyperspectral Imaging","authors":"N. Sharma","doi":"10.1145/3458305.3478465","DOIUrl":"https://doi.org/10.1145/3458305.3478465","url":null,"abstract":"Hyperspectral imaging systems capture information in multiple wavelength bands across the electromagnetic spectrum, providing substantial details of the materials present in the captured scene. However, the high cost of hyperspectral cameras makes the technology out of reach for end-user and small-scale commercial applications. The goal of my research is to enable hyperspectral imaging on mobile devices. In this extended abstract, I present the direction of research that I have followed during the first half of my PhD, along with ideas and work in progress for the second half. I propose a new system, called MobiSpectral, that turns a mobile device to a simple (hyper) spectral camera by extending its spectral sensitivity. I design new APIs for developers to write hyperspectral mobile applications. My main API is based on a deep-learning model to convert the captured images to hyperspectral images with multiple bands across the entire visible and near-infrared spectral range, revealing hidden information and enabling a myriad of new applications on mobile devices. My method is robust and can work in different illumination conditions.","PeriodicalId":138399,"journal":{"name":"Proceedings of the 12th ACM Multimedia Systems Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129170868","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving Server and Client-Side Algorithms for Adaptive Streaming of Non-Immersive and Immersive Media","authors":"Mehmet N. Akcay","doi":"10.1145/3458305.3478461","DOIUrl":"https://doi.org/10.1145/3458305.3478461","url":null,"abstract":"HTTP adaptive streaming is a technique widely used in the internet today to stream live and on-demand content. Server and client-side algorithms play an important role in achieving a better user experience in terms of metrics such as latency, rebufferings and rendering quality. In this doctoral study, we propose and evaluate a number of new algorithms for both non-immersive and immersive media in different settings ranging from low-latency live to on-demand streaming.","PeriodicalId":138399,"journal":{"name":"Proceedings of the 12th ACM Multimedia Systems Conference","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122322149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance of Low-Latency HTTP-based Streaming Players","authors":"Bo Zhang, Thiago Teixeira, Y. Reznik","doi":"10.1145/3458305.3478442","DOIUrl":"https://doi.org/10.1145/3458305.3478442","url":null,"abstract":"Reducing end-to-end streaming latency is critical to HTTP-based live video streaming. There are currently two technologies in this domain: Low-Latency HTTP Live Streaming (LL-HLS) and Low-Latency Dynamic Adaptive Streaming over HTTP (LL-DASH). Many players support LL-HLS and/or LL-DASH protocols, including Apple's AVPlayer, Shaka player, HLS.js Dash.js, and others. This paper is dedicated to the analysis of the performance of low-latency players and streaming protocols. The evaluation is based on a series of live streaming experiments, repeated using identical video content, encoders, encoding profiles, and network conditions, emulated by using traces of real-world networks. Several performance metrics, such as average stream bitrate, the amounts of downloaded media data, streaming latency, as well as buffering and stream switching statistics are captured and reported in our experiments. These results are subsequently used to describe the observed differences in the performance of LL-HLS and LL-DASH-based players.","PeriodicalId":138399,"journal":{"name":"Proceedings of the 12th ACM Multimedia Systems Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130745975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive Streaming Playback Statistics Dataset","authors":"Thiago Teixeira, Bo Zhang, Y. Reznik","doi":"10.1145/3458305.3478444","DOIUrl":"https://doi.org/10.1145/3458305.3478444","url":null,"abstract":"We propose dataset capturing statistics of several large-scale real-world streaming events, delivering videos to different devices (TVs, desktops, mobiles, tablets, etc.), and over different networks (from 2.5G, 3G, and other early generation mobile networks to 5G and broadband). The data we capture include network-related statistics, playback statistics (session- and player-event-level), and characteristics of the encoded streams. Such data should enable a broad level of possible applications and uses in the research community: from analysis of the effectiveness of algorithms in streaming players to studies of QoE metrics, and end-to-end system optimizations. Examples of such possible studies based on the proposed datasets are also provided.","PeriodicalId":138399,"journal":{"name":"Proceedings of the 12th ACM Multimedia Systems Conference","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131875422","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"QPlane: An Open-Source Reinforcement Learning Toolkit for Autonomous Fixed Wing Aircraft Simulation","authors":"David J. Richter, R. A. Calix","doi":"10.1145/3458305.3478446","DOIUrl":"https://doi.org/10.1145/3458305.3478446","url":null,"abstract":"Reinforcement Learning (RL) is a fast-growing field of research that is mostly applied in the realm of video games due to the compatibility of RL and game tasks. AI Gym has established itself as the gold standard toolkit for Reinforcement Learning research. Unfortunately, toolkits like AI Gym are very optimized for benchmark purposes and may not always be suitable for real world type problems. Additionally, fixed wing flight simulation has specific requirements and may need other solutions. In this paper, we propose QPlane as an alternative toolkit for RL training of fixed wing aircraft. QPlane was developed in an effort to create a RL toolkit for fixed wing aircraft simulation that is easily modifiable for different scenarios. QPlane is replicable and flexible for ease of implementation to high performance computing, and is modular for quick environment and algorithm replacement. In this paper we present and discuss details of QPlane, as well as proof of concept results.","PeriodicalId":138399,"journal":{"name":"Proceedings of the 12th ACM Multimedia Systems Conference","volume":"113 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133823750","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Policy-driven Dynamic HTTP Adaptive Streaming Player Environment","authors":"Minh Nguyen","doi":"10.1145/3458305.3478466","DOIUrl":"https://doi.org/10.1145/3458305.3478466","url":null,"abstract":"Video streaming services account for the majority of today's traffic on the Internet. Although the data transmission rate has been increasing significantly, the growing number and variety of media and higher quality expectations of users have led networked media applications to fully or even over-utilize the available throughput. HTTP Adaptive Streaming (HAS) has become a predominant technique for multimedia delivery over the Internet today. However, there are critical challenges for multimedia systems, especially the tradeoff between the increasing content (complexity) and various requirements regarding time (latency) and quality (QoE). This thesis will cover the main aspects within the end user's environment, including video consumption and interactivity, collectively referred to as player environment, which is probably the most crucial component in today's multimedia applications and services. We will investigate the methods that can enable the specification of various policies reflecting the user's needs in given use cases. Besides, we will also work on schemes that allow efficient support for server-assisted, and network-assisted HAS systems. Finally, those approaches will be considered to combine into policies that fit the requirements of all use cases (e.g., live streaming, video on demand, etc.).","PeriodicalId":138399,"journal":{"name":"Proceedings of the 12th ACM Multimedia Systems Conference","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125050750","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Full UHD 360-Degree Video Dataset and Modeling of Rate-Distortion Characteristics and Head Movement Navigation","authors":"Jacob Chakareski, Ridvan Aksu, Viswanathan Swaminathan, M. Zink","doi":"10.1145/3458305.3478447","DOIUrl":"https://doi.org/10.1145/3458305.3478447","url":null,"abstract":"We investigate the rate-distortion (R-D) characteristics of full ultra-high definition (UHD) 360° videos and capture corresponding head movement navigation data of virtual reality (VR) headsets. We use the navigation data to analyze how users explore the 360° look-around panorama for such content and formulate related statistical models. The developed R-D characteristics and modeling capture the spatiotemporal encoding efficiency of the content at multiple scales and can be exploited to enable higher operational efficiency in key use cases. The high quality expectations for next generation immersive media necessitate the understanding of these intrinsic navigation and content characteristics of full UHD 360° videos.","PeriodicalId":138399,"journal":{"name":"Proceedings of the 12th ACM Multimedia Systems Conference","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129859809","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A distributed, decoupled system for losslessly streaming dynamic light probes to thin clients","authors":"Michael Stengel, Z. Majercik, Ben Boudaoud, M. McGuire","doi":"10.1145/3458305.3463379","DOIUrl":"https://doi.org/10.1145/3458305.3463379","url":null,"abstract":"We present a networked, high-performance graphics system that combines dynamic, high-quality, ray traced global illumination computed on a server with direct illumination and primary visibility computed on a client. This approach provides many of the image quality benefits of real-time ray tracing on low-power and legacy hardware, while maintaining a low latency response and mobile form factor. As opposed to streaming full frames from rendering servers to end clients, our system distributes the graphics pipeline over a network by computing diffuse global illumination on a remote machine. Diffuse global illumination is computed using a recent irradiance volume representation combined with a new lossless, HEVC-based, hardware-accelerated encoding, and a perceptually-motivated update scheme. Our experimental implementation streams thousands of irradiance probes per second and requires less than 50 Mbps of throughput, reducing the consumed bandwidth by 99.4% when streaming at 60 Hz compared to traditional lossless texture compression. The bandwidth reduction achieved with our approach allows higher quality and lower latency graphics than state-of-the-art remote rendering via video streaming. In addition, our split-rendering solution decouples remote computation from local rendering and so does not limit local display update rate or display resolution.","PeriodicalId":138399,"journal":{"name":"Proceedings of the 12th ACM Multimedia Systems Conference","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134359688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"User-assisted video reflection removal","authors":"Amgad Ahmed, Suhong Kim, Mohamed A. Elgharib, M. Hefeeda","doi":"10.1145/3458305.3459597","DOIUrl":"https://doi.org/10.1145/3458305.3459597","url":null,"abstract":"Reflections in videos are obstructions that often occur when videos are taken behind reflective surfaces like glass. These reflections reduce the quality of such videos, lead to information loss and degrade the accuracy of many computer vision algorithms. A video containing reflections is a combination of background and reflection layers. Thus, reflection removal is equivalent to decomposing the video into two layers. This, however, is a challenging and ill-posed problem as there is an infinite number of valid decompositions. To address this problem, we propose a user-assisted method for video reflection removal. We rely on both spatial and temporal information and utilize sparse user hints to help improve separation. The proposed method removes complex reflections in videos by including the user in the loop. The method is flexible and can accept various levels of user annotations, within each frame and in the number of frames being annotated. The user provides some strokes in some of the frames in the video, and our method propagates these strokes within the frame using a random walk computation as well as across frames using a point-based motion tracking method. We implement and evaluate the proposed method through quantitative and qualitative results on real and synthetic videos. Our experiments show that the proposed method successfully removes reflection from video sequences, does not introduce visual distortions, and significantly outperforms the state-of-the-art reflection removal methods in the literature.","PeriodicalId":138399,"journal":{"name":"Proceedings of the 12th ACM Multimedia Systems Conference","volume":"214 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123235991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AMP: authentication of media via provenance","authors":"P. England, Henrique S. Malvar, E. Horvitz, J. W. Stokes, C. Fournet, A. Chamayou, S. Clebsch, Manuel Costa, S. Erfani, K. Kane, A. Shamis","doi":"10.1145/3458305.3459599","DOIUrl":"https://doi.org/10.1145/3458305.3459599","url":null,"abstract":"Advances in graphics and machine learning have led to the general availability of easy-to-use tools for modifying and synthesizing media. The proliferation of these tools threatens to cast doubt on the veracity of all media. One approach to thwarting the flow of fake media is to detect modified or synthesized media through machine learning methods. While detection may help in the short term, we believe that it is destined to fail as the quality of fake media generation continues to improve. Soon, neither humans nor algorithms will be able to reliably distinguish fake versus real content. Thus, pipelines for assuring the source and integrity of media will be required---and increasingly relied upon. We present AMP, a system that ensures the authentication of media via certifying provenance. AMP creates one or more publisher-signed manifests for a media instance uploaded by a content provider. These manifests are stored in a database allowing fast lookup from applications such as browsers. For reference, the manifests are also registered and signed by a permissioned ledger, implemented using the Confidential Consortium Framework (CCF). CCF employs both software and hardware techniques to ensure the integrity and transparency of all registered manifests. AMP, through its use of CCF, enables a consortium of media providers to govern the service while making all its operations auditable. The authenticity of the media can be communicated to the user via visual elements in the browser, indicating that an AMP manifest has been successfully located and verified.","PeriodicalId":138399,"journal":{"name":"Proceedings of the 12th ACM Multimedia Systems Conference","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124021991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}