IEEE MultiMedia | Pub Date: 2023-07-01 | DOI: 10.1109/MMUL.2023.3277851
Benjamin W. Wah, Jingxi X. Xu
{"title":"Optimizing Multidimensional Perceptual Quality in Online Interactive Multimedia","authors":"Benjamin W. Wah, Jingxi X. Xu","doi":"10.1109/MMUL.2023.3277851","DOIUrl":"https://doi.org/10.1109/MMUL.2023.3277851","url":null,"abstract":"Network latencies and losses in online interactive multimedia applications may lead to a degraded perception of quality, such as lower interactivity or sluggish responses. We can measure these degradations in perceptual quality by the just-noticeable difference, awareness, or probability of noticeability ($p_{text{note}}$pnote); the latter measures the likelihood that subjects can notice a change from a reference to a modified reference. In our previous work, we developed an efficient method for finding the perceptual quality for one metric under simplex control. However, integrating the perceptual qualities of several metrics is a heuristic. In this article, we present a formal approach to optimally combine the perceptual quality of multiple metrics into a joint measure that shows their tradeoffs. Our result shows that the optimal balance occurs when the $p_{text{note}}$pnote of all the component metrics are equal. Furthermore, our approach leads to an algorithm with a linear (instead of combinatorial) complexity of the number of metrics. Finally, we present the application of our method in two case studies, one on VoIP for finding the optimal operating points and the second on fast-action games to hide network delays while maintaining the consistency of action orders.","PeriodicalId":13240,"journal":{"name":"IEEE MultiMedia","volume":"30 1","pages":"119-128"},"PeriodicalIF":3.2,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47142163","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IEEE MultiMedia | Pub Date: 2023-07-01 | DOI: 10.1109/MMUL.2022.3224874
K. Namitha, M. Geetha, N. Revathi
{"title":"An Improved Interaction Estimation and Optimization Method for Surveillance Video Synopsis","authors":"K. Namitha, M. Geetha, N.Rev athi","doi":"10.1109/MMUL.2022.3224874","DOIUrl":"https://doi.org/10.1109/MMUL.2022.3224874","url":null,"abstract":"Videos synopsis is an efficient technique for condensing long-duration videos into short videos. The interactions between moving objects in the original video need to be preserved during video condensation. However, identifying objects with strong spatio-temporal proximity from a monocular video frame is a challenge. Further, the process of tube rearrangement optimization is also vital for the reduction of collision rates among moving objects. Taking the aforementioned aspects into consideration, we present a comprehensive video synopsis framework. First, we propose an interaction detection method to estimate distortion less spatio-temporal interactions between moving objects by generating the top view of a scene using a perspective transformation. Second, we propose an optimization method to reduce collisions and preserve object interactions by shrinking the search space. The experimental results demonstrate that the proposed framework provides a better estimate for object interactions from surveillance videos and generates synopsis videos with fewer collisions while preserving original interactions.","PeriodicalId":13240,"journal":{"name":"IEEE MultiMedia","volume":"30 1","pages":"25-36"},"PeriodicalIF":3.2,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45610645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IEEE MultiMedia | Pub Date: 2023-07-01 | DOI: 10.1109/mmul.2023.3308401
Balakrishnan Prabhakaran
{"title":"Taking a “Deep” Look at Multimedia Streaming","authors":"Balakrishnan Prabhakaran","doi":"10.1109/mmul.2023.3308401","DOIUrl":"https://doi.org/10.1109/mmul.2023.3308401","url":null,"abstract":"Streaming multimedia content has become an integral part of our lives influencing the way we consume daily news, communicate with friends, family and in office, and entertain ourselves. Quality of multimedia content has been improving by leaps and bounds with advances in camera and other sensing technologies. In parallel, advances in multimedia display technologies have been equally amazing providing vast choice of affordable high-definition devices of a wide range of sizes. Quality of service (QoS) offered by Internet service providers has experienced impressive growth as well. All these factors have led to a huge surge on multimedia streaming sessions that need to be supported on the Internet. Advances in deep machine learning (ML) techniques have been successfully leveraged to manage the unprecedented usage of multimedia streaming. However, as the various factors influencing multimedia streaming continue to evolve, continuous research is needed to adopt new deep learning techniques for efficient multimedia streaming.","PeriodicalId":13240,"journal":{"name":"IEEE MultiMedia","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135852423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IEEE MultiMedia | Pub Date: 2023-07-01 | DOI: 10.1109/MMUL.2023.3239136
Yufei Zha, Fan Li, Huanyu Li, Peng Zhang, Wei Huang
{"title":"Reversible Modal Conversion Model for Thermal Infrared Tracking","authors":"Yufei Zha, Fan Li, Huanyu Li, Peng Zhang, Wei Huang","doi":"10.1109/MMUL.2023.3239136","DOIUrl":"https://doi.org/10.1109/MMUL.2023.3239136","url":null,"abstract":"Learning powerful CNN representation of the target is a key issue for thermal infrared (TIR) tracking. The lack of massive training TIR data is one of the obstacles to training the network in an end-to-end way from the scratch. Compared to the time-consuming and labor-intensive method of heavily relabeling data, we obtain trainable TIR images by leveraging the massive annotated RGB images in this article. Unlike the traditional image generation models, a modal reversible module is designed to maximize the information propagation between RGB and TIR modals in this work. The advantage is that this module can preserve the modal information as possible when the network is conducted on a large number of aligned RGBT image pairs. Additionally, the fake-TIR features generated by the proposed module are also integrated to enhance the target representation ability when TIR tracking is on-the-fly. To verify the proposed method, we conduct sufficient experiments on both single-modal TIR and multimodal RGBT tracking datasets. In single-modal TIR tracking, the performance of our method is improved by 2.8% and 0.94% on success rate compared with the SOTA on LSOTB-TIR and PTB-TIR dataset. In multimodal RGBT fusion tracking, the proposed method is tested on the RGBT234 and VOT-RGBT2020 datasets and the results have also reached the performance of SOTA.","PeriodicalId":13240,"journal":{"name":"IEEE MultiMedia","volume":"30 1","pages":"8-24"},"PeriodicalIF":3.2,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48643610","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IEEE MultiMedia | Pub Date: 2023-07-01 | DOI: 10.1109/MMUL.2023.3269459
Wei Gao, Hang Yuan, Guibiao Liao, Zixuan Guo, Jianing Chen
{"title":"PP8K: A New Dataset for 8K UHD Video Compression and Processing","authors":"Wei Gao, Hang Yuan, Guibiao Liao, Zixuan Guo, Jianing Chen","doi":"10.1109/MMUL.2023.3269459","DOIUrl":"https://doi.org/10.1109/MMUL.2023.3269459","url":null,"abstract":"In the new era of ultra-high definition (UHD) videos, 8K is becoming more popular in diversified applications to boost the human visual experience and the performances of related vision tasks. However, researchers still suffer from the lack of 8K video sources to develop better processing algorithms for the compression, saliency detection, quality assessment, and vision analysis tasks. To ameliorate this situation, we construct a new comprehensive 8K UHD video dataset, which has two sub-datasets, i.e., the common raw format videos (CRFV) dataset and the video salient object detection (VSOD) dataset. To fully validate the diversity and practicality, the spatial and temporal information characteristics of the CRFV dataset are evaluated by the widely used metrics and the video encoder. Through the extensive experiments and comparative analyses with the other counterpart datasets, the proposed 8K dataset shows apparent advantages in diversity and practicality, which can benefit its applications for the developments of UHD video technologies. This dataset has been released online: https://git.openi.org.cn/OpenDatasets/PP8K.","PeriodicalId":13240,"journal":{"name":"IEEE MultiMedia","volume":"30 1","pages":"100-109"},"PeriodicalIF":3.2,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46971638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}