{"title":"Bandwidth-Aware High-Efficiency Video Coding Design Scheme on a Multiprocessor System on Chip","authors":"Jui-Hung Hsieh, Zhi-Yu Zhang, Jing-Cheng Syu, Mao-Cheng Hsieh","doi":"10.1109/MMUL.2023.3253521","DOIUrl":"https://doi.org/10.1109/MMUL.2023.3253521","url":null,"abstract":"H.265/high-efficiency video coding (HEVC) provides highly efficient video data compression that minimizes data storage and transmission requirements while preserving video coding quality and reducing coding bit rates. However, HEVC encoder chips are frequently integrated into mobile multiprocessor system-on-chip (MPSoC) systems that adopt intelligent thermal and power management techniques to reduce heat and power dissipation. Consequently, the coding bandwidth (CB) accessible to the HEVC encoder chip is not fixed, and the compressed video data transmitted within MPSoCs are restricted to time-varying wireless transmission bandwidths (TBs). Therefore, the proposed bandwidth-aware H.265/HEVC controller design solves the video coding problems of limited CB and TB by jointly using a machine learning method and convex optimization. The experimental and implementation results demonstrate that the proposed CB-TB rate-coding distortion algorithm modeling and the very large-scale integration hardware architecture are applicable to CB- and TB-constrained HEVC encoder design within MPSoCs.","PeriodicalId":13240,"journal":{"name":"IEEE MultiMedia","volume":"30 1","pages":"37-47"},"PeriodicalIF":3.2,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46464590","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Improved Interaction Estimation and Optimization Method for Surveillance Video Synopsis","authors":"K. Namitha, M. Geetha, N. Revathi","doi":"10.1109/MMUL.2022.3224874","DOIUrl":"https://doi.org/10.1109/MMUL.2022.3224874","url":null,"abstract":"Video synopsis is an efficient technique for condensing long-duration videos into short videos. The interactions between moving objects in the original video need to be preserved during video condensation. However, identifying objects with strong spatio-temporal proximity from a monocular video frame is challenging. Further, the tube rearrangement optimization process is also vital for reducing collision rates among moving objects. Taking the aforementioned aspects into consideration, we present a comprehensive video synopsis framework. First, we propose an interaction detection method that estimates distortionless spatio-temporal interactions between moving objects by generating the top view of a scene using a perspective transformation. Second, we propose an optimization method that reduces collisions and preserves object interactions by shrinking the search space. The experimental results demonstrate that the proposed framework provides a better estimate of object interactions from surveillance videos and generates synopsis videos with fewer collisions while preserving the original interactions.","PeriodicalId":13240,"journal":{"name":"IEEE MultiMedia","volume":"30 1","pages":"25-36"},"PeriodicalIF":3.2,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45610645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Taking a “Deep” Look at Multimedia Streaming","authors":"Balakrishnan Prabhakaran","doi":"10.1109/mmul.2023.3308401","DOIUrl":"https://doi.org/10.1109/mmul.2023.3308401","url":null,"abstract":"Streaming multimedia content has become an integral part of our lives, influencing the way we consume daily news, communicate with friends, family, and colleagues, and entertain ourselves. The quality of multimedia content has been improving by leaps and bounds with advances in camera and other sensing technologies. In parallel, advances in multimedia display technologies have been equally remarkable, providing a vast choice of affordable high-definition devices in a wide range of sizes. The quality of service (QoS) offered by Internet service providers has experienced impressive growth as well. All these factors have led to a huge surge in the multimedia streaming sessions that must be supported on the Internet. Advances in deep machine learning (ML) techniques have been successfully leveraged to manage this unprecedented usage of multimedia streaming. However, as the various factors influencing multimedia streaming continue to evolve, continuous research is needed to adapt new deep learning techniques for efficient multimedia streaming.","PeriodicalId":13240,"journal":{"name":"IEEE MultiMedia","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135852423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IEEE Annals of the History of Computing","authors":"","doi":"10.1109/mmul.2023.3309015","DOIUrl":"https://doi.org/10.1109/mmul.2023.3309015","url":null,"abstract":"","PeriodicalId":13240,"journal":{"name":"IEEE MultiMedia","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135852553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reversible Modal Conversion Model for Thermal Infrared Tracking","authors":"Yufei Zha, Fan Li, Huanyu Li, Peng Zhang, Wei Huang","doi":"10.1109/MMUL.2023.3239136","DOIUrl":"https://doi.org/10.1109/MMUL.2023.3239136","url":null,"abstract":"Learning a powerful CNN representation of the target is a key issue for thermal infrared (TIR) tracking. The lack of massive TIR training data is one of the obstacles to training the network end to end from scratch. Compared to the time-consuming and labor-intensive approach of heavily relabeling data, we obtain trainable TIR images by leveraging massive annotated RGB images in this article. Unlike traditional image generation models, a modal reversible module is designed to maximize the information propagation between the RGB and TIR modals in this work. The advantage is that this module can preserve as much modal information as possible when the network is trained on a large number of aligned RGBT image pairs. Additionally, the fake-TIR features generated by the proposed module are also integrated to enhance the target representation ability when TIR tracking is performed on the fly. To verify the proposed method, we conduct extensive experiments on both single-modal TIR and multimodal RGBT tracking datasets. In single-modal TIR tracking, the success rate of our method is improved by 2.8% and 0.94% over the SOTA on the LSOTB-TIR and PTB-TIR datasets, respectively. In multimodal RGBT fusion tracking, the proposed method is tested on the RGBT234 and VOT-RGBT2020 datasets, and the results also reach SOTA performance.","PeriodicalId":13240,"journal":{"name":"IEEE MultiMedia","volume":"30 1","pages":"8-24"},"PeriodicalIF":3.2,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48643610","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Computing in Science & Engineering","authors":"","doi":"10.1109/mmul.2023.3311108","DOIUrl":"https://doi.org/10.1109/mmul.2023.3311108","url":null,"abstract":"","PeriodicalId":13240,"journal":{"name":"IEEE MultiMedia","volume":"120 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135852545","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PP8K: A New Dataset for 8K UHD Video Compression and Processing","authors":"Wei Gao, Hang Yuan, Guibiao Liao, Zixuan Guo, Jianing Chen","doi":"10.1109/MMUL.2023.3269459","DOIUrl":"https://doi.org/10.1109/MMUL.2023.3269459","url":null,"abstract":"In the new era of ultra-high-definition (UHD) video, 8K is becoming increasingly popular in diversified applications to boost the human visual experience and the performance of related vision tasks. However, researchers still suffer from a lack of 8K video sources for developing better processing algorithms for compression, saliency detection, quality assessment, and vision analysis tasks. To ameliorate this situation, we construct a new comprehensive 8K UHD video dataset with two sub-datasets, i.e., the common raw format videos (CRFV) dataset and the video salient object detection (VSOD) dataset. To fully validate its diversity and practicality, the spatial and temporal information characteristics of the CRFV dataset are evaluated with widely used metrics and a video encoder. Through extensive experiments and comparative analyses against counterpart datasets, the proposed 8K dataset shows clear advantages in diversity and practicality, which can benefit the development of UHD video technologies. This dataset has been released online: https://git.openi.org.cn/OpenDatasets/PP8K.","PeriodicalId":13240,"journal":{"name":"IEEE MultiMedia","volume":"30 1","pages":"100-109"},"PeriodicalIF":3.2,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46971638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
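Each entry in this listing is a standalone JSON object with `title`, `authors`, `doi`, and `journal.pages` fields, among others. A minimal sketch of extracting the citation-relevant fields from one record (the sample below is trimmed to a few fields of the PP8K entry above; real records carry many more keys, which `json.loads` handles identically):

```python
import json

# One record per line; this sample keeps only the fields used below.
record_line = (
    '{"title":"PP8K: A New Dataset for 8K UHD Video Compression and Processing",'
    '"authors":"Wei Gao, Hang Yuan, Guibiao Liao, Zixuan Guo, Jianing Chen",'
    '"doi":"10.1109/MMUL.2023.3269459",'
    '"journal":{"name":"IEEE MultiMedia","volume":"30 1","pages":"100-109"}}'
)

record = json.loads(record_line)

# Assemble a simple one-line citation from the parsed fields.
citation = (
    f'{record["authors"]}. "{record["title"]}." '
    f'{record["journal"]["name"]}, pp. {record["journal"]["pages"]}. '
    f'doi:{record["doi"]}'
)
print(citation)
```

Applied line by line over the dump, this yields a clean reference list; unused keys (e.g., `RegionCategory`, `paperid`) are simply ignored.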