IEEE MultiMedia Pub Date: 2023-07-01 DOI: 10.1109/MMUL.2023.3270035
Peilin Chen, Wenhan Yang, Shiqi Wang
{"title":"Reviving Standard-Dynamic-Range Videos for High-Dynamic-Range Devices: A Learning Paradigm With Hybrid Attention Mechanisms","authors":"Peilin Chen, Wenhan Yang, Shiqi Wang","doi":"10.1109/MMUL.2023.3270035","DOIUrl":"https://doi.org/10.1109/MMUL.2023.3270035","url":null,"abstract":"With the prevalence of high-dynamic-range (HDR) display devices, the demand to convert existing standard-dynamic-range television (SDRTV) video content to its corresponding HDR television (HDRTV) counterpart is growing exponentially. Herein, we propose a two-stage learning paradigm with hybrid attention mechanisms to fully exploit spatial, channelwise, and regional correlations for faithfully driving such conversion. Specifically, in the first domain-mapping stage, the depthwise self-attention and global calibration layer are proposed, which adaptively leverage feature intrarelationships to construct better scene representation and achieve engaging SDRTV-to-HDRTV transformation. In the second highlight-generation stage, considering that the overexposed regions potentially lead to detail loss, which brings enormous challenges to the conversion, we propose a regional self-attention module to specifically restore missing highlights. Extensive experimental results on public databases show that our method outperforms state-of-the-art approaches in terms of different quality evaluation measures.","PeriodicalId":13240,"journal":{"name":"IEEE MultiMedia","volume":"30 1","pages":"110-118"},"PeriodicalIF":3.2,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43430333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IEEE MultiMedia Pub Date: 2023-07-01 DOI: 10.1109/mmul.2023.3308997
{"title":"Drive Diversity & Inclusion in Computing","authors":"","doi":"10.1109/mmul.2023.3308997","DOIUrl":"https://doi.org/10.1109/mmul.2023.3308997","url":null,"abstract":"","PeriodicalId":13240,"journal":{"name":"IEEE MultiMedia","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135852549","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IEEE MultiMedia Pub Date: 2023-07-01 DOI: 10.1109/MMUL.2023.3247522
Kouros Zanbouri, H. M. Al-Khafaji, N. J. Navimipour, Senay Yalçin
{"title":"A New Fog-Based Transmission Scheduler on the Internet of Multimedia Things Using a Fuzzy-Based Quantum Genetic Algorithm","authors":"Kouros Zanbouri, H. M. Al-Khafaji, N. J. Navimipour, Senay Yalçin","doi":"10.1109/MMUL.2023.3247522","DOIUrl":"https://doi.org/10.1109/MMUL.2023.3247522","url":null,"abstract":"The Internet of Multimedia Things (IoMT) has recently experienced a considerable surge in multimedia-based services. Due to the fast proliferation and transfer of massive data, the IoMT has service quality challenges. This article proposes a novel fog-based multimedia transmission scheme for the IoMT using the Sugeno inference system with a quantum genetic optimization algorithm. The fuzzy system devises a mathematically organized strategy for generating fuzzy rules from input and output variables. The quantum genetic algorithm (QGA) is a metaheuristic algorithm that combines genetic algorithms and quantum computing theory. It incorporates many critical elements of quantum computing, such as quantum superposition and entanglement. This provides a robust representation of population diversity and the capacity to achieve rapid convergence and high accuracy. As a result of the simulations and computational analysis, the proposed fuzzy-based QGA scheme improves the packet delivery ratio and throughput by reducing end-to-end latency and delay when compared to traditional algorithms such as the genetic algorithm, particle swarm optimization, heterogeneous earliest finish time, and ant colony optimization. Consequently, it provides a more efficient scheme for multimedia transmission in the IoMT.","PeriodicalId":13240,"journal":{"name":"IEEE MultiMedia","volume":"30 1","pages":"74-86"},"PeriodicalIF":3.2,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49172419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IEEE MultiMedia Pub Date: 2023-07-01 DOI: 10.1109/MMUL.2023.3242455
R. Tang, Cheng Yang, Yuxuan Wang
{"title":"A Cross-Domain Multimodal Supervised Latent Topic Model for Item Tagging and Cold-Start Recommendation","authors":"R. Tang, Cheng Yang, Yuxuan Wang","doi":"10.1109/MMUL.2023.3242455","DOIUrl":"https://doi.org/10.1109/MMUL.2023.3242455","url":null,"abstract":"Cross-domain data analysis is playing an increasingly important role in media convergence and can be adopted for many applications. Most existing methods consider the domain discrimination as the multimodal representation difference or the imbalanced item classification distribution, ignoring the different tag semantics among domains. To this end, we propose an explainable cross-domain multimodal supervised latent topic (CDMSLT) model and evaluate our model on two applications. First, we learn a common topic space that is capable of explaining both domain specification and commonality. Second, we apply our model to a multilabel classification task and put forward a cross-domain item tagging method. Third, combining user behaviors and the CDMSLT model, we propose a cross-domain recommendation algorithm that could estimate the user preference on new unseen domains. This article proves the effectiveness of the CDMSLT model by comparing these two applications with existing algorithms in a cross-domain scenario on the Douban dataset.","PeriodicalId":13240,"journal":{"name":"IEEE MultiMedia","volume":"30 1","pages":"48-62"},"PeriodicalIF":3.2,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44449320","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IEEE MultiMedia Pub Date: 2023-07-01 DOI: 10.1109/MMUL.2023.3280669
Guopeng Gao, Chuan Qin, Yaodong Fang, Yuanding Zhou
{"title":"Perceptual Authentication Hashing for Digital Images With Contrastive Unsupervised Learning","authors":"Guopeng Gao, Chuan Qin, Yaodong Fang, Yuanding Zhou","doi":"10.1109/MMUL.2023.3280669","DOIUrl":"https://doi.org/10.1109/MMUL.2023.3280669","url":null,"abstract":"In recent years, many perceptual image hashing schemes for content authentication have been proposed based on classical methods and deep learning. However, most existing schemes target specific and limited content-preserving manipulations and cannot provide satisfactory robustness to unknown manipulations. In this work, we propose a new perceptual authentication hashing model for digital images based on contrastive unsupervised learning. In detail, a contrastive augmentation structure is exploited, which can optimize the model through changing the types and strengths of sample augmentation. Also, an integrated loss function is designed by the weighted summing of two components, i.e., the contrastive loss and hash loss, which can help the model learn perceptual feature representation with an unlabeled dataset and effectively improve the robustness and discrimination. Experimental results show that the proposed scheme can achieve superior performance compared with some state-of-the-art schemes, especially robustness to unknown attacks.","PeriodicalId":13240,"journal":{"name":"IEEE MultiMedia","volume":"30 1","pages":"129-140"},"PeriodicalIF":3.2,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45130549","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IEEE MultiMedia Pub Date: 2023-07-01 DOI: 10.1109/mmul.2023.3309014
{"title":"IEEE Computer Society Has You Covered!","authors":"","doi":"10.1109/mmul.2023.3309014","DOIUrl":"https://doi.org/10.1109/mmul.2023.3309014","url":null,"abstract":"","PeriodicalId":13240,"journal":{"name":"IEEE MultiMedia","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135852557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IEEE MultiMedia Pub Date: 2023-04-01 DOI: 10.1109/MMUL.2023.3263943
Irene Viola, Jack Jansen, S. Subramanyam, Ignacio Reimat, Pablo César
{"title":"VR2Gather: A Collaborative, Social Virtual Reality System for Adaptive, Multiparty Real-Time Communication","authors":"Irene Viola, Jack Jansen, S. Subramanyam, Ignacio Reimat, Pablo César","doi":"10.1109/MMUL.2023.3263943","DOIUrl":"https://doi.org/10.1109/MMUL.2023.3263943","url":null,"abstract":"Virtual reality telecommunication systems promise to overcome the limitations of current real-time teleconferencing solutions by enabling a better sense of immersion and fostering more natural interpersonal interactions. Many solutions that currently enable immersive teleconferencing employ synthetic avatars to represent their users. However, photorealistic reconstructions have been shown to increase the sense of presence with respect to synthetic avatars in teleimmersive scenarios. In this article, we present VR2Gather, a customizable, end-to-end system to transmit volumetric contents in multiparty, real-time communication. We present the architecture and evaluate the costs and benefits of using different modules and transport mechanisms in terms of CPU usage, latency, and bandwidth. Moreover, we report the user experience based on applications the system has been used for and how it was customized to meet the requirements using different acquisition and rendering modules.","PeriodicalId":13240,"journal":{"name":"IEEE MultiMedia","volume":"30 1","pages":"48-59"},"PeriodicalIF":3.2,"publicationDate":"2023-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48546908","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}