{"title":"Learned Image Coding for Human-Machine Collaborative Optimization","authors":"Jingbo He;Xiaohai He;Shuhua Xiong;Honggang Chen","doi":"10.1109/TBC.2024.3443470","DOIUrl":"10.1109/TBC.2024.3443470","url":null,"abstract":"The exponential growth in the volume of image data has imposed immense pressure on transmission and storage systems, while simultaneously presenting opportunities for intelligent image analysis towards machine vision. In recent years, learned image coding approaches have made remarkable advancements with impressive performance. The application of the learned image coding method in machine vision holds promising prospects for achieving human-machine collaboration. In this paper, we propose a learned image coding approach based on a Transformer-CNN interaction structure for human-machine vision collaborative optimization, which can generate a single and compact bitstream for efficient representation in image compression. The bitstream can be directly decoded to generate a reconstructed image for human visual perception. In parallel, without the need for decoding and reconstructing the image, the bitstream can serve as input for machine vision tasks. This not only reduces computational costs on the decoding end but also enhances machine analysis efficiency. 
Experimental results demonstrate that our proposed learned image coding method achieves a single bitstream that concurrently considers image reconstruction and machine task analysis, ensuring high accuracy in machine tasks and superior quality in reconstructed images compared to state-of-the-art (SOTA) methods.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"71 1","pages":"203-216"},"PeriodicalIF":3.2,"publicationDate":"2024-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142207655","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mobility-Enabled Dynamic Grouping for Multicast Broadcast Service","authors":"Kuang-Hsun Lin;Ting-Wei Chen;Hung-Yu Wei","doi":"10.1109/TBC.2024.3443469","DOIUrl":"https://doi.org/10.1109/TBC.2024.3443469","url":null,"abstract":"3GPP has established the Multicast Broadcast Services (MBS) standard to accommodate the escalating bandwidth demands of emerging applications like mixed reality and online gaming. MBS offers an efficient means of simultaneously delivering content to different users through the same wireless resources. However, the efficacy of grouping is intricately linked to user mobility and the channel quality of the weakest link. Notably, it is identified that handovers can cause significant interruptions in MBS transmissions. To address this, our paper introduces a novel dynamic grouping scheme capable of adapting to user mobility. Our results demonstrate superior performance compared to state-of-the-art methods without introducing much signaling overhead associated with MBS group management.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"70 4","pages":"1167-1180"},"PeriodicalIF":3.2,"publicationDate":"2024-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142810676","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing 3D Indoor Visible Light Positioning With Machine Learning Combined Nyström Kernel Approximation","authors":"Vasileios P. Rekkas;Sotirios P. Sotiroudis;Lazaros Alexios Iliadis;Sander Bastiaens;Wout Joseph;David Plets;Christos G. Christodoulou;George K. Karagiannidis;Sotirios K. Goudos","doi":"10.1109/TBC.2024.3437216","DOIUrl":"10.1109/TBC.2024.3437216","url":null,"abstract":"Optical wireless communication (OWC) is emerging as a pivotal technology for next-generation broadcast networks, with visible light communication (VLC) poised to meet the escalating demands of advanced radio frequency systems. This study focuses on enhancing visible light positioning (VLP), recognized for its precision, simplicity, and cost-effectiveness, which are essential for accurate indoor localization and responsive location-based services. Central to our approach is the integration of advanced machine learning (ML) techniques, which fundamentally transform the accuracy and efficiency of 3D indoor positioning systems. We introduce an advanced VLP framework where ML is leveraged not merely as an adjunct but as the primary driver of innovation, significantly refining the processing of received signal strength (RSS) indicators. The methodology centers around a system comprising four light-emitting diodes (LEDs) arranged in a star geometry, optimized for precise spatial localization. We evaluate three distinct methodologies: a foundational star-shaped configuration for baseline position estimation, a repeated unit cell strategy to extend the four-LED configuration to a larger positioning area, and a sophisticated implementation employing Nyström kernel approximation. This integration of Nyström approximation into our ML framework drastically enhances the system’s predictive accuracy, achieving an exceptional average relative root mean square error (aRRMSE) of 2.1 cm in a simulated setup. 
The results demonstrate that ML, especially combined with the application of the Nyström kernel approximation, significantly elevates the precision and operational efficiency of traditional VLP systems, setting new benchmarks for accuracy in indoor 3D positioning technologies and fostering advancements towards more sophisticated and adaptable communication networks.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"70 4","pages":"1192-1206"},"PeriodicalIF":3.2,"publicationDate":"2024-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141940834","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Temporal Adaptive Learned Surveillance Video Compression","authors":"Yu Zhao;Mao Ye;Luping Ji;Hongwei Guo;Ce Zhu","doi":"10.1109/TBC.2024.3434736","DOIUrl":"10.1109/TBC.2024.3434736","url":null,"abstract":"As the amount of surveillance video data increases at an exponential rate, the need for efficient video compression algorithms becomes increasingly urgent. The inter-frame compression schemes of existing surveillance video compression methods predict the current frame through the previous frame, causing the error to gradually increase because the quality of the reference frame decreases progressively. In this paper, we propose a Temporal Adaptive enhancement method for Learned surveillance video Compression (TALC). The proposed TALC has two modules: a Forward Temporal Adaptive (FTA) module and a Backward Temporal Adaptive (BTA) module, which are placed before and after the motion and residual bit transmission modules, respectively. These two modules have the same network structure, which consists of a Temporal Adaptive Selection (TAS) block and a Feature Enhancement (FE) block. The TAS block analyzes the extent to which errors accumulate in optical flow and residuals, then selects the corresponding enhancement sub-block, while the FE block consists of several enhancement sub-blocks corresponding to different levels of error accumulation. The proposed TALC has strong versatility and low coupling, and can be applied in almost all learned video compression frameworks as a plugin. 
Experimental results show that the proposed TALC method can significantly improve the coding performance of learned surveillance video compression networks without changing the original basic structure.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"71 1","pages":"142-153"},"PeriodicalIF":3.2,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141969069","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dual Feature Indexed Quadratic Polynomial-Based Piecewise Behavioral Model for Digital Predistortion of RF Power Amplifiers","authors":"Hao Chang;Renlong Han;Chengye Jiang;Guichen Yang;Qianqian Zhang;Junsen Wang;Falin Liu","doi":"10.1109/TBC.2024.3434625","DOIUrl":"10.1109/TBC.2024.3434625","url":null,"abstract":"This paper proposes a dual feature indexed quadratic polynomial-based piecewise (DIQP) behavioral modeling technique for digital predistortion (DPD) of RF transmitters. The proposed DIQP model is used to find the most suitable DPD model by performing a dual feature classification on the optimized submodels with a reuse-based function screening algorithm. The optimized submodel is adapted from the previous instantaneous sample indexed magnitude-selective affine (I-MSA) function-based model by transforming the original single linear term into a quadratic term with stronger fitting ability. This key improvement not only enhances the flexibility of the model but also boosts its fitting capability. The segmentation rule of the piecewise model has evolved from a simple threshold segmentation to a dual feature segmentation based on threshold and clustering segments. This reconstruction provides the model with enhanced feature-building capabilities. Additionally, the corresponding hybrid basis function screening (HBFS) algorithm and running complexity identification algorithm based on basis function reuse are proposed. The ingenious design of this reuse-based function screening algorithm not only enhances running efficiency but also ensures the overall performance of the model. The experimental part uses two different power amplifiers (PAs) for behavioral modeling and linearization tests. 
The experimental results show that the screened DIQP model achieves a favorable trade-off between linearization performance and complexity.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"70 4","pages":"1302-1315"},"PeriodicalIF":3.2,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141940836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Decentralized Reputation Management Model for Enhanced IoV Networks With 5G Broadcast Services","authors":"Jinxin Zuo;Ziping Wang;Chenqing Guo;Weixuan Xie;Hao Wu;Peng Yu;Yueming Lu","doi":"10.1109/TBC.2024.3434745","DOIUrl":"10.1109/TBC.2024.3434745","url":null,"abstract":"This paper investigates the challenges of data trust sharing faced by Internet of Vehicles (IoV) networks with 5G broadcast services. In particular, we develop a decentralized IoV reputation management model with spatiotemporal feature perception fusion (RMM-STFP) based on blockchain. The proposed reputation evaluation method evaluates the reputation value of nodes from the two aspects of time continuity and spatial transitivity and thus improves the identification accuracy of malicious nodes. To further accelerate the dissemination of reputation data, we have constructed a blockchain-based management storage system, in which the PBFT consensus scheme combines reputation and Bayesian inference. Finally, numerical results are given to justify the superiority of our proposed scheme. When the proportion of malicious nodes reaches 45%, the accuracy of our proposed method is 94.5%, and the suppression rate of malicious messages is 83%. 
Moreover, compared with the traditional PBFT consensus scheme, the consensus delay and communication overhead are reduced by 87.57% and 78.45%, respectively, and the transaction throughput is increased by 70.65%.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"71 1","pages":"63-73"},"PeriodicalIF":3.2,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141940841","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MAFBLiF: Multi-Scale Attention Feature Fusion-Based Blind Light Field Image Quality Assessment","authors":"Rui Zhou;Gangyi Jiang;Yueli Cui;Yeyao Chen;Haiyong Xu;Ting Luo;Mei Yu","doi":"10.1109/TBC.2024.3434699","DOIUrl":"10.1109/TBC.2024.3434699","url":null,"abstract":"Light field imaging captures both the intensity and directional information of light rays, providing users with more immersive visual experience. However, during the processes of imaging, processing, coding and reconstruction, light field images (LFIs) may encounter various distortions that degrade their visual quality. Compared to two-dimensional image quality assessment, light field image quality assessment (LFIQA) needs to consider not only the image quality in the spatial domain but also the quality degradation in the angular domain. To effectively model the factors related to visual perception and LFI quality, this paper proposes a multi-scale attention feature fusion based blind LFIQA metric, named MAFBLiF. The proposed metric consists of the following parts: MLI-Patch generation, spatial-angular feature separation module, spatial-angular feature extraction backbone network, pyramid feature alignment module and patch attention module. These modules are specifically designed to extract spatial and angular information of LFIs, and capture multi-level information and regions of interest. Furthermore, a pooling scheme guided by the LFI’s gradient information and saliency is proposed, which integrates the quality of all MLI-patches into the overall quality of the input LFI. Finally, to demonstrate the effectiveness of the proposed metric, extensive experiments are conducted on three representative LFI quality evaluation datasets. The experimental results show that the proposed metric outperforms other state-of-the-art image quality assessment metrics. 
The code will be publicly available at https://github.com/oldblackfish/MAFBLiF.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"70 4","pages":"1266-1278"},"PeriodicalIF":3.2,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141940842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Object Segmentation-Assisted Inter Prediction for Versatile Video Coding","authors":"Zhuoyuan Li;Zikun Yuan;Li Li;Dong Liu;Xiaohu Tang;Feng Wu","doi":"10.1109/TBC.2024.3434520","DOIUrl":"10.1109/TBC.2024.3434520","url":null,"abstract":"In modern video coding standards, block-based inter prediction is widely adopted, which brings high compression efficiency. However, in natural videos, there are usually multiple moving objects of arbitrary shapes, resulting in complex motion fields that are difficult to represent compactly. This problem has been tackled by more flexible block partitioning methods in the Versatile Video Coding (VVC) standard, but the more flexible partitions require more overhead bits to signal and still cannot be made arbitrarily shaped. To address this limitation, we propose an object segmentation-assisted inter prediction method (SAIP), where objects in the reference frames are segmented by some advanced technologies. With a proper indication, the object segmentation mask is translated from the reference frame to the current frame as the arbitrary-shaped partition of different regions without any extra signal. Using the segmentation mask, motion compensation is separately performed for different regions, achieving higher prediction accuracy. The segmentation mask is further used to code the motion vectors of different regions more efficiently. Moreover, the segmentation mask is considered in the joint rate-distortion optimization for motion estimation and partition estimation to derive the motion vector of different regions and partition more accurately. The proposed method is implemented into the VVC reference software, VTM version 12.0. 
Experimental results show that the proposed method achieves up to 1.98%, 1.14%, 0.79%, and on average 0.82%, 0.49%, 0.37% BD-rate reduction for common test sequences, under the Low-delay P, Low-delay B, and Random Access configurations, respectively.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"70 4","pages":"1236-1253"},"PeriodicalIF":3.2,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141940835","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Low-Overhead Iterative Channel Parameter Estimation for Multi-User OAM Wireless Backhaul","authors":"Wen-Xuan Long;Nian Li;Yuan Liu;M. R. Bhavani Shankar;Rui Chen","doi":"10.1109/TBC.2024.3434676","DOIUrl":"10.1109/TBC.2024.3434676","url":null,"abstract":"This paper considers the issue of acquiring channel state information (CSI) for multi-user orbital angular momentum (MU-OAM) wireless backhaul between the macro base station (MBS) and small base stations (SBSs) within broadcasting networks. Unlike prior works, we assume that each SBS transmits a pilot signal of length one on each multiplexed OAM mode and subcarrier, resulting in the coherent observations collected at the MBS. Then, we construct the data sets using the coherent observations, the components of which independently contain arbitrarily assumed positional information. The amplitude-phase multiple signal classification (AP-MUSIC) algorithm, a novel variant of the MUSIC, then conducts a two-dimensional (2-D) search on the amplitude and phase of the data component in both the OAM mode and frequency domains for estimating positions at each iteration. These estimates, together with the observations, are used to iteratively update the data sets, ultimately refining the distances and AoAs of all SBSs. 
The theoretical analysis and simulation results indicate that this solution not only yields the precise CSI for the MU-OAM system, but also markedly reduces the training overhead, compared to existing alternatives.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"71 1","pages":"74-80"},"PeriodicalIF":3.2,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10620284","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141886385","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Broadcast Map Constructing Method Based on the LSTM and Assimilation Theory","authors":"Jian Wang;Yulong Hao;Zhongle Wu;Yafei Shi;Cheng Yang","doi":"10.1109/TBC.2024.3434536","DOIUrl":"10.1109/TBC.2024.3434536","url":null,"abstract":"Frequency modulation (FM) broadcasting is a robust and widely applied technology that offers unparalleled advantages over other broadcasting methods in challenging environments. In order to achieve high accuracy in constructing broadcasting maps for scenarios with uneven and sparse distribution of measurement data, we introduce the concept of FM broadcasting maps and propose a novel methodology for their construction. This paper utilizes the Long Short-Term Memory (LSTM) model to assimilate predictions from the ITU-R models for modeling purposes. To begin, we analyzed critical environmental parameters influencing radio wave propagation. Based on this analysis, we identified the foundational input features for the LSTM model. Subsequently, predictions from the ITU-R P.1546 and ITU-R P.2001 models were assimilated as features and input into the LSTM model for training, resulting in assimilation modeling. Finally, a broadcast map is constructed using the parameter construction method based on the proposed model. The results indicate that the relative errors between the measurements and the predictions of the proposed model, ITU-R P.1546, and ITU-R P.2001 are 3.14%, 6.48%, and 9.89%, respectively. The prediction accuracy of the proposed model surpasses that of the ITU-R models, and stability is significantly improved compared to models solely based on LSTM. The broadcast map in this paper provides an objective reflection of measured field strength values across multiple dimensions, including frequency, distance, various terrains, and error distribution. 
It demonstrates notable advantages in scenarios characterized by sparse and unevenly distributed sampling points.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"70 3","pages":"924-934"},"PeriodicalIF":3.2,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141869492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}