{"title":"IEEE Transactions on Broadcasting Information for Authors","authors":"","doi":"10.1109/TBC.2025.3569995","DOIUrl":"https://doi.org/10.1109/TBC.2025.3569995","url":null,"abstract":"","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"71 2","pages":"C3-C4"},"PeriodicalIF":3.2,"publicationDate":"2025-06-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11027898","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144243590","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"STFF: Spatio-Temporal and Frequency Fusion for Video Compression Artifact Removal","authors":"Mingxing Wang;Yipeng Liao;Weiling Chen;Liqun Lin;Tiesong Zhao","doi":"10.1109/TBC.2025.3550018","DOIUrl":"https://doi.org/10.1109/TBC.2025.3550018","url":null,"abstract":"Video compression artifact removal focuses on enhancing the visual quality of compressed videos by mitigating visual distortions. However, existing methods often struggle to effectively capture spatio-temporal features and recover high-frequency details, due to their suboptimal adaptation to the characteristics of compression artifacts. To overcome these limitations, we propose a novel Spatio-Temporal and Frequency Fusion (STFF) framework. STFF incorporates three key components: Feature Extraction and Alignment (FEA), which employs SRU for effective spatiotemporal feature extraction; Bidirectional High-Frequency Enhanced Propagation (BHFEP), which integrates HCAB to restore high-frequency details through bidirectional propagation; and Residual High-Frequency Refinement (RHFR), which further enhances high-frequency information. Extensive experiments demonstrate that STFF achieves superior performance compared to state-of-the-art methods in both objective metrics and subjective visual quality, effectively addressing the challenges posed by video compression artifacts. 
Trained model available: https://github.com/Stars-WMX/STFF.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"71 2","pages":"542-554"},"PeriodicalIF":3.2,"publicationDate":"2025-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144243633","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An End-to-End Spatially Scalable Light Field Image Compression Method","authors":"Jianjun Lei;Hao Li;Bo Peng;Bo Zhao;Nam Ling","doi":"10.1109/TBC.2025.3553295","DOIUrl":"https://doi.org/10.1109/TBC.2025.3553295","url":null,"abstract":"Recently, learning-based light field (LF) image compression methods have achieved impressive progress, while end-to-end spatially scalable LF image compression (SS-LFIC) has not been explored. To tackle this problem, this paper proposes an end-to-end spatially scalable LF compression network (SSLFC-Net). In the SSLFC-Net, a spatial-angular domain-specific enhancement layer coding strategy is designed to boost the coding performance of the enhancement layers (ELs). Specifically, by referencing domain-specific features, the ELs compress spatial features by predictive coding in the spatial domain to effectively remove inter-layer spatial redundancy, and reconstruct angular features by decoder-side generative method in the angular domain to strategically avoid angular compression. Particularly, to produce accurate spatial predictions and reconstruct high-quality LF images, an inter-layer spatial prediction module and a spatial-angular context-aware reconstruction module are presented to collaboratively promote EL compression. 
Experiments show that the proposed SSLFC-Net effectively supports spatial scalability and achieves state-of-the-art rate-distortion performance.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"71 2","pages":"570-580"},"PeriodicalIF":3.2,"publicationDate":"2025-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144243678","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gray-Mapped NOM-Enhanced SFN: A Broadcast and Broadband Converged Transmission Solution in LTE-Based 5G Broadcast","authors":"Haoyang Li;Dazhi He;Yin Xu;Kewu Peng;Yunfeng Guan;Wenjun Zhang","doi":"10.1109/TBC.2025.3553318","DOIUrl":"https://doi.org/10.1109/TBC.2025.3553318","url":null,"abstract":"Broadcast and broadband converged transmission has emerged as a prominent research focus within broadcast technology. Abundant corresponding studies have been conducted in traditional terrestrial broadcast and 3GPP unicast systems. However, due to issues like system compatibility, traditional terrestrial broadcasts usually reveal insufficient flexibility in transmitting broadband services, and conventional unicast systems always perform inefficiently in delivering broadcast services in scenarios of converged transmission. In addition, as the current Non-Orthogonal Multiplexing (NOM) scheme employed in converged transmission usually does not comply with the Gray-mapping rule, the required codeword-level Successive Interference Cancellation (SIC) algorithm makes the Enhanced Layer (EL) data share the same processing delay as the Core Layer (CL) one, which restricts the variety of EL services. This paper focuses on the physical layer technologies of converged transmission in the 3GPP LTE-based 5G Broadcast system. Due to the inherent good compatibility with both broadcast and broadband systems, LTE-based 5G Broadcast has great potential in realizing the converged transmission of broadcast and broadband. In addition, a novel converged transmission scheme enhanced by Gray-mapped NOM is proposed in this paper, and the corresponding networking architecture, frame structure, transmitting processing, and receiving algorithms are put forward. 
By significantly improving the performance of the non-SIC receiving algorithm, the proposed Gray-mapped NOM-enhanced SFN (GNeSFN) scheme enables the EL customized services and the CL broadcast services to have processing delays independent from each other, bringing more flexibility to converged transmission. Link-level simulations are carried out with different system configurations and multiple channel scenarios, verifying the effectiveness and feasibility of the proposed scheme.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"71 2","pages":"426-438"},"PeriodicalIF":3.2,"publicationDate":"2025-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144243675","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Localization With DTMB Signal Under Complex Urban Environments","authors":"Tao Zhou;Liang Chen;Jing Sun;Zhenghang Jiao","doi":"10.1109/TBC.2025.3549994","DOIUrl":"https://doi.org/10.1109/TBC.2025.3549994","url":null,"abstract":"The digital terrestrial multimedia broadcast (DTMB) signal presents a potential opportunity for wireless localization. This paper studies the time of arrival (TOA) estimation based on the DTMB signal for localization. Theoretical analysis of the autocorrelation of the DTMB signal suggests that the DTMB signal has the characteristics required for localization. In this paper, we propose a software-defined radio (SDR) receiver based on the DTMB signal for localization. The key innovations of the proposed SDR receiver are as follows: 1) employing a narrow Early-Minus-Late Power Delay Discriminator (nEML) in the delay-locked loop (DLL) to improve the multipath resistance; 2) proposing a multi-state fusion filter to improve the robustness and accuracy of the loop filter; 3) utilizing the carrier-to-noise ratio (C/N0) to remove the range observations influenced by heavy non-line of sight (NLOS) environments, thereby reducing the impact of low-quality observations. The static field experiments show that the accuracy of TOA ranging is 1.666m. 
The motion experiment results show that the root mean square error (RMSE) of the TOA measurements from the DTMB receiver is about 16m, and the RMSE of the DTMB localization is about 17.7m, which shows that the designed receiver can provide relatively reliable localization results when processing DTMB signal in complex urban environments.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"71 2","pages":"439-452"},"PeriodicalIF":3.2,"publicationDate":"2025-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144243591","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Survey on Recent Advances in Video Coding Technologies and Future Research Directions","authors":"Houbang Guo;Yun Zhou;Hongwei Guo;Zhuqing Jiang;Tian He;Yiyan Wu","doi":"10.1109/TBC.2025.3553306","DOIUrl":"https://doi.org/10.1109/TBC.2025.3553306","url":null,"abstract":"With the evolution of video coding, balancing video compression efficiency with quality has become a critical challenge for researchers and the industry. The development of the next-generation video coding standards, such as Versatile Video Coding (VVC), signifies a significant leap in supporting high-resolution formats including 8K, HDR, and WCG. Currently, machine vision has emerged as a rising research focus, driven by breakthroughs in Artificial Intelligence and its growing role in content generation, production, distribution, and storage in multimedia applications. This paper presents a comprehensive survey of the video coding tools in the VVC standard. Additionally, we examine recent research in next-generation video coding, particularly in Beyond VVC and end-to-end coding frameworks. Developments in shared human-machine vision systems are also discussed, emphasizing their relevance in evolving multimedia applications. Finally, this paper provides an outlook on video coding standards, considering their potential to drive next-generation multimedia technologies.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"71 2","pages":"666-671"},"PeriodicalIF":3.2,"publicationDate":"2025-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144243632","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Low-Complexity Patch-Based No-Reference Point Cloud Quality Metric Exploiting Weighted Structure and Texture Features","authors":"Michael Neri;Federica Battisti","doi":"10.1109/TBC.2025.3553305","DOIUrl":"https://doi.org/10.1109/TBC.2025.3553305","url":null,"abstract":"During the compression, transmission, and rendering of point clouds, various artifacts are introduced, affecting the quality perceived by the end user. However, evaluating the impact of these distortions on the overall quality is a challenging task. This study introduces PST-PCQA, a no-reference point cloud quality metric based on a low-complexity, learning-based framework. It evaluates point cloud quality by analyzing individual patches, integrating local and global features to predict the Mean Opinion Score. In summary, the process involves extracting features from patches, combining them, and using correlation weights to predict the overall quality. This approach allows us to assess point cloud quality without relying on a reference point cloud, making it particularly useful in scenarios where reference data is unavailable. Experimental tests on three state-of-the-art datasets show good prediction capabilities of PST-PCQA, through the analysis of different feature pooling strategies and its ability to generalize across different datasets. The ablation study confirms the benefits of evaluating quality on a patch-by-patch basis. Additionally, PST-PCQA’s light-weight structure, with a small number of parameters to learn, makes it well-suited for real-time applications and devices with limited computational capacity. 
For reproducibility purposes, we made code, model, and pretrained weights available at https://github.com/michaelneri/PST-PCQA.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"71 2","pages":"631-640"},"PeriodicalIF":3.2,"publicationDate":"2025-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10948465","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144243880","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Frame-Channel Polarization for Improved Reliability in Mobile Video Wireless Transmission","authors":"Zhaoyang Wang;Jiaxi Zhou;Guanghua Liu;Yangyang Liu;Ting Bi;Tao Jiang","doi":"10.1109/TBC.2025.3549991","DOIUrl":"https://doi.org/10.1109/TBC.2025.3549991","url":null,"abstract":"In this paper, we propose a Frame-Channel Polarization (FCP) technique to enhance wireless transmission reliability for low-latency mobile video in Multiple-Input Multiple-Output Orthogonal Frequency-Division Multiplexing (MIMO-OFDM) systems. We begin by analyzing the reliability of video frame transmission, quantified by the Transmission Success Probability (TSP), and derive closed-form TSP expressions under Maximum Ratio Combining (MRC) for a single subcarrier. We also summarize the corresponding TSP formulation for Zero-Forcing (ZF). To extend the analysis to multiple subcarriers, we introduce a dynamic programming approach that computes the TSP for multiple subcarriers based on the single-subcarrier results, thereby reducing computational complexity from exponential to polynomial. Using TSP as a reliability metric, the FCP method dynamically prioritizes subcarrier allocation, assigning more resources to high-priority video frames while allocating fewer subcarriers to lower-priority frames. As a result, the reliability of frame channels becomes polarized, with the degree of polarization directly linked to the reliability requirements of each frame. 
Experimental results validate the accuracy of the derived TSP expressions for both single and multiple subcarriers and demonstrate that the FCP method significantly improves transmission reliability compared to existing methods, achieving improvements in reliability for low-latency video transmission.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"71 2","pages":"467-479"},"PeriodicalIF":3.2,"publicationDate":"2025-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144243908","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"VaVLM: Toward Efficient Edge-Cloud Video Analytics With Vision-Language Models","authors":"Yang Zhang;Hanling Wang;Qing Bai;Haifeng Liang;Peican Zhu;Gabriel-Miro Muntean;Qing Li","doi":"10.1109/TBC.2025.3549983","DOIUrl":"https://doi.org/10.1109/TBC.2025.3549983","url":null,"abstract":"The advancement of Large Language Models (LLMs) with vision capabilities in recent years has elevated video analytics applications to new heights. To address the limited computing and bandwidth resources on edge devices, edge-cloud collaborative video analytics has emerged as a promising paradigm. However, most existing edge-cloud video analytics systems are designed for traditional deep learning models (e.g., image classification and object detection), where each model handles a specific task. In this paper, we introduce VaVLM, a novel edge-cloud collaborative video analytics system tailored for Vision-Language Models (VLMs), which can support multiple tasks using a single model. VaVLM aims to enhance the performance of VLM-powered video analytics systems in three key aspects. First, to reduce bandwidth consumption during video transmission, we propose a novel Region-of-Interest (RoI) generation mechanism based on the VLM’s understanding of the task and scene. Second, to lower inference costs, we design a task-oriented inference trigger that processes only a subset of video frames using an optimized inference logic. Third, to improve inference accuracy, the model is augmented with additional information from both the environment and auxiliary analytics models during the inference stage. 
Extensive experiments on real-world datasets demonstrate that VaVLM achieves an 80.3% reduction in bandwidth consumption and an 89.5% reduction in computational cost compared to baseline methods.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"71 2","pages":"529-541"},"PeriodicalIF":3.2,"publicationDate":"2025-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10947590","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144243680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}