The JPEG Pleno Learning-based Point Cloud Coding Standard: Serving Man and Machine

arXiv - EE - Image and Video Processing | Pub Date: 2024-09-12 | DOI: arxiv-2409.08130

André F. R. Guarda (Instituto de Telecomunicações, Lisbon, Portugal); Nuno M. M. Rodrigues (Instituto de Telecomunicações, Lisbon, Portugal; ESTG, Politécnico de Leiria, Leiria, Portugal); Fernando Pereira (Instituto de Telecomunicações, Lisbon, Portugal; Instituto Superior Técnico - Universidade de Lisboa, Lisbon, Portugal)



Abstract

Efficient point cloud coding has become increasingly critical for applications such as virtual reality, autonomous driving, and digital twin systems, where rich and interactive 3D data representations can make a functional difference. Deep learning has emerged as a powerful tool in this domain: it compresses point clouds more efficiently than conventional coding methods and also allows computer vision tasks to be performed effectively in the compressed domain, thus making available, for the first time, a common compressed visual representation effective for both man and machine. Taking advantage of this potential, JPEG has recently finalized the JPEG Pleno Learning-based Point Cloud Coding (PCC) standard, which offers efficient lossy coding of static point clouds and targets both human visualization and machine processing by leveraging deep learning models for geometry and color coding. The geometry is processed directly in its original 3D form using sparse convolutional neural networks, while the color data is projected onto 2D images and encoded using the JPEG AI standard, which is also learning-based. The goal of this paper is to provide a complete technical description of the JPEG PCC standard, along with a thorough benchmarking of its performance against the state of the art, while highlighting its main strengths and weaknesses. In terms of compression performance, JPEG PCC outperforms the conventional MPEG PCC standards, especially in geometry coding, achieving significant rate reductions. Color compression performance is less competitive, but this is offset by the power of a fully learning-based coding framework for both geometry and color and the associated effective compressed domain processing.

