The JPEG Pleno Learning-based Point Cloud Coding Standard: Serving Man and Machine
André F. R. Guarda (Instituto de Telecomunicações, Lisbon, Portugal), Nuno M. M. Rodrigues (Instituto de Telecomunicações, Lisbon, Portugal; ESTG, Politécnico de Leiria, Leiria, Portugal), Fernando Pereira (Instituto de Telecomunicações, Lisbon, Portugal; Instituto Superior Técnico - Universidade de Lisboa, Lisbon, Portugal)
arXiv - EE - Image and Video Processing, 2024-09-12. https://doi.org/arxiv-2409.08130
Abstract
Efficient point cloud coding has become increasingly critical for multiple
applications such as virtual reality, autonomous driving, and digital twin
systems, where rich and interactive 3D data representations can make a
functional difference. Deep learning has emerged as a powerful tool in this
domain, offering advanced techniques for compressing point clouds more
efficiently than conventional coding methods while also allowing computer
vision tasks to be performed effectively in the compressed domain, thus making
available, for the first time, a common compressed visual representation
effective for both man and machine. Taking advantage of this potential, JPEG
has recently
finalized the JPEG Pleno Learning-based Point Cloud Coding (PCC) standard
offering efficient lossy coding of static point clouds, targeting both human
visualization and machine processing by leveraging deep learning models for
geometry and color coding. The geometry is processed directly in its original
3D form using sparse convolutional neural networks, while the color data is
projected onto 2D images and encoded using the JPEG AI standard, which is also
learning-based. The goal of this paper is to provide a complete technical description
of the JPEG PCC standard, along with a thorough benchmarking of its performance
against the state of the art, while highlighting its main strengths and
weaknesses. In terms of compression performance, JPEG PCC outperforms the
conventional MPEG PCC standards, especially in geometry coding, achieving
significant rate reductions. Color compression performance is less competitive,
but this is offset by the power of a fully learning-based coding framework for
both geometry and color and the associated effective compressed-domain
processing.
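
The split between 3D geometry and 2D-projected color described in the abstract can be illustrated with a small sketch. This is not the standard's actual pipeline: the voxel quantization, the per-voxel color averaging, and the orthographic projection along z are all assumptions chosen for simplicity, and the function names `split_geometry_and_color` and `project_colors_to_image` are hypothetical. In the standard, the geometry would go to a sparse convolutional codec and the projected images to JPEG AI; neither codec is modeled here.

```python
from typing import Dict, List, Tuple

Voxel = Tuple[int, int, int]
Color = Tuple[int, int, int]


def split_geometry_and_color(
    points: List[Tuple[float, float, float]],
    colors: List[Color],
    voxel_size: float = 1.0,
) -> Tuple[List[Voxel], Dict[Voxel, Color]]:
    """Quantize points to voxels (assumed step) and average the color per voxel."""
    sums: Dict[Voxel, List[int]] = {}
    for (x, y, z), (r, g, b) in zip(points, colors):
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        acc = sums.setdefault(key, [0, 0, 0, 0])
        acc[0] += r
        acc[1] += g
        acc[2] += b
        acc[3] += 1
    # Geometry: the sorted list of occupied voxel coordinates (a sparse set).
    geometry = sorted(sums)
    # Color: one integer-averaged RGB triple per occupied voxel.
    voxel_colors = {
        k: (v[0] // v[3], v[1] // v[3], v[2] // v[3]) for k, v in sums.items()
    }
    return geometry, voxel_colors


def project_colors_to_image(
    geometry: List[Voxel],
    voxel_colors: Dict[Voxel, Color],
    background: Color = (0, 0, 0),
) -> List[List[Color]]:
    """Illustrative orthographic projection along z: the voxel with the
    largest z at each (x, y) position supplies the pixel color."""
    if not geometry:
        return []
    x0 = min(v[0] for v in geometry)
    y0 = min(v[1] for v in geometry)
    width = max(v[0] for v in geometry) - x0 + 1
    height = max(v[1] for v in geometry) - y0 + 1
    image = [[background] * width for _ in range(height)]
    depth: Dict[Tuple[int, int], int] = {}
    for (x, y, z) in geometry:
        px, py = x - x0, y - y0
        if (px, py) not in depth or z > depth[(px, py)]:
            depth[(px, py)] = z
            image[py][px] = voxel_colors[(x, y, z)]
    return image


if __name__ == "__main__":
    pts = [(0.2, 0.1, 0.0), (0.7, 0.4, 0.3), (1.5, 0.2, 2.0), (1.5, 0.2, 5.0)]
    cols = [(10, 20, 30), (30, 40, 50), (255, 0, 0), (0, 255, 0)]
    geom, vcols = split_geometry_and_color(pts, cols)
    print(geom)
    print(project_colors_to_image(geom, vcols))
```

A single projection direction discards the colors of occluded voxels, which is why this is only a sketch of the idea of separating the two streams rather than a usable color coding front end.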