Learning-based Point Cloud Geometry Coding Rate Control

2023 Data Compression Conference (DCC). Pub Date: 2023-03-01. DOI: 10.1109/DCC55655.2023.00079

Manuela Ruivo, André F. R. Guarda, F. Pereira



Abstract


Multimedia applications have been evolving towards providing users with more immersive and realistic experiences. A common way to model the light reaching the users' eyes is the so-called plenoptic function, a powerful 7D representation of light. There are three main types of 3D representation models for the plenoptic function capable of expressing the light information needed to offer 6 Degrees of Freedom (DoF) experiences, namely light fields, meshes, and Point Clouds (PCs). This paper focuses on PCs since they allow objects to be represented and processed directly in 3D space, facilitating user interaction and navigation in a multitude of application domains. Since the illusion of real surfaces is provided by high-density point sets, a good quality of experience requires a rather large set of points to represent a single PC, thus generating huge amounts of data to be stored and/or transmitted. Consequently, PC Coding (PCC) with significant compression levels is a must to reduce the PC data to more manageable sizes and bring PC-based applications to practical deployment. The promising results for image coding led the Joint Photographic Experts Group (JPEG) to launch a standardization project specifically targeting Deep Learning (DL)-based PCC, with a final Call for Proposals in January 2022. The best performing response to this call [1] became the JPEG Pleno Learning-based PCC Verification Model (VM), which is the seed codec for the final standard. In this codec, the rate may be controlled through a set of coding parameters, largely depending on the specific PC being coded, notably its sparsity and homogeneity.
