Adaptive bridge model for compressed domain point cloud classification

IF 2.4 · Zone 4 (Computer Science)
Abdelrahman Seleem, André F. R. Guarda, Nuno M. M. Rodrigues, Fernando Pereira
Eurasip Journal on Image and Video Processing, published 2024-06-08
DOI: 10.1186/s13640-024-00631-6 (https://doi.org/10.1186/s13640-024-00631-6)
Citations: 0

Abstract

The recent adoption of deep learning-based models for the processing and coding of multimedia signals has brought noticeable performance gains, establishing deep learning-based solutions as the uncontested state of the art both for computer vision tasks, which target machine consumption, and, more recently, for coding applications, which target human visualization. Traditionally, applications that need both coding and computer vision processing first decode the bitstream and then apply the computer vision methods to the decompressed multimedia signals. However, deep learning-based solutions enable computer vision processing directly in the compressed domain, improving both performance and computational complexity relative to the decompressed domain approach. For point clouds (PCs), these gains have been demonstrated by the only available compressed domain computer vision processing solution, the Compressed Domain PC Classifier, which processes JPEG Pleno PC coding (PCC) compressed streams using a PC classifier largely compatible with the state-of-the-art spatial domain PointGrid classifier. However, this classifier is strongly limited because it imposes a single, specific input size that is tied to specific JPEG Pleno PCC configurations; this limits compression performance, since those configurations are not ideal for all PCs given their differing characteristics, notably density. To overcome these limitations, this paper proposes the first Adaptive Compressed Domain PC Classifier, which includes a novel adaptive bridge model that makes it possible to process JPEG Pleno PCC bitstreams encoded with different coding configurations, thereby maximizing compression efficiency.
Experimental results show that the novel Adaptive Compressed Domain PC Classifier allows JPEG Pleno PCC to achieve better compression performance because it does not impose a single, specific coding configuration on all PCs regardless of their different characteristics. Moreover, the added adaptability can yield slightly better PC classification performance than the previous Compressed Domain PC Classifier and considerably better PC classification performance (with fewer weights) than the PointGrid PC classifier working in the decompressed domain.


Source journal
Eurasip Journal on Image and Video Processing (Engineering: Electrical and Electronic Engineering)
CiteScore: 7.10
Self-citation rate: 0.00%
Articles published: 23
Review time: 6.8 months
Journal description: EURASIP Journal on Image and Video Processing is intended for researchers from both academia and industry who are active in the multidisciplinary field of image and video processing. The scope of the journal covers all theoretical and practical aspects of the domain, from basic research to the development of applications.