Latest Articles in Pattern Recognition

Trend-aware time series clustering via self-attentive LSTM
IF 7.6 · CAS Tier 1 (Computer Science)
Pattern Recognition · Pub Date: 2025-09-24 · DOI: 10.1016/j.patcog.2025.112455
Chongyan Wu, Bin Yu
{"title":"Trend-aware time series clustering via self-attentive LSTM","authors":"Chongyan Wu,&nbsp;Bin Yu","doi":"10.1016/j.patcog.2025.112455","DOIUrl":"10.1016/j.patcog.2025.112455","url":null,"abstract":"<div><div>Time series clustering aims to partition time series into subsets with similar patterns, uncovering their underlying structures and dynamics. This paper proposes a novel clustering method that integrates polynomial curve fitting, an enhanced self-attention mechanism, and a long short-term memory (LSTM) network. First, the Hodrick-Prescott (HP) filter is applied to denoise the raw time series. Then, polynomial curve fitting (PCF) is employed to extract multi-order derivative features at each time point, capturing local trend information and constructing a high-dimensional feature space. An enhanced self-attention LSTM model is designed to encode both raw and trend-based features into a hidden state sequence, enabling the model to capture key patterns and long-range dependencies. Finally, a distance metric based on the hidden states is defined and incorporated into a hierarchical clustering (HC) algorithm. Experiments on several public univariate datasets with long sequences demonstrate that the proposed method outperforms conventional approaches, offering a robust solution for modeling and interpreting complex time series.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"172 ","pages":"Article 112455"},"PeriodicalIF":7.6,"publicationDate":"2025-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145220477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Cooperative multi-task learning and reliability assessment for glioma segmentation and IDH genotyping
IF 7.6 · CAS Tier 1 (Computer Science)
Pattern Recognition · Pub Date: 2025-09-23 · DOI: 10.1016/j.patcog.2025.112467
Meng Li, Du Jiang, Juntong Yun, Rong Liu, Ying Sun, Gongfa Li
{"title":"Cooperative multi-task learning and reliability assessment for glioma segmentation and IDH genotyping","authors":"Meng Li ,&nbsp;Du Jiang ,&nbsp;Juntong Yun ,&nbsp;Rong Liu ,&nbsp;Ying Sun ,&nbsp;Gongfa Li","doi":"10.1016/j.patcog.2025.112467","DOIUrl":"10.1016/j.patcog.2025.112467","url":null,"abstract":"<div><div>The high heterogeneity of gliomas presents significant challenges in distinguishing isocitrate dehydrogenase (IDH) genotypes based on magnetic resonance imaging (MRI) features. To address this issue, we propose a joint optimization framework based on multi-task learning (MLNet), which enables the simultaneous optimization of glioma segmentation and IDH genotype prediction within a unified framework. First, we design a glioma segmentation network based on a CNN-Transformer hybrid architecture to extract glioma features. Second, feature fusion is employed to provide feature support for the IDH genotyping task. A reliability assessment mechanism is introduced to evaluate the IDH genotyping results, determining whether a secondary assessment is necessary. Finally, we construct a multi-task learning loss function and achieve end-to-end joint training through feature sharing across tasks. We evaluate the proposed method on the BraTs2020 dataset, and comparisons with state-of-the-art methods demonstrate that the multi-task learning method offers superior performance.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"172 ","pages":"Article 112467"},"PeriodicalIF":7.6,"publicationDate":"2025-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145158617","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Language model encoded multi-scale feature fusion and transformation for predicting protein-peptide binding sites
IF 7.6 · CAS Tier 1 (Computer Science)
Pattern Recognition · Pub Date: 2025-09-23 · DOI: 10.1016/j.patcog.2025.112487
Hua Zhang, Pengliang Chen, Xiaoqi Yang, Junhao Wang, Guogen Shan, Bi Chen, Bo Jiang
{"title":"Language model encoded multi-scale feature fusion and transformation for predicting protein-peptide binding sites","authors":"Hua Zhang ,&nbsp;Pengliang Chen ,&nbsp;Xiaoqi Yang ,&nbsp;Junhao Wang ,&nbsp;Guogen Shan ,&nbsp;Bi Chen ,&nbsp;Bo Jiang","doi":"10.1016/j.patcog.2025.112487","DOIUrl":"10.1016/j.patcog.2025.112487","url":null,"abstract":"<div><div>Protein-peptide interactions serve as a pivotal and indispensable role in diverse biological functions and cellular processes. Although recent studies have begun to employ language models for predicting protein-peptide binding sites (PPBS), the majority of previous approaches have persisted in utilizing intricate sequence-based feature engineering or incorporating costly experimental structural information. To overcome these limitations, we develop a novel sequence-based end-to-end PPBS predictor using deep learning, named Language model encoded Multi-scale Feature Fusion and Transformation (LMFFT). The proposed model starts with a single protein language model for comprehensive multi-scale feature extraction, including residue, dipeptide, and fragment-level representations, which are implemented by the dipeptide embedding-based fragment fusion and further enhanced through the dipeptide contextual encoding. Moreover, multi-scale convolutional neural networks are applied to transform multi-scale features by capturing intricate interactions between local and global information. Our LMFFT achieves state-of-the-art performance across three benchmark datasets, outperforming existing sequence-based methods and demonstrating competitive advantages over certain structure-based baselines. This work provides a cost-effective and efficient solution for PPBS prediction, advancing revealing the sequence-function relationship of proteins.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"172 ","pages":"Article 112487"},"PeriodicalIF":7.6,"publicationDate":"2025-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145220364","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Towards unified molecule-enhanced pathology image representation learning via integrating spatial transcriptomics
IF 7.6 · CAS Tier 1 (Computer Science)
Pattern Recognition · Pub Date: 2025-09-23 · DOI: 10.1016/j.patcog.2025.112458
Minghao Han, Dingkang Yang, Jiabei Cheng, Xukun Zhang, Zizhi Chen, Haopeng Kuang, Lihua Zhang
{"title":"Towards unified molecule-enhanced pathology image representation learning via integrating spatial transcriptomics","authors":"Minghao Han ,&nbsp;Dingkang Yang ,&nbsp;Jiabei Cheng ,&nbsp;Xukun Zhang ,&nbsp;Zizhi Chen ,&nbsp;Haopeng Kuang ,&nbsp;Lihua Zhang","doi":"10.1016/j.patcog.2025.112458","DOIUrl":"10.1016/j.patcog.2025.112458","url":null,"abstract":"<div><div>Recent advancements in multimodal pre-training have advanced computational pathology, but current visual-language approaches lack molecular perspective and face performance bottlenecks in clinical settings. Here, we introduce a <strong>U</strong>nified <strong>M</strong>olecule-enhanced <strong>P</strong>athology <strong>I</strong>mage <strong>RE</strong>presentation Learning framework (<span><math><mtext>UMPIRE</mtext></math></span>) that enhances the robustness and generalization capabilities of pathology image analysis across diverse tissue types and sequencing platforms. <span><math><mtext>UMPIRE</mtext></math></span> leverages complementary information from gene expression profiles to guide multimodal pre-training, addressing the challenge of distribution shifts between research and clinical environments. To overcome the scarcity of paired data, we collected more than 4 million entries of spatial transcriptomics gene expression to train the gene encoder. <span><math><mtext>UMPIRE</mtext></math></span> aligns modalities across 697K pathology image-gene expression pairs, creating a foundation model that demonstrates superior generalization across multiple sequencing platforms and downstream tasks without additional fine-tuning. Comprehensive evaluation shows <span><math><mtext>UMPIRE</mtext></math></span>’s effectiveness in gene expression prediction, spot classification, and mutation state prediction in whole slide images, with significant improvements over state-of-the-art methods. Our findings demonstrate how molecular data integration enhances visual pattern recognition in computational pathology, providing a resilient approach for bench-to-bedside translation. The code and pre-trained weights are available at <span><span>https://github.com/Hanminghao/Umpire</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"172 ","pages":"Article 112458"},"PeriodicalIF":7.6,"publicationDate":"2025-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145220233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
HMSNet: Hilbert curve enhanced Mamba for real-time semantic segmentation
IF 7.6 · CAS Tier 1 (Computer Science)
Pattern Recognition · Pub Date: 2025-09-22 · DOI: 10.1016/j.patcog.2025.112457
Lianyin Jia, Aoxiang Gao, Mengjuan Li, Xiaodong Fu, Haihe Zhou, Jiaman Ding
{"title":"HMSNet: Hilbert curve enhanced Mamba for real-time semantic segmentation","authors":"Lianyin Jia ,&nbsp;Aoxiang Gao ,&nbsp;Mengjuan Li ,&nbsp;Xiaodong Fu ,&nbsp;Haihe Zhou ,&nbsp;Jiaman Ding","doi":"10.1016/j.patcog.2025.112457","DOIUrl":"10.1016/j.patcog.2025.112457","url":null,"abstract":"<div><div>Semantic segmentation is a core technology for vehicle perception of the surrounding environment in autonomous driving. However, existing real-time semantic segmentation models face two major challenges: loss of local detail information and inconsistency of intra-class semantic information. To address these issues, we propose a novel network architecture, HMSNet. The network mainly consists of the following three core modules: the Hilbert curve enhanced Visual Mamba Block (HVM Block), Selective Attention Fusion Module (SAFM), and Multi-scale Context-Aware Module (MCAM). The HVM Block utilizes the Hilbert curve to reduce the dimensionality of two-dimensional images and applies a selective scanning algorithm in Mamba, enabling the network to effectively capture local dependencies while maintaining a global receptive field, thereby optimizing the consistency of intra-class semantic information. The SAFM module effectively merges local detail information from shallow networks with global semantic information from deep networks, alleviating the problem of local detail information loss. Finally, the MCAM module, introduced at the end of the network, enhances the model,s ability to judge contextual information, thereby improving segmentation accuracy. Experimental results show that HMSNet achieves an excellent balance between segmentation accuracy and inference speed on challenging public datasets, including CamVid, Cityscapes, and ADE20K.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"172 ","pages":"Article 112457"},"PeriodicalIF":7.6,"publicationDate":"2025-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145220373","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Multi-view biclustering via non-negative matrix tri-factorisation
IF 7.6 · CAS Tier 1 (Computer Science)
Pattern Recognition · Pub Date: 2025-09-20 · DOI: 10.1016/j.patcog.2025.112454
Ella S.C. Orme, Theodoulos Rodosthenous, Marina Evangelou
{"title":"Multi-view biclustering via non-negative matrix tri-factorisation","authors":"Ella S.C. Orme ,&nbsp;Theodoulos Rodosthenous ,&nbsp;Marina Evangelou","doi":"10.1016/j.patcog.2025.112454","DOIUrl":"10.1016/j.patcog.2025.112454","url":null,"abstract":"<div><div>Multi-view data is ever more apparent as methods for production, collection and storage of data become more feasible both practically and fiscally. However, not all features are relevant to describe the patterns for all individuals. Multi-view biclustering aims to simultaneously cluster both rows and columns, discovering clusters of rows as well as their view-specific identifying features. A novel multi-view biclustering approach based on non-negative matrix factorisation is proposed named ResNMTF. Demonstrated through extensive experiments on both synthetic and real datasets, ResNMTF successfully identifies both overlapping and non-exhaustive biclusters, without pre-existing knowledge of the number of biclusters present, and is able to incorporate any combination of shared dimensions across views. Further, to address the lack of a suitable bicluster-specific intrinsic measure, the popular silhouette score is extended to the bisilhouette score. The bisilhouette score is demonstrated to align well with known extrinsic measures, and proves useful as a tool for hyperparameter tuning as well as visualisation.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"172 ","pages":"Article 112454"},"PeriodicalIF":7.6,"publicationDate":"2025-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145158615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
LayerCLIP: A fine-grained class activation map for weakly supervised semantic segmentation
IF 7.6 · CAS Tier 1 (Computer Science)
Pattern Recognition · Pub Date: 2025-09-19 · DOI: 10.1016/j.patcog.2025.112452
Lingma Sun, Le Zou, Xianghu Lv, Zhize Wu, Xiaofeng Wang
{"title":"LayerCLIP: A fine-grained class activation map for weakly supervised semantic segmentation","authors":"Lingma Sun ,&nbsp;Le Zou ,&nbsp;Xianghu Lv,&nbsp;Zhize Wu,&nbsp;Xiaofeng Wang","doi":"10.1016/j.patcog.2025.112452","DOIUrl":"10.1016/j.patcog.2025.112452","url":null,"abstract":"<div><div>Weakly supervised semantic segmentation (WSSS) using image-level labels aims to create pseudo-labels leveraging Class Activation Maps (CAM) to train a separate segmentation model. Recent methods that utilize Contrastive Language-Image Pre-training (CLIP) models have achieved significant advancements. These approaches take advantage of CLIP’s capability to identify various categories without requiring additional training. However, due to the limited local information of the final embedding layer, the CAM generated by the CLIP model is still a rough region with an under-activated or over-activated issue. Furthermore, the abundant multi-layer information of CLIP, which plays a vital role in dense prediction, has been ignored. In this paper, we proposed a LayerCLIP model for a fine-grained CAM generation via hierarchical features, which consists of two consecutive components: a dynamic hierarchical CAMs module and an adaptive affinity module. Specifically, the dynamic hierarchical CAMs module utilizes the hierarchical features to produce two complementary CAMs, along with a dynamic strategy to fuse these CAMs. Subsequently, the affinity based on multi-head self-attention is adaptively reweighted to refine CAM by the CAM itself in the adaptive affinity module. LayerCLIP significantly enhances the quality of CAM. Our method achieves a new state-of-the-art performance on PASCAL VOC 2012 (75.1 % mIoU) and MS COCO 2014 (46.9 % mIoU) through extensive benchmark experiments.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"172 ","pages":"Article 112452"},"PeriodicalIF":7.6,"publicationDate":"2025-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145158666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Tiny object detection based on dynamic scale-awareness label assignment and contextual enhancement
IF 7.6 · CAS Tier 1 (Computer Science)
Pattern Recognition · Pub Date: 2025-09-19 · DOI: 10.1016/j.patcog.2025.112449
Tianyang Zhang, Xiangrong Zhang, Chaozhuo Hua, Guanchun Wang, Xiao Han, Licheng Jiao
{"title":"Tiny object detection based on dynamic scale-awareness label assignment and contextual enhancement","authors":"Tianyang Zhang,&nbsp;Xiangrong Zhang,&nbsp;Chaozhuo Hua,&nbsp;Guanchun Wang,&nbsp;Xiao Han,&nbsp;Licheng Jiao","doi":"10.1016/j.patcog.2025.112449","DOIUrl":"10.1016/j.patcog.2025.112449","url":null,"abstract":"<div><div>The prosperity of recent object detection can not camouflage the deficiencies of tiny object detection. The generic object detectors suffer a dramatic performance degradation on tiny object detection. For this purpose, we present a tiny object detection approach based on Dynamic scale-awareness label assignment and Contextual enhancement (DCNet), which improves the tiny object detection performance from label assignment and feature enhancement perspectives. Considering the IoU-based label assignment seriously harms the positive samples for tiny objects, we design a Dynamic Scale-Awareness (DSA) label assignment to replace it in the region proposal network. The DSA label assignment adaptively rescales preset anchors and introduces the regression information to better assign the preset anchors for tiny objects. Furthermore, the tiny objects often exhibit weak feature responses due to their poor-quality appearance. Therefore, we propose a contextual enhancement module that aggregates contextual information at different scales to enhance tiny objects’ feature responses. Comprehensive experimental analyses on multiple datasets confirm the effectiveness and good generality of our proposed DCNet in tiny object detection.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"172 ","pages":"Article 112449"},"PeriodicalIF":7.6,"publicationDate":"2025-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145220368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
TransSTC: transformer tracker meets efficient spatial-temporal cues
IF 7.6 · CAS Tier 1 (Computer Science)
Pattern Recognition · Pub Date: 2025-09-18 · DOI: 10.1016/j.patcog.2025.112303
Hong Zhang, Wanli Xing, Yifan Yang, Hanyang Liu, Ding Yuan
{"title":"TransSTC: transformer tracker meets efficient spatial-temporal cues","authors":"Hong Zhang ,&nbsp;Wanli Xing ,&nbsp;Yifan Yang ,&nbsp;Hanyang Liu ,&nbsp;Ding Yuan","doi":"10.1016/j.patcog.2025.112303","DOIUrl":"10.1016/j.patcog.2025.112303","url":null,"abstract":"<div><div>Recently, researchers have started developing trackers using the powerful global modeling capabilities of transformer networks. However, existing transformer trackers usually model all template spatial cues indiscriminately and ignore temporal cues of target state changes. This distracts the tracker’s attention and gradually fails to understand the target’s latest state. Therefore, we propose a new tracker called TransSTC, which explores the effective spatial cues in the template and temporal cues during tracking to improve the tracker’s performance. Specifically, we design the target-aware focused coding network to emphasize the efficient spatial cues in the templates, alleviating the impact of spatial cues with low associations of targets in templates on the tracker’s localization accuracy. Additionally, we employ the multi-temporal template update structure that accurately captures variations in the target’s appearance. Within this structure, the collected samples are assessed for target appearance similarity and environmental interference, followed by a three-level sample selection process to ensure the accurate template update. Finally, we introduce the motion constraint framework to dynamically adjust the classification results based on the target’s historical motion trajectory. Extensive experimental results on seven tracking benchmarks demonstrate that TransSTC achieves competitive tracking performance.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"172 ","pages":"Article 112303"},"PeriodicalIF":7.6,"publicationDate":"2025-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145096304","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Dynamic clustering transformer for LiDAR-based 3D object detection
IF 7.6 · CAS Tier 1 (Computer Science)
Pattern Recognition · Pub Date: 2025-09-18 · DOI: 10.1016/j.patcog.2025.112444
Yubo Cui, Zhiheng Li, Zheng Fang
{"title":"Dynamic clustering transformer for LiDAR-based 3D object detection","authors":"Yubo Cui ,&nbsp;Zhiheng Li ,&nbsp;Zheng Fang","doi":"10.1016/j.patcog.2025.112444","DOIUrl":"10.1016/j.patcog.2025.112444","url":null,"abstract":"<div><div>LiDAR perception is a critical task in 3D computer vision. Currently, inspired by the success of vision transformers in 2D images, many LiDAR-based detectors also partition the whole scene point cloud into non-overlapping windows, and perform window attention and window shifting to capture local and global information respectively. While these methods improved performance of LiDAR detection task, they often fail to account for the intrinsic separability of 3D LiDAR point clouds. Unlike 2D images, where objects can overlap and blend into one another, objects in LiDAR are distinct and non-overlapping. In this paper, building upon this insight, we propose the Dynamic Cluster Transformer (DCT), a clustering-based point cloud backbone that incorporates transformer architecture. Our approach is designed to exploit the unique characteristics of LiDAR point clouds, enabling a more efficient 3D feature extraction. Specifically, the DCT architecture comprises two primary modules: Sparse Cluster Generation (SCG) and Cluster Feature Interaction (CFI). The Sparse Cluster Generation is responsible for producing initial sparse cluster features from the entire scene point cloud, providing a basis for local and global feature propagation. The Cluster Feature Interaction then facilitates information propagation between these clusters and surrounding voxels, allowing for a more comprehensive understanding of the spatial relationships. This proposed clustering-based learning process is simple yet effective, conforming to the physical characteristics of LiDAR point clouds. Empirical results demonstrate that DCT achieves state-of-the-art performance on the large-scale Waymo Open Dataset and nuScenes dataset.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"172 ","pages":"Article 112444"},"PeriodicalIF":7.6,"publicationDate":"2025-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145158665","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0