Pattern Recognition Letters: Latest Articles

GAF-Net: A new automated segmentation method based on multiscale feature fusion and feedback module
IF 3.9 | CAS Zone 3 | Computer Science
Pattern Recognition Letters Pub Date: 2024-11-26 DOI: 10.1016/j.patrec.2024.11.025
Long Wen, Yuxing Ye, Lei Zuo
Abstract: Surface defect detection (SDD) is a necessary technique for monitoring the surface quality of production. However, fine-grained defects caused by stress loading, environmental influences, and construction flaws remain challenging to detect. In this research, a convolutional neural network for crack segmentation is developed based on feature fusion and feedback over global and multi-scale features (GAF-Net). First, a multi-scale feature feedback module (MSFF) is proposed, which uses four different scales to refine local features by fusing high-level and sub-high-level features to perform feedback correction. Second, a global feature module (GF) is proposed to generate a fine global information map using local features and adaptive weighted fusion with the correction map for crack detection. Finally, the GAF-Net network with multi-level feature maps is deeply supervised to accelerate training and improve detection accuracy. GAF-Net is trained and evaluated on three publicly available pavement crack datasets, and the results show that it achieves state-of-the-art IoU segmentation scores compared to other deep learning methods (CrackForest: 53.61%; Crack500: 65.19%; DeepCrack: 81.63%).
Citations: 0
Bilateral symmetry-based augmentation method for improved tooth segmentation in panoramic X-rays
IF 3.9 | CAS Zone 3 | Computer Science
Pattern Recognition Letters Pub Date: 2024-11-26 DOI: 10.1016/j.patrec.2024.11.023
Sanket Wathore, Subrahmanyam Gorthi
Abstract: Panoramic X-rays are crucial in dental radiology, providing detailed images essential for diagnosing and planning treatment for various oral conditions. Automated methods that learn from annotated data promise to significantly aid clinical experts in making accurate diagnoses. However, these methods often require large amounts of annotated data, and generating high-quality annotations for panoramic X-rays is both challenging and time-consuming. This paper introduces a novel bilateral symmetry-based augmentation method designed specifically to enhance tooth segmentation in panoramic X-rays. By exploiting the inherent bilateral symmetry of these images, the proposed method systematically generates augmented data, leading to substantial improvements in the performance of tooth segmentation models. By increasing the training data size fourfold, the approach proportionately reduces the effort required to manually annotate extensive datasets. These findings highlight the potential of leveraging the symmetrical properties of medical images to improve model performance and accuracy in dental radiology. The effectiveness of the proposed method is evaluated on three widely adopted deep learning models: U-Net, SE U-Net, and TransUNet. Significant improvements in segmentation accuracy are observed across all models; for example, the average Dice Similarity Coefficient (DSC) increases by over 8%, reaching 76.7% for TransUNet. Further, comparisons with existing augmentation methods, including rigid transform-based and elastic grid-based techniques, show that the proposed method consistently outperforms them, with additional improvements of up to 5% in average DSC, the exact gain varying with the model and training dataset size. The data augmentation code and tools are available at https://github.com/wathoresanket/bilateralsymmetrybasedaugmentation.
Citations: 0
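The paper's exact augmentation pipeline is not reproduced in the abstract; as a rough illustration of the bilateral-symmetry idea, a minimal sketch that generates mirrored variants of an image/mask pair (assuming an approximately left-right symmetric panoramic image; the function and its variants are hypothetical, not the authors' code) might look like:

```python
import numpy as np

def bilateral_symmetry_augment(image: np.ndarray, mask: np.ndarray):
    """Generate (image, mask) pairs by exploiting left-right symmetry.

    Hypothetical sketch: a full horizontal mirror plus two half-swap
    variants; the paper's exact procedure (which quadruples the
    training set) may differ.
    """
    h, w = image.shape[:2]
    half = w // 2
    aug = [(image, mask)]
    # 1. Full horizontal mirror.
    aug.append((image[:, ::-1], mask[:, ::-1]))
    # 2. Left half mirrored onto the right side.
    left = np.concatenate([image[:, :half], image[:, :half][:, ::-1]], axis=1)
    left_m = np.concatenate([mask[:, :half], mask[:, :half][:, ::-1]], axis=1)
    aug.append((left, left_m))
    # 3. Right half mirrored onto the left side.
    right = np.concatenate([image[:, half:][:, ::-1], image[:, half:]], axis=1)
    right_m = np.concatenate([mask[:, half:][:, ::-1], mask[:, half:]], axis=1)
    aug.append((right, right_m))
    return aug
```

Each original pair yields four training pairs, matching the fourfold data increase the abstract reports.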
Segmentation of MRI tumors and pelvic anatomy via cGAN-synthesized data and attention-enhanced U-Net
IF 3.9 | CAS Zone 3 | Computer Science
Pattern Recognition Letters Pub Date: 2024-11-24 DOI: 10.1016/j.patrec.2024.11.003
Mudassar Ali, Haoji Hu, Tong Wu, Maryam Mansoor, Qiong Luo, Weizeng Zheng, Neng Jin
Abstract: Accurate tumor segmentation within MRI images is of great importance for both diagnosis and treatment; however, sufficient annotated datasets are often unavailable. This paper develops a novel approach to segmenting tumors in the brain, liver, and pelvic regions of MRI images by combining an attention-enhanced U-Net with a conditional GAN (cGAN). Three key novelties are introduced: a patch discriminator in the cGAN to enhance the realism of generated images, attention mechanisms in the U-Net to improve segmentation accuracy, and an application to pelvic MRI segmentation, which has seen little exploration. The method addresses the limited availability of annotated data by generating realistic synthetic images to augment training. Experimental results on brain, liver, and pelvic MRI datasets show that the approach outperforms state-of-the-art methods, with Dice Coefficients of 98.61% for brain MRI, 88.60% for liver MRI, and 91.93% for pelvic MRI, along with notable improvements in Hausdorff Distance, especially at complex anatomical regions such as tumor boundaries. The proposed combination of synthetic data generation and novel segmentation techniques opens new perspectives for robust medical image segmentation.
Citations: 0
Incremental component tree contour computation
IF 3.9 | CAS Zone 3 | Computer Science
Pattern Recognition Letters Pub Date: 2024-11-23 DOI: 10.1016/j.patrec.2024.11.019
Dennis J. Silva, Jiří Kosinka, Ronaldo F. Hashimoto, Jos B.T.M. Roerdink, Alexandre Morimitsu, Wonder A.L. Alves
Abstract: A component tree is a graph representation that encodes the connected components of the upper or lower level sets of a grayscale image; consequently, its nodes represent binary images of the encoded connected components. Various algorithms efficiently extract information and node attributes from a component tree by incrementally exploiting the subset relation encoded in the tree. However, to the best of the authors' knowledge, no such incremental approach exists for extracting the contours of the nodes. This paper proposes an efficient incremental method to compute the contours of component tree nodes by counting the edges (sides) of contour pixels. The method's time complexity is discussed, and experiments show that it is faster than the standard approach based on node reconstruction.
Citations: 0
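The incremental tree-based algorithm itself is not given in the abstract, but its underlying primitive, counting the exposed sides of contour pixels, can be sketched for a single binary node (a non-incremental illustration under 4-connectivity, not the paper's algorithm):

```python
import numpy as np

def contour_edge_count(binary: np.ndarray) -> int:
    """Count exposed pixel sides (4-connectivity) of a binary region.

    A pixel side is exposed when the neighboring cell in that direction
    is background; summing exposed sides over the region gives its
    contour length in edges.
    """
    b = np.pad(binary.astype(bool), 1, constant_values=False)
    exposed = 0
    for dy, dx in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
        # Shift the region by one cell; padding makes wrap-around harmless.
        neighbor = np.roll(np.roll(b, dy, axis=0), dx, axis=1)
        exposed += np.count_nonzero(b & ~neighbor)
    return exposed
```

A single pixel has 4 exposed sides, a 2x2 block has 8; the incremental method would update such counts along the tree instead of recomputing them per node.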
Multichannel image classification based on adaptive attribute profiles
IF 3.9 | CAS Zone 3 | Computer Science
Pattern Recognition Letters Pub Date: 2024-11-23 DOI: 10.1016/j.patrec.2024.11.015
Wonder A.L. Alves, Wander S. Campos, Charles F. Gobber, Dennis J. Silva, Ronaldo F. Hashimoto
Abstract: Morphological Attribute Profiles serve as powerful tools for extracting meaningful features from remote sensing data. Their construction relies on two primary parameters: the choice of attribute type and the definition of a numerical threshold sequence. Selecting an appropriate threshold sequence can be difficult, as an inappropriate choice can lead to an uninformative feature space. This paper proposes a semi-automatic approach based on the theory of Maximally Stable Extremal Regions (MSER) to address this challenge. The approach takes an increasing attribute type and an initial sequence of thresholds as input and locally adjusts the threshold values based on region stability within the image. Experimental results demonstrate that the method significantly increases classification accuracy through the refinement of threshold values.
Citations: 0
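The local adjustment of thresholds by region stability can be illustrated with a toy MSER-style criterion. Here `areas` is a hypothetical mapping from candidate thresholds to the area of the extremal region each produces; the paper's exact stability formulation may differ:

```python
def most_stable_threshold(areas: dict, delta: int = 1) -> int:
    """Pick the threshold whose region area changes least, relative to
    its own area, across a +/- delta neighborhood (MSER-style stability).

    Toy sketch of the refinement idea, not the paper's procedure.
    """
    ts = sorted(areas)
    best_t, best_s = None, float("inf")
    for i in range(delta, len(ts) - delta):
        t = ts[i]
        stability = abs(areas[ts[i + delta]] - areas[ts[i - delta]]) / areas[t]
        if stability < best_s:
            best_s, best_t = stability, t
    return best_t
```

A threshold sitting on an area plateau (a stable region) wins over thresholds where the region is growing or collapsing quickly.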
Generation of super-resolution for medical image via a self-prior guided Mamba network with edge-aware constraint
IF 3.9 | CAS Zone 3 | Computer Science
Pattern Recognition Letters Pub Date: 2024-11-22 DOI: 10.1016/j.patrec.2024.11.020
Zexin Ji, Beiji Zou, Xiaoyan Kui, Hua Li, Pierre Vera, Su Ruan
Abstract: Existing deep learning-based super-resolution approaches usually depend on convolutional neural network (CNN) or Transformer backbones. CNN-based approaches cannot model long-range dependencies, whereas Transformer-based approaches incur significant computational burdens due to the quadratic complexity of their calculations. Moreover, high-frequency texture details in images generated by existing approaches remain indistinct, a major challenge in super-resolution tasks. To overcome these problems, a self-prior guided Mamba network with edge-aware constraint (SEMambaSR) is proposed for medical image super-resolution. Recently, State Space Models (SSMs), notably Mamba, have gained prominence for their ability to efficiently model long-range dependencies with low complexity. This paper integrates Mamba into a U-Net, allowing the extraction of multi-scale local and global features to generate high-quality super-resolution images. Additionally, perturbations are introduced by randomly adding a brightness window to the input image, enabling the network to mine the image's self-prior information. An improved 2D-Selective-Scan (ISS2D) module is designed to learn and adaptively fuse multi-directional long-range dependencies in image features to enhance feature representation. An edge-aware constraint is exploited to learn multi-scale edge information from encoder features for better synthesis of texture boundaries. Qualitative and quantitative experiments indicate superior super-resolution performance over current methods on the IXI and BraTS2021 medical datasets. Specifically, the approach achieves a PSNR of 33.44 dB and an SSIM of 0.9371 on IXI, and a PSNR of 41.99 dB and an SSIM of 0.9846 on BraTS2021, both for 2x upsampling. A downstream brain tumor segmentation task using a U-Net also confirms the effectiveness of the approach, with a mean Dice Score of 57.06% on the BraTS2021 dataset.
Citations: 0
Prototypical class-wise test-time adaptation
IF 3.9 | CAS Zone 3 | Computer Science
Pattern Recognition Letters Pub Date: 2024-11-22 DOI: 10.1016/j.patrec.2024.10.011
Hojoon Lee, Seunghwan Lee, Inyoung Jung, Sungeun Hong
Abstract: Test-time adaptation (TTA) refines pre-trained models during deployment, enabling them to effectively handle new, previously unseen data. However, existing TTA methods focus mainly on global domain alignment, which reduces domain-level gaps but often leads to suboptimal performance, because class-wise alignment is not explicitly considered; this causes errors when reliable pseudo-labels are unavailable and source domain samples are inaccessible. This study proposes a prototypical class-wise test-time adaptation method consisting of class-wise prototype adaptation and reliable pseudo-labeling. A main challenge in this approach is the lack of direct access to source domain samples; the method instead leverages the class-specific knowledge contained in the weights of the pre-trained model. To construct class prototypes from the unlabeled target domain, a methodology is further introduced to enhance the reliability of pseudo-labels. The method is adaptable to various models and has been extensively validated, consistently outperforming baselines across multiple benchmark datasets.
Citations: 0
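The nearest-prototype pseudo-labeling idea can be sketched as follows. This is a simplified stand-in: prototype construction from classifier weights and the paper's reliability criterion are abstracted away, and the temperature and `tau` threshold are assumed values:

```python
import numpy as np

def prototype_pseudo_labels(features, prototypes, tau=0.5, temperature=5.0):
    """Label target features by cosine similarity to class prototypes
    and flag the confident ones.

    Simplified sketch of class-wise pseudo-labeling, not the paper's
    full adaptation procedure.
    """
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    sims = f @ p.T                          # cosine similarities, in [-1, 1]
    exp = np.exp(temperature * sims)        # temperature-scaled softmax
    probs = exp / exp.sum(axis=1, keepdims=True)
    labels = probs.argmax(axis=1)
    reliable = probs.max(axis=1) >= tau     # keep only confident labels
    return labels, reliable
```

Only the `reliable` subset would then drive prototype updates, limiting error accumulation from noisy pseudo-labels.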
Detailed evaluation of a population-wise personalization approach to generate synthetic myocardial infarct images
IF 3.9 | CAS Zone 3 | Computer Science
Pattern Recognition Letters Pub Date: 2024-11-22 DOI: 10.1016/j.patrec.2024.11.017
Anastasia Konik, Patrick Clarysse, Nicolas Duchateau
Abstract: Personalization of biophysical models to real data is essential for achieving realistic simulations or generating relevant synthetic populations. However, some of these models involve randomness, which poses two challenges: they do not allow standard personalization to each individual's data, and they lack the analytical formulation required for optimization. In previous work, the authors introduced a population-based personalization strategy that overcomes these challenges and demonstrated its feasibility on simple 2D geometrical models of myocardial infarct. The method matches the distributions of the synthetic and real populations, quantified through the Kullback-Leibler (KL) divergence. Personalization is achieved with a gradient-free algorithm (CMA-ES), which generates sets of candidate solutions represented by their covariance matrix, whose coefficients evolve until the synthetic and real data are matched. However, the robustness of this strategy with respect to settings and more complex data had not been challenged. This work specifically addresses these points, with (i) an improved design, (ii) a thorough evaluation of crucial aspects of the personalization process, including hyperparameters and initialization, and (iii) application to 3D data. Despite some limits of the simple geometrical models used, the method captures the main characteristics of the real data, as demonstrated on both 2D and 3D segmented late gadolinium enhancement images of 123 subjects with acute myocardial infarction.
Citations: 0
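When both populations are summarized as multivariate Gaussians, the KL divergence being minimized has a closed form; a sketch of that quantity follows (an assumption here, since the paper may estimate the divergence differently):

```python
import numpy as np

def gaussian_kl(mu0, cov0, mu1, cov1):
    """KL(N(mu0, cov0) || N(mu1, cov1)) in closed form.

    One way to quantify the synthetic-vs-real population mismatch that
    a gradient-free optimizer such as CMA-ES could minimize.
    """
    d = mu0.shape[0]
    inv1 = np.linalg.inv(cov1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(inv1 @ cov0) + diff @ inv1 @ diff - d
                  + np.log(np.linalg.det(cov1) / np.linalg.det(cov0)))
```

The divergence is zero only when the two Gaussians coincide, which matches the stopping condition of the matching loop described in the abstract.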
Improving ViT interpretability with patch-level mask prediction
IF 3.9 | CAS Zone 3 | Computer Science
Pattern Recognition Letters Pub Date: 2024-11-22 DOI: 10.1016/j.patrec.2024.11.018
Junyong Kang, Byeongho Heo, Junsuk Choe
Abstract: Vision Transformers (ViTs) have demonstrated remarkable performance on various computer vision tasks. Attention scores are often used to explain the decision-making process of ViTs, showing which tokens are more important than others. However, attention scores have several limitations as explanations for ViTs, such as conflicting with other explainability methods or highlighting unrelated tokens. To address this, a novel method is proposed for generating a visual explanation map from ViTs. Unlike previous approaches that rely on attention scores, the method leverages ViT features and conducts a single forward pass through a Patch-level Mask prediction (PM) module. The resulting visual explanation map provides a class-dependent, probabilistic interpretation that can identify regions crucial to model decisions. Experimental results demonstrate that the approach outperforms previous techniques in both classification and interpretability, and it can be applied to weakly-supervised object localization (WSOL) tasks using pseudo mask labels. The method requires no extra parameters and needs only minimal locality supervision, using less than 1% of the ImageNet-1k training dataset.
Citations: 0
GANzzle++: Generative approaches for jigsaw puzzle solving as local to global assignment in latent spatial representations
IF 3.9 | CAS Zone 3 | Computer Science
Pattern Recognition Letters Pub Date: 2024-11-19 DOI: 10.1016/j.patrec.2024.11.010
Davide Talon, Alessio Del Bue, Stuart James
Abstract: Jigsaw puzzles are a popular and enjoyable pastime that humans can easily solve, even with many pieces. However, solving a jigsaw is a combinatorial problem: the space of possible solutions is exponential in the number of pieces and intractable for pairwise approaches. In contrast to the classical pairwise local matching of pieces based on edge heuristics, an approximate solution image, i.e., a "mental image" of the puzzle, is estimated and exploited to guide the placement of pieces as a piece-to-global assignment problem. From unordered pieces, conditioned generation approaches, including Generative Adversarial Network (GAN) models, Slot Attention (SA), and Vision Transformers (ViT), are considered to recover the solution image. Given the generated solution representation, jigsaw solving is cast as a 1-to-1 assignment matching problem using Hungarian attention, which places pieces in their corresponding positions in the global solution estimate. Results show that the newly proposed GANzzle-SA and GANzzle-VIT benefit from an early fusion strategy in which pieces are jointly compressed and gathered for global structure recovery. A single deep learning model generalizes to puzzles of different sizes and improves performance by a large margin. Evaluated on PuzzleCelebA and PuzzleWikiArts, the approaches bridge the gap between deep learning strategies and optimization-based classic puzzle solvers.
Citations: 0
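The 1-to-1 piece-to-slot matching can be illustrated with a tiny exact assignment solver. This brute-force version is only a sketch of the matching objective: the paper uses Hungarian attention inside the network, and a practical solver would use the Hungarian algorithm rather than enumeration:

```python
from itertools import permutations

def best_assignment(cost):
    """Return the minimal-cost 1-to-1 assignment piece -> slot.

    cost[i][j] is the cost of placing piece i in slot j (e.g. negative
    similarity between the piece and the corresponding location in the
    estimated solution image). Exhaustive search, fine only for tiny n.
    """
    n = len(cost)
    best_perm, best_c = None, float("inf")
    for perm in permutations(range(n)):
        c = sum(cost[i][perm[i]] for i in range(n))
        if c < best_c:
            best_c, best_perm = c, perm
    return list(best_perm)
```

With costs derived from piece-to-global similarities, the returned permutation is the global placement of all pieces at once, instead of greedy pairwise matching.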