Pattern Recognition Letters: Latest Articles

Optimal word order for non-causal text generation with Large Language Models: The Spanish case
IF 3.9 · CAS Tier 3 · Computer Science
Pattern Recognition Letters · Pub Date: 2025-02-17 · DOI: 10.1016/j.patrec.2025.02.010
Andrea Busto-Castiñeira, Silvia García-Méndez, Francisco de Arriba-Pérez, Francisco J. González-Castaño
Abstract: Natural Language Generation (NLG) has grown in popularity owing to progress in Large Language Models (LLMs) with zero-shot inference capabilities. However, most neural systems use decoder-only causal (unidirectional) transformer models, which are effective for English but may reduce the richness of languages with less strict word order, subject omission, or different relative-clause attachment preferences. This is the first work to analytically address optimal text generation order for non-causal language models. We present a novel Viterbi-algorithm-based methodology for maximum likelihood word order estimation. We analyze the non-causal maximum-likelihood order probability for NLG in Spanish and then the probability of generating the same phrases with Spanish causal NLG. This comparative analysis reveals that causal NLG prefers English-like SVO structures. We also analyze the relationship between the optimal generation order and the causal left-to-right generation order using Spearman's rank correlation. Our results demonstrate that the ideal order predicted by the maximum likelihood estimator is not closely related to the causal order and may be influenced by the syntactic structure of the target sentence.
Volume 190, Pages 89-96
Citations: 0
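The Spearman rank correlation used in the comparative analysis above can be illustrated with a minimal pure-Python sketch. The token orders below are hypothetical examples, not values from the paper:

```python
def spearman_rho(x, y):
    """Spearman's rank correlation for paired sequences without ties,
    via the closed form 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(x)
    rank = lambda s: {v: i for i, v in enumerate(sorted(s))}
    rx, ry = rank(x), rank(y)
    d2 = sum((rx[a] - ry[b]) ** 2 for a, b in zip(x, y))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Hypothetical 5-token sentence: causal left-to-right generation order
# vs. a maximum-likelihood-estimated generation order.
causal_order = [0, 1, 2, 3, 4]
optimal_order = [2, 0, 1, 4, 3]
print(spearman_rho(causal_order, optimal_order))  # → 0.6
```

A value near 1 would indicate the optimal order closely tracks left-to-right generation; values near 0, as the paper reports, indicate little relation.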
BGI-Net: Bilayer Graph Inference Network for Low Light Image Enhancement
IF 3.9 · CAS Tier 3 · Computer Science
Pattern Recognition Letters · Pub Date: 2025-02-10 · DOI: 10.1016/j.patrec.2025.02.001
Sihai Qiao, Tong Wang, Zhanao Xue, Rong Chen
Abstract: In complex industrial environments, enhancing low-light images is crucial for product inspection and fault monitoring. However, current methods often overlook the global structural and local texture similarities in industrial images captured under low-light conditions. To address this issue, we propose BGI-Net, a low-light image enhancement framework based on graph convolutional networks (GCNs). Considering that industrial images commonly contain similar regions that are heavily affected by noise, we designed a denoising module and an enhancement optimization module. The denoising module employs multi-image fusion techniques to efficiently extract information from dimly lit environments. The enhancement optimization module refines images by optimizing graph nodes in both the spatial and channel dimensions, leveraging clear image nodes to guide areas with a high signal-to-noise ratio and thereby restoring noise-corrupted details. Extensive qualitative and quantitative evaluations on synthetic and real low-light image datasets demonstrate that our method outperforms state-of-the-art (SoTA) techniques in enhancing the robustness of industrial low-light images.
Volume 190, Pages 29-34
Citations: 0
Fast approximate maximum common subgraph computation
IF 3.9 · CAS Tier 3 · Computer Science
Pattern Recognition Letters · Pub Date: 2025-02-10 · DOI: 10.1016/j.patrec.2025.02.006
Mathias Fuchs, Kaspar Riesen
Abstract: The computation of the maximum common subgraph (MCS) is one of the most prevalent problems in graph-based data science. However, state-of-the-art algorithms for exact MCS computation have exponential time complexity. In fact, finding the MCS of two general graphs is an NP-complete problem, so an exact algorithm with polynomial time complexity is possible only if P = NP. In the present paper, we thoroughly compare a novel concept called the matching-graph, which is essentially the stable core of a pair of graphs, to the MCS. In particular, we investigate whether these matching-graphs, computable in polynomial time, offer a viable approximation of the MCS. The contribution of this paper is twofold. First, we demonstrate that for specific graphs a matching-graph equals the maximum common edge subgraph, and thus its size provides an upper bound on the size of the maximum common induced subgraph. Second, in an experimental evaluation on seven graph datasets, we empirically confirm that the proposed matching-graph computation outperforms existing MCS (approximation) algorithms in terms of both computation time and classification accuracy.
Volume 190, Pages 66-72
Citations: 0
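For intuition on why exact MCS computation is expensive, a brute-force maximum common induced subgraph search can be sketched as follows. This is a generic exponential-time illustration on hypothetical toy graphs, not the paper's matching-graph method:

```python
from itertools import combinations, permutations

def mcis_size(g1, g2):
    """Vertex count of the maximum common induced subgraph of two small
    graphs, by exhaustive search over injective vertex mappings.
    Graphs are dicts mapping a vertex to its neighbour set.
    Exponential time: for tiny illustrative inputs only."""
    v1, v2 = list(g1), list(g2)
    for k in range(min(len(v1), len(v2)), 0, -1):
        for s1 in combinations(v1, k):
            for s2 in permutations(v2, k):  # maps s1[i] -> s2[i]
                if all((s1[j] in g1[s1[i]]) == (s2[j] in g2[s2[i]])
                       for i, j in combinations(range(k), 2)):
                    return k
    return 0

# Hypothetical toy graphs: a triangle and a 3-vertex path.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
path = {'a': {'b'}, 'b': {'a', 'c'}, 'c': {'b'}}
print(mcis_size(triangle, path))  # → 2 (they share only a single edge)
```

Even at this scale the search enumerates all injective mappings per subset size, which is what polynomial-time approximations such as matching-graphs aim to avoid.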
Pre-image free graph machine learning with Normalizing Flows
IF 3.9 · CAS Tier 3 · Computer Science
Pattern Recognition Letters · Pub Date: 2025-02-10 · DOI: 10.1016/j.patrec.2025.02.005
Clément Glédel, Benoît Gaüzère, Paul Honeine
Abstract: Nonlinear embeddings are central in machine learning (ML). However, they often suffer from insufficient interpretability due to restricted access to the latent space. To improve interpretability, elements of the latent space need to be represented in the input space. The process of finding such an inverse transformation is known as the pre-image problem. This challenging task is especially difficult when dealing with complex and discrete data represented by graphs. In this paper, we propose a framework for defining ML models that do not suffer from the pre-image problem. This framework is based on Normalizing Flows (NF), which generate the latent space by learning both the forward and inverse transformations. From this framework, we propose two specifications for designing models for predictive tasks, namely classification and regression. As a result, our approaches achieve good predictive performance and can generate the pre-image of any element in the latent space. Our experimental results highlight the predictive capabilities and the proficiency in generating graph pre-images, emphasizing the versatility and effectiveness of our approaches for graph machine learning.
Volume 190, Pages 45-51
Citations: 0
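The pre-image-free property rests on using transformations that are invertible by construction. A deliberately simplified scalar affine flow, a stand-in for the paper's graph-based NF models, shows the forward/inverse pair that removes the need for a pre-image search:

```python
class AffineFlow:
    """Minimal invertible map: z = a*x + b, with exact inverse
    x = (z - b) / a. Real normalizing flows compose many such
    invertible layers; this scalar version is only illustrative."""
    def __init__(self, a, b):
        assert a != 0, "scale must be nonzero for invertibility"
        self.a, self.b = a, b

    def forward(self, x):
        return self.a * x + self.b

    def inverse(self, z):
        return (z - self.b) / self.a

flow = AffineFlow(2.0, 1.0)
z = flow.forward(3.0)        # embed an input into the latent space
print(flow.inverse(z))       # → 3.0, the original input, with no search
```

Kernel-based embeddings lack such a closed-form inverse, which is why recovering inputs from latent points otherwise requires solving an optimization problem.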
Surgical text-to-image generation
IF 3.9 · CAS Tier 3 · Computer Science
Pattern Recognition Letters · Pub Date: 2025-02-10 · DOI: 10.1016/j.patrec.2025.02.002
Chinedu Innocent Nwoye, Rupak Bose, Kareem Elgohary, Lorenzo Arboit, Giorgio Carlino, Joël L. Lavanchy, Pietro Mascagni, Nicolas Padoy
Abstract: Acquiring surgical data for research and development is significantly hindered by high annotation costs and practical and ethical constraints. Synthetically generated images present a valuable alternative. In this work, we explore adapting text-to-image generative models to the surgical domain using the CholecT50 dataset, which provides surgical images annotated with action triplets (instrument, verb, target). We investigate several language models and find that T5 offers more distinct features for differentiating surgical actions from triplet-based textual inputs, as well as stronger alignment between long and triplet-based captions. Addressing the challenge of training text-to-image models solely on triplet-based captions without additional input signals, we observe that triplet text embeddings are instrument-centric in the latent space. Leveraging this insight, we design an instrument-based class balancing technique to counteract data imbalance and skewness, improving training convergence. Extending Imagen, a diffusion-based generative model, we develop Surgical Imagen to generate photorealistic and activity-aligned surgical images from triplet-based textual prompts. We assess the model on quality, alignment, reasoning, and knowledge, achieving FID and CLIP scores of 3.7 and 26.8%, respectively. A human expert survey shows that participants were highly challenged by the realistic characteristics of the generated samples, demonstrating Surgical Imagen's effectiveness as a practical alternative to real data collection.
Volume 190, Pages 73-80
Citations: 0
Synthetic image learning: Preserving performance and preventing Membership Inference Attacks
IF 3.9 · CAS Tier 3 · Computer Science
Pattern Recognition Letters · Pub Date: 2025-02-08 · DOI: 10.1016/j.patrec.2025.02.003
Eugenio Lomurno, Matteo Matteucci
Abstract: Generative artificial intelligence has transformed the generation of synthetic data, providing innovative solutions to challenges like data scarcity and privacy, which are particularly critical in fields such as medicine. However, the effective use of this synthetic data to train high-performance models remains a significant challenge. This paper addresses this issue by introducing Knowledge Recycling (KR), a pipeline designed to optimise the generation and use of synthetic data for training downstream classifiers. At the heart of this pipeline is Generative Knowledge Distillation, the proposed technique that significantly improves the quality and usefulness of the information provided to classifiers through a synthetic dataset regeneration and soft labelling mechanism. The KR pipeline has been tested on a variety of datasets, with a focus on six highly heterogeneous medical image datasets, ranging from retinal images to organ scans. The results show a significant reduction in the performance gap between models trained on real and synthetic data, with models trained on synthetic data outperforming those trained on real data in some cases. Furthermore, the resulting models show almost complete immunity to Membership Inference Attacks, exhibiting privacy properties missing in models trained with conventional techniques.
Volume 190, Pages 52-58
Citations: 0
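The soft-labelling mechanism at the core of knowledge distillation can be sketched generically: a teacher's temperature-softened class distribution supervises a student via KL divergence. This is a textbook-style sketch with hypothetical logits, not the paper's Generative Knowledge Distillation pipeline:

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to a probability distribution; a higher
    temperature yields softer (more informative) labels."""
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    the standard soft-label distillation objective."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher = [2.0, 0.5, -1.0]   # hypothetical teacher logits on one sample
student = [1.8, 0.7, -0.9]   # hypothetical student logits
print(distillation_loss(teacher, student))
```

Unlike hard one-hot labels, the soft targets carry the teacher's relative confidence across classes, which is what makes a regenerated synthetic dataset more informative for the downstream classifier.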
A novel loss function for early event detection based on infinite mixture prototypes for facial expressions
IF 3.9 · CAS Tier 3 · Computer Science
Pattern Recognition Letters · Pub Date: 2025-02-08 · DOI: 10.1016/j.patrec.2025.01.033
Zhi-Fang Yang, Heng Chu
Abstract: Early event detection aims to identify events at their early stages, enabling timely interventions and responses. In our prior work (Wang et al., 2022), a primal estimated sub-gradient solver replaced quadratic programming in the max-margin early event detector (MMED) (Hoai and De la Torre, 2014), enhancing computational efficiency. In another study (Yang and Chiu, 2024), infinite mixture prototypes (IMP) (Allen et al., 2019), integrated with prototypical networks, were applied to early event detection, combining few-shot learning and multimodal data. MMED's loss function was adapted into a re-scaling function to incorporate early event information, though further refinement was required. In this work, we introduce a novel loss function for facial expression early event detection based on IMP. This function increases penalties for late detection and incorporates scalars to better account for the influence of early events on model performance. Experimental results show that the proposed loss function significantly improves accuracy while maintaining consistent event detection length.
Volume 190, Pages 59-65
Citations: 0
FracNet: An end-to-end deep learning framework for bone fracture detection
IF 3.9 · CAS Tier 3 · Computer Science
Pattern Recognition Letters · Pub Date: 2025-02-08 · DOI: 10.1016/j.patrec.2025.01.034
Haider A. Alwzwazy, Laith Alzubaidi, Zehui Zhao, Yuantong Gu
Abstract: Fracture detection in medical imaging is crucial for accurate diagnosis and treatment planning in orthopaedic care. Traditional deep learning (DL) models often struggle with small, complex, and varied fracture datasets, leading to unreliable results. We propose FracNet, an end-to-end DL framework specifically designed for bone fracture detection using self-supervised pretraining, feature fusion, attention mechanisms, feature selection, and advanced visualisation tools. FracNet achieves a detection accuracy of 100% on three datasets, consistently outperforming existing methods in terms of accuracy and reliability. Furthermore, FracNet improves decision transparency by providing clear explanations of its predictions, making it a valuable tool for clinicians. FracNet adapts readily to new datasets with minimal training requirements. Although its primary focus is fracture detection, FracNet is scalable to various other medical imaging applications.
Volume 190, Pages 1-7
Citations: 0
Innovative multi-modal approaches to Alzheimer’s disease detection: Transformer hybrid model and adaptive MLP-Mixer
IF 3.9 · CAS Tier 3 · Computer Science
Pattern Recognition Letters · Pub Date: 2025-02-07 · DOI: 10.1016/j.patrec.2025.01.029
Rahma Kadri, Bassem Bouaziz, Mohamed Tmar, Faiez Gargouri
Abstract: This paper introduces advanced methodologies to enhance Alzheimer's disease detection. A novel transformer-based hybrid model is proposed, combining adaptive sparse attention and multi-head dilated self-attention to leverage the unique strengths of both attention mechanisms. Additionally, an innovative adaptive MLP-Mixer model is presented. Several multi-modal fusion techniques are incorporated based on these models. The adaptive MLP-Mixer achieved an accuracy of 96% in mid-level fusion of the MRI and DTI modalities. Furthermore, a late fusion method using the same architecture with the MRI and sfMRI modalities achieved 98.56% accuracy. For cross-modal fusion, the MRI and PET modalities were combined using the transformer-based hybrid model, resulting in an accuracy of 99.98%. Experiments were conducted on the well-known Alzheimer's Disease Neuroimaging Initiative (ADNI) and Open Access Series of Imaging Studies (OASIS) datasets to assess the effectiveness of the proposed methods. The results demonstrate high performance compared to many recent transformer- and CNN-based approaches.
Volume 190, Pages 15-21
Citations: 0
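Late (decision-level) fusion, one of the strategies above, combines per-modality classifier outputs rather than their features. A generic weighted-averaging sketch with hypothetical probabilities; the paper's actual fusion uses its transformer hybrid and MLP-Mixer architectures:

```python
def late_fusion(probs_per_modality, weights=None):
    """Decision-level fusion: weighted average of class-probability
    vectors, one vector per modality. Equal weights by default."""
    n = len(probs_per_modality)
    weights = weights or [1.0 / n] * n
    num_classes = len(probs_per_modality[0])
    return [sum(w * p[c] for w, p in zip(weights, probs_per_modality))
            for c in range(num_classes)]

# Hypothetical softmax outputs from two single-modality classifiers
# (e.g. class 0 = Alzheimer's disease, class 1 = cognitively normal).
mri_probs = [0.8, 0.2]
fmri_probs = [0.6, 0.4]
print(late_fusion([mri_probs, fmri_probs]))
```

Late fusion keeps each modality's pipeline independent, which is why it can reuse "the same architecture" per modality, whereas mid-level and cross-modal fusion mix representations inside the network.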
Adapting a total vertex order to the geometry of a connected component
IF 3.9 · CAS Tier 3 · Computer Science
Pattern Recognition Letters · Pub Date: 2025-02-07 · DOI: 10.1016/j.patrec.2025.01.030
Majid Banaeyan, Walter G. Kropatsch
Abstract: Irregular graph pyramids are frequently used as powerful tools in pattern recognition and image processing. They are built by merging specific vertices and edges, known as contraction kernels, at each level. Traditional methods often select these kernels randomly, leading to an unpredictable vertex at the top of the pyramid. This paper presents innovative methods to control the selection of contraction kernels, enabling the intentional preservation of a vertex with desired properties at the pyramid's top. Specifically, we focus on maintaining the center of a connected component (CC) at the pyramid's apex. To calculate the center of a region, we use the eccentricity transform, which is robust against noise. Our approach begins by establishing a total vertex order and then devises solutions for continuous spaces in both 1D and 2D. Subsequently, we adapt these continuous-space solutions to discrete spaces, again in both 1D and 2D. The experimental results demonstrate the efficacy and validity of our proposed methods.
Volume 190, Pages 8-14
Citations: 0