Knowledge-Based Systems: Latest Articles

Multivariate time series generation based on dual-channel Transformer conditional GAN for industrial remaining useful life prediction
IF 7.2 | Q1 (Computer Science)
Knowledge-Based Systems | Pub Date: 2024-11-20 | DOI: 10.1016/j.knosys.2024.112749
Zhizheng Zhang, Hui Gao, Wenxu Sun, Wen Song, Qiqiang Li
{"title":"Multivariate time series generation based on dual-channel Transformer conditional GAN for industrial remaining useful life prediction","authors":"Zhizheng Zhang,&nbsp;Hui Gao,&nbsp;Wenxu Sun,&nbsp;Wen Song,&nbsp;Qiqiang Li","doi":"10.1016/j.knosys.2024.112749","DOIUrl":"10.1016/j.knosys.2024.112749","url":null,"abstract":"<div><div>Remaining useful life (RUL) prediction is a key enabler of predictive maintenance. While deep learning based prediction methods have made great progress, the data imbalance issue caused by limited run-to-failure data severely undermines their performance. Some recent works employ generative adversarial network (GAN) to tackle this issue. However, most GAN-based generative methods have difficulties in simultaneously extracting correlations of different time steps and sensors. In this paper, we propose dual-channel Transformer conditional GAN (DCTC-GAN), a novel multivariate time series (MTS) generation framework, to generate high-quality MTS to enhance deep learning based RUL prediction models. We design a novel dual-channel Transformer architecture to construct the generator and discriminator, which consists of a temporal encoder and a spatial encoder that work in parallel to automatically pay different attention to different time steps and sensors. Based on this, DCTC-GAN can directly extract the long-distance temporal relations of different time steps while capturing the spatial correlations of different sensors to synthesize high-quality MTS data. Experimental analysis on widely used turbofan engine dataset and FEMTO bearing dataset demonstrates that our DCTC-GAN significantly enhances the performance of existing deep learning models for RUL prediction, without changing its structure, and exceeds the capabilities of current representative generative methods.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"308 ","pages":"Article 112749"},"PeriodicalIF":7.2,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142704950","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Can question-texts improve the recognition of handwritten mathematical expressions in respondents’ solutions?
IF 7.2 | Q1 (Computer Science)
Knowledge-Based Systems | Pub Date: 2024-11-20 | DOI: 10.1016/j.knosys.2024.112731
Ting Zhang, Xinxin Jin, Xiaoyang Ma, Xinzi Peng, Yiyang Zhao, Jinzheng Liu, Xinguo Yu
{"title":"Can question-texts improve the recognition of handwritten mathematical expressions in respondents’ solutions?","authors":"Ting Zhang,&nbsp;Xinxin Jin,&nbsp;Xiaoyang Ma,&nbsp;Xinzi Peng,&nbsp;Yiyang Zhao,&nbsp;Jinzheng Liu,&nbsp;Xinguo Yu","doi":"10.1016/j.knosys.2024.112731","DOIUrl":"10.1016/j.knosys.2024.112731","url":null,"abstract":"<div><div>The accurate recognition of respondents’ handwritten solutions is important for implementing intelligent diagnosis and tutoring. This task is significantly challenging because of scribbled and irregular writing, especially when handling primary or secondary students whose handwriting has not yet been fully developed. Recognition becomes difficult in such cases even for humans relying only on the visual signals of handwritten content without any context. However, despite decades of work on handwriting recognition, few studies have explored the idea of utilizing external information (question priors) to improve the accuracy. Based on the correlation between questions and solutions, this study aims to explore whether question-texts can improve the recognition of handwritten mathematical expressions (HMEs) in respondents’ solutions. Based on the encoder–decoder framework, which is the mainstream method for HME recognition, we propose two models for fusing question-text signals and handwriting-vision signals at the encoder and decoder stages, respectively. The first, called encoder-fusion, adopts a static query to implement the interaction between two modalities at the encoder phase, and to better catch and interpret the interaction, a fusing method based on a dynamic query at the decoder stage, called decoder-attend is proposed. These two models were evaluated on a self-collected dataset comprising approximately 7k samples and achieved accuracies of 62.61% and 64.20%, respectively, at the expression level. The experimental results demonstrated that both models outperformed the baseline model, which utilized only visual information. The encoder fusion achieved results similar to those of other state-of-the-art methods.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"307 ","pages":"Article 112731"},"PeriodicalIF":7.2,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142698240","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Mutual information-driven self-supervised point cloud pre-training
IF 7.2 | Q1 (Computer Science)
Knowledge-Based Systems | Pub Date: 2024-11-20 | DOI: 10.1016/j.knosys.2024.112741
Weichen Xu , Tianhao Fu , Jian Cao , Xinyu Zhao , Xinxin Xu , Xixin Cao , Xing Zhang
{"title":"Mutual information-driven self-supervised point cloud pre-training","authors":"Weichen Xu ,&nbsp;Tianhao Fu ,&nbsp;Jian Cao ,&nbsp;Xinyu Zhao ,&nbsp;Xinxin Xu ,&nbsp;Xixin Cao ,&nbsp;Xing Zhang","doi":"10.1016/j.knosys.2024.112741","DOIUrl":"10.1016/j.knosys.2024.112741","url":null,"abstract":"<div><div>Learning universal representations from unlabeled 3D point clouds is essential to improve the generalization and safety of autonomous driving. Generative self-supervised point cloud pre-training with low-level features as pretext tasks is a mainstream paradigm. However, from the perspective of mutual information, this approach is constrained by spatial information and entangled representations. In this study, we propose a generalized generative self-supervised point cloud pre-training framework called GPICTURE. High-level features were used as an additional pretext task to enhance the understanding of semantic information. Considering the varying difficulties caused by the discrimination of voxel features, we designed inter-class and intra-class discrimination-guided masking (I<span><math><msup><mrow></mrow><mrow><mn>2</mn></mrow></msup></math></span>Mask) to set the masking ratio adaptively. Furthermore, to ensure a hierarchical and stable reconstruction process, centered kernel alignment-guided hierarchical reconstruction and differential-gated progressive learning were employed to control multiple reconstruction tasks. Complete theoretical analyses demonstrated that high-level features can enhance the mutual information between latent features and high-level features, as well as the input point cloud. On Waymo, nuScenes, and SemanticKITTI, we achieved a 75.55% mAP for 3D object detection, 79.7% mIoU for 3D semantic segmentation, and 18.8% mIoU for occupancy prediction. Specifically, with only 50% of the fine-tuning data required, the performance of GPICURE was close to that of training from scratch with 100% of the fine-tuning data. In addition, consistent visualization with downstream tasks and a 57% reduction in weight disparity demonstrated a better fine-tuning starting point. The project page is hosted at <span><span>https://gpicture-page.github.io/</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"307 ","pages":"Article 112741"},"PeriodicalIF":7.2,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142698128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Data augmentation based on large language models for radiological report classification
IF 7.2 | Q1 (Computer Science)
Knowledge-Based Systems | Pub Date: 2024-11-20 | DOI: 10.1016/j.knosys.2024.112745
Jaime Collado-Montañez, María-Teresa Martín-Valdivia, Eugenio Martínez-Cámara
{"title":"Data augmentation based on large language models for radiological report classification","authors":"Jaime Collado-Montañez,&nbsp;María-Teresa Martín-Valdivia,&nbsp;Eugenio Martínez-Cámara","doi":"10.1016/j.knosys.2024.112745","DOIUrl":"10.1016/j.knosys.2024.112745","url":null,"abstract":"<div><div>The International Classification of Diseases (ICD) is fundamental in the field of healthcare as it provides a standardized framework for the classification and coding of medical diagnoses and procedures, enabling the understanding of international public health patterns and trends. However, manually classifying medical reports according to this standard is a slow, tedious and error-prone process, which shows the need for automated systems to offload the healthcare professional of this task and to reduce the number of errors. In this paper, we propose an automated classification system based on Natural Language Processing to analyze radiological reports and classify them according to the ICD-10. Since the specialized use of the language of radiological reports and the usual unbalanced distribution of medical report sets, we propose a methodology grounded in leveraging large language models for augmenting the data of unrepresented classes and adapting the classification language models to the specific use of the language of radiological reports. The results show that the proposed methodology enhances the classification performance on the CARES corpus of radiological reports.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"308 ","pages":"Article 112745"},"PeriodicalIF":7.2,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142705077","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Individualized image steganography method with Dynamic Separable Key and Adaptive Redundancy Anchor
IF 7.2 | Q1 (Computer Science)
Knowledge-Based Systems | Pub Date: 2024-11-20 | DOI: 10.1016/j.knosys.2024.112729
Junchao Zhou, Yao Lu, Guangming Lu
{"title":"Individualized image steganography method with Dynamic Separable Key and Adaptive Redundancy Anchor","authors":"Junchao Zhou,&nbsp;Yao Lu,&nbsp;Guangming Lu","doi":"10.1016/j.knosys.2024.112729","DOIUrl":"10.1016/j.knosys.2024.112729","url":null,"abstract":"<div><div>Image steganography hides several secret images into a single cover image to produce a stego image. For transmission security, the stego image is visually indistinguishable from the cover image. Furthermore, for effective transmission of secret information, the receivers should recover the secret images with high quality. With the increasing steganography capacity, a stego image containing many secret images is transmitted through public channels. However, in the existing image steganography methods, all the secret images are usually revealed without quarantine among various recipients. This problem casts a threat to security in the recovery process. In order to overcome this issue, we propose the Individualized Image Steganography (<strong>IIS</strong>) Method with Dynamic Separable Key (DSK) and Adaptive Redundancy Anchor (ARA). Specifically, in the process of hiding secret images, the proposed DSK dynamically generates a global key and a local key and appropriately fuses them together. In the same batch of transmission, all recipients share the same global key, but each has a different local key. Only by matching both the global key and the local key simultaneously, can the secret image be restored by the specific receiver, which makes the secret image individualized for the target recipient. Additionally, in the process of revealing secret images, the proposed ARA learns the adaptive redundancy anchor for the inverse training to drive the input redundancy of revealing (backward) process and output redundancy of hiding (forward) process to be close. This achieves a better trade-off between the performances of hiding and revealing processes, and further enhances both the quality of restored secret images and stego images. Jointly using the DSK and ARA, a series of experiments have verified that our <strong>IIS</strong> method has achieved satisfactory performance improvements in extensive aspects. Code is available in <span><span>https://github.com/Revive624/Individualized-Invertible-Steganography</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"309 ","pages":"Article 112729"},"PeriodicalIF":7.2,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142748703","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A transformer based visual tracker with restricted token interaction and knowledge distillation
IF 7.2 | Q1 (Computer Science)
Knowledge-Based Systems | Pub Date: 2024-11-20 | DOI: 10.1016/j.knosys.2024.112736
Nian Liu, Yi Zhang
{"title":"A transformer based visual tracker with restricted token interaction and knowledge distillation","authors":"Nian Liu,&nbsp;Yi Zhang","doi":"10.1016/j.knosys.2024.112736","DOIUrl":"10.1016/j.knosys.2024.112736","url":null,"abstract":"<div><div>Recently, one-stream pipelines have made significant progress in visual object tracking (VOT), where the template and search images interact in early stages. However, one-stream pipelines have a potential problem: They treat the object and the background equally (or other irrelevant parts), leading to weak discriminability of the extracted features. To remedy this issue, a restricted token interaction module based on asymmetric attention mechanism is proposed in this paper, which divides the search image into valuable part and other part. Only the valuable part is selected for cross-attention with the template so as to better distinguish the object from the background, which finally improves the localization accuracy and robustness. In addition, to avoid heavy computational overhead, we utilize logit distillation and localization distillation methods to optimize the outputs of the classification and regression heads respectively. At the same time, we separate the distillation regions and apply different knowledge distillation methods in different regions to effectively determine which regions are most beneficial for classification or localization learning. Extensive experiments have been conducted on mainstream datasets in which our tracker (dubbed RIDTrack) has achieved appealing results while meeting the real-time requirement.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"307 ","pages":"Article 112736"},"PeriodicalIF":7.2,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142698125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Multi-source partial domain adaptation with Gaussian-based dual-level weighting for PPG-based heart rate estimation
IF 7.2 | Q1 (Computer Science)
Knowledge-Based Systems | Pub Date: 2024-11-20 | DOI: 10.1016/j.knosys.2024.112769
Jihyun Kim , Hansam Cho , Minjung Lee , Seoung Bum Kim
{"title":"Multi-source partial domain adaptation with Gaussian-based dual-level weighting for PPG-based heart rate estimation","authors":"Jihyun Kim ,&nbsp;Hansam Cho ,&nbsp;Minjung Lee ,&nbsp;Seoung Bum Kim","doi":"10.1016/j.knosys.2024.112769","DOIUrl":"10.1016/j.knosys.2024.112769","url":null,"abstract":"<div><div>Photoplethysmography (PPG) signals from wearable devices have expanded the accessibility of heart rate estimation. Recent advances in deep learning have significantly improved the generalizability of heart rate estimation from PPG signals. However, these models exhibit performance degradation when used for new subjects with different PPG distributions. Although previous studies have attempted subject-specific training and fine-tuning techniques, they require labeled data for each new subject, limiting their practicality. In response, we explore the application of domain adaptation techniques using only unlabeled PPG signals from the target subject. However, naive domain adaptation approaches do not adequately account for the variability in PPG signals among different subjects in the training dataset. Furthermore, they overlook the possibility that the heart rate range of the target subject may only partially overlap with that of the source subjects. To address these limitations, we propose a novel multi-source partial domain adaptation method, GAussian-based dUaL-level weighting (GAUL), designed for the PPG-based heart rate estimation, formulated as a regression task. GAUL considers and adjusts the contribution of relevant source data at the domain and sample levels during domain adaptation. The experimental results on three benchmark datasets demonstrate that our method outperforms existing domain adaptation approaches, enhancing the heart rate estimation accuracy for new subjects without requiring additional labeled data. The code is available at: <span><span>https://github.com/Im-JihyunKim/GAUL</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"309 ","pages":"Article 112769"},"PeriodicalIF":7.2,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142748611","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Affective body expression recognition framework based on temporal and spatial fusion features
IF 7.2 | Q1 (Computer Science)
Knowledge-Based Systems | Pub Date: 2024-11-20 | DOI: 10.1016/j.knosys.2024.112744
Tao Wang , Shuang Liu , Feng He , Minghao Du , Weina Dai , Yufeng Ke , Dong Ming
{"title":"Affective body expression recognition framework based on temporal and spatial fusion features","authors":"Tao Wang ,&nbsp;Shuang Liu ,&nbsp;Feng He ,&nbsp;Minghao Du ,&nbsp;Weina Dai ,&nbsp;Yufeng Ke ,&nbsp;Dong Ming","doi":"10.1016/j.knosys.2024.112744","DOIUrl":"10.1016/j.knosys.2024.112744","url":null,"abstract":"<div><div>Affective body expression recognition technology enables machines to interpret non-verbal emotional signals from human movements, which is crucial for facilitating natural and empathetic human–machine interaction (HCI). This work proposes a new framework for emotion recognition from body movements, providing a universal and effective solution for decoding the temporal–spatial mapping between emotions and body expressions. Compared with previous studies, our approach extracted interpretable temporal and spatial features by constructing a body expression energy model (BEEM) and a multi-input symmetric positive definite matrix network (MSPDnet). In particular, the temporal features extracted from the BEEM reveal the energy distribution, dynamical complexity, and frequency activity of the body expression under different emotions, while the spatial features obtained by MSPDnet capture the spatial Riemannian properties between body joints. Furthermore, this paper introduces an attentional temporal–spatial feature fusion (ATSFF) algorithm to adaptively fuse temporal and spatial features with different semantics and scales, significantly improving the discriminability and generalizability of the fused features. The proposed method achieves recognition accuracies over 90% across four public datasets, outperforming most state-of-the-art approaches.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"308 ","pages":"Article 112744"},"PeriodicalIF":7.2,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142720588","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
DTDA: Dual-channel Triple-to-quintuple Data Augmentation for Comparative Opinion Quintuple Extraction
IF 7.2 | Q1 (Computer Science)
Knowledge-Based Systems | Pub Date: 2024-11-20 | DOI: 10.1016/j.knosys.2024.112734
Qingting Xu , Kaisong Song , Yangyang Kang , Chaoqun Liu , Yu Hong , Guodong Zhou
{"title":"DTDA: Dual-channel Triple-to-quintuple Data Augmentation for Comparative Opinion Quintuple Extraction","authors":"Qingting Xu ,&nbsp;Kaisong Song ,&nbsp;Yangyang Kang ,&nbsp;Chaoqun Liu ,&nbsp;Yu Hong ,&nbsp;Guodong Zhou","doi":"10.1016/j.knosys.2024.112734","DOIUrl":"10.1016/j.knosys.2024.112734","url":null,"abstract":"<div><div>Comparative Opinion Quintuple Extraction (COQE) is an essential task in sentiment analysis that entails the extraction of quintuples from comparative sentences. Each quintuple comprises a subject, an object, a shared aspect for comparison, a comparative opinion and a distinct preference. The prevalent reliance on extensively annotated datasets inherently constrains the efficiency of training. Manual data labeling is both time-consuming and labor-intensive, especially labeling quintuple data. Herein, we propose a <strong>D</strong>ual-channel <strong>T</strong>riple-to-quintuple <strong>D</strong>ata <strong>A</strong>ugmentation (<strong>DTDA</strong>) approach for the COQE task. In particular, we leverage ChatGPT to generate domain-specific triple data. Subsequently, we utilize these generated data and existing Aspect Sentiment Triplet Extraction (ASTE) data for separate preliminary fine-tuning. On this basis, we employ the two fine-tuned triple models for warm-up and construct a dual-channel quintuple model using the unabridged quintuples. We evaluate our approach on three benchmark datasets: Camera-COQE, Car-COQE and Ele-COQE. Our approach exhibits substantial improvements versus pipeline-based, joint, and T5-based baselines. Notably, the DTDA method significantly outperforms the best pipeline method, with exact match <span><math><mi>F</mi></math></span>1-score increasing by 10.32%, 8.97%, and 10.65% on Camera-COQE, Car-COQE and Ele-COQE, respectively. More importantly, our data augmentation method can adapt to any baselines. When integrated with the current SOTA UniCOQE method, it further improves performance by 0.34%, 1.65%, and 2.22%, respectively. We will make all related models and source code publicly available upon acceptance.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"307 ","pages":"Article 112734"},"PeriodicalIF":7.2,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142698239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Enhancing person re-identification via Uncertainty Feature Fusion Method and Auto-weighted Measure Combination
IF 7.2 | Q1 (Computer Science)
Knowledge-Based Systems | Pub Date: 2024-11-20 | DOI: 10.1016/j.knosys.2024.112737
Quang-Huy Che, Le-Chuong Nguyen, Duc-Tuan Luu, Vinh-Tiep Nguyen
{"title":"Enhancing person re-identification via Uncertainty Feature Fusion Method and Auto-weighted Measure Combination","authors":"Quang-Huy Che,&nbsp;Le-Chuong Nguyen,&nbsp;Duc-Tuan Luu,&nbsp;Vinh-Tiep Nguyen","doi":"10.1016/j.knosys.2024.112737","DOIUrl":"10.1016/j.knosys.2024.112737","url":null,"abstract":"<div><div>Person re-identification (Re-ID) is a challenging task that involves identifying the same person across different camera views in surveillance systems. Current methods usually rely on features from single-camera views, which can be limiting when dealing with multiple cameras and challenges such as changing viewpoints and occlusions. In this paper, a new approach is introduced that enhances the capability of ReID models through the Uncertain Feature Fusion Method (UFFM) and Auto-weighted Measure Combination (AMC). UFFM generates multi-view features using features extracted independently from multiple images to mitigate view bias. However, relying only on similarity based on multi-view features is limited because these features ignore the details represented in single-view features. Therefore, we propose the AMC method to generate a more robust similarity measure by combining various measures. Our method significantly improves Rank@1 (Rank-1 accuracy) and Mean Average Precision (mAP) when evaluated on person re-identification datasets. Combined with the BoT Baseline on challenging datasets, we achieve impressive results, with a 7.9% improvement in Rank@1 and a 12.1% improvement in mAP on the MSMT17 dataset. On the Occluded-DukeMTMC dataset, our method increases Rank@1 by 22.0% and mAP by 18.4%.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"307 ","pages":"Article 112737"},"PeriodicalIF":7.2,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142699128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0