Latest articles from IET Computer Vision

A survey on person and vehicle re-identification
IF 1.5 | Zone 4 (Computer Science)
IET Computer Vision | Pub Date: 2024-10-28 | DOI: 10.1049/cvi2.12316
Zhaofa Wang, Liyang Wang, Zhiping Shi, Miaomiao Zhang, Qichuan Geng, Na Jiang
Abstract: Person/vehicle re-identification (Re-ID) uses cross-camera retrieval to associate the same person (or the same vehicle) across surveillance images captured at different locations, at different times and by different cameras, enabling cross-surveillance image matching, person retrieval and trajectory tracking. It plays an important role in intelligent security, criminal investigation and related fields. In recent years, the rapid development of deep learning has significantly advanced Re-ID technology, and an increasing number of methods have emerged to improve Re-ID performance. This paper summarises four popular research areas in the current Re-ID field: multi-task learning, generalisation learning, cross-modality learning and optimisation learning. For each area, the paper analyses the main challenges and elaborates on the deep learning frameworks and networks that address them. A comparative analysis of Re-ID tasks from several classification perspectives is provided, introducing mainstream research directions and current achievements. Finally, insights into future development trends are presented.
Volume 18, Issue 8, pp. 1235-1268 | Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cvi2.12316
Citations: 0
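The survey centres on the retrieval formulation of Re-ID: a query embedding is matched against a cross-camera gallery of embeddings. As a minimal, hypothetical sketch of that matching step (the embedding dimensions and data below are illustrative stand-ins, not taken from the paper):

```python
import numpy as np

def rank_gallery(query_feat, gallery_feats):
    """Rank gallery entries by cosine similarity to a query embedding.

    query_feat: (D,) feature of the probe image.
    gallery_feats: (N, D) features of cross-camera gallery images.
    Returns gallery indices, most similar first.
    """
    q = query_feat / (np.linalg.norm(query_feat) + 1e-12)
    g = gallery_feats / (np.linalg.norm(gallery_feats, axis=1, keepdims=True) + 1e-12)
    sims = g @ q                      # (N,) cosine similarities
    return np.argsort(-sims)          # descending similarity

# Toy usage: 5 gallery embeddings of dimension 8.
rng = np.random.default_rng(0)
query = rng.normal(size=8)
gallery = rng.normal(size=(5, 8))
print(rank_gallery(query, gallery))
```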
Occluded object 6D pose estimation using foreground probability compensation
IF 1.5 | Zone 4 (Computer Science)
IET Computer Vision | Pub Date: 2024-10-17 | DOI: 10.1049/cvi2.12314
Meihui Ren, Junying Jia, Xin Lu
Abstract: 6D object pose estimation usually refers to acquiring the 6D pose of 3D objects in the sensor coordinate system using computer vision techniques. The task faces numerous challenges owing to the complexity of natural scenes; one of the most significant is occlusion, which is unavoidable in 3D scenes and a major obstacle in real-world applications. To tackle this issue, the authors propose a novel 6D pose estimation algorithm based on RGB-D images, aiming for enhanced robustness in occluded environments. The approach follows the basic architecture of keypoint-based pose estimation algorithms. To better leverage the complementary information in RGB-D data, a novel foreground probability-guided sampling strategy is introduced at the network's input stage; it mitigates the sampling-ratio imbalance between foreground and background points caused by the smaller foreground objects seen in occluded environments. Moreover, considering the impact of occlusion on semantic segmentation networks, a new object segmentation module is introduced, which uses traditional image processing techniques to compensate for severe semantic segmentation errors made by deep learning networks. The algorithm is evaluated on the public Occlusion LineMOD dataset. Experimental results demonstrate that the method is more robust in occluded environments than existing state-of-the-art algorithms and maintains stable performance even in scenarios with little or no occlusion.
Volume 18, Issue 8, pp. 1325-1337 | Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cvi2.12314
Citations: 0
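The sampling strategy described above can be pictured as drawing scene points with probability proportional to an estimated foreground score, so that a small occluded object still contributes enough points. A minimal illustrative sketch, not the authors' implementation (the foreground scores here are synthetic):

```python
import numpy as np

def foreground_guided_sample(points, fg_prob, n_samples, rng=None):
    """Sample point indices with probability proportional to foreground score.

    points: (N, 3) scene points from the depth map.
    fg_prob: (N,) estimated probability that each point belongs to the object.
    Returns indices of the sampled points.
    """
    rng = rng or np.random.default_rng()
    weights = fg_prob / fg_prob.sum()
    return rng.choice(len(points), size=n_samples, replace=False, p=weights)

# Toy scene: 1000 points, only ~5% belong to the (occluded) foreground object.
rng = np.random.default_rng(0)
pts = rng.normal(size=(1000, 3))
fg = np.where(rng.random(1000) < 0.05, 0.9, 0.05)   # synthetic foreground scores
idx = foreground_guided_sample(pts, fg, n_samples=64, rng=rng)
print("foreground fraction in sample:", (fg[idx] > 0.5).mean())
```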
Real-time semantic segmentation network for crops and weeds based on multi-branch structure
IF 1.5 | Zone 4 (Computer Science)
IET Computer Vision | Pub Date: 2024-10-01 | DOI: 10.1049/cvi2.12311
Yufan Liu, Muhua Liu, Xuhui Zhao, Junlong Zhu, Lin Wang, Hao Ma, Mingchuan Zhang
Abstract: Weed recognition is an unavoidable problem in smart agriculture; efficient recognition must cope with complex backgrounds, insufficient feature information, varying target sizes, and overlapping crops and weeds. To address these problems, the authors propose a real-time semantic segmentation network based on a multi-branch structure for recognising crops and weeds. First, a new backbone network is constructed to capture feature information for crops and weeds of different sizes. Second, a weight refinement fusion (WRF) module is proposed to enhance the feature extraction ability for crops and weeds and reduce interference from the complex background. Finally, a Semantic Guided Fusion module is devised to enhance the interaction of information between crops and weeds and reduce interference caused by overlapping targets. Experimental results demonstrate that the proposed network balances speed and accuracy, achieving a Mean IoU (MIoU) of 0.713, 0.802, 0.746 and 0.906 on the sugar beet (BoniRob) dataset, the synthetic BoniRob dataset, the CWFID dataset and a self-labelled wheat dataset, respectively.
Volume 18, Issue 8, pp. 1313-1324 | Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cvi2.12311
Citations: 0
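The MIoU figures quoted above follow the standard per-class intersection-over-union averaged across classes. A small sketch of that metric, assuming integer label maps with classes {background, crop, weed} (assumed layout, not tied to the paper's evaluation code):

```python
import numpy as np

def mean_iou(pred, target, num_classes=3):
    """Mean intersection-over-union over classes present in either label map."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:                       # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

# Toy 4x4 label maps (0 = background, 1 = crop, 2 = weed).
gt   = np.array([[0, 0, 1, 1], [0, 2, 1, 1], [2, 2, 0, 0], [2, 0, 0, 1]])
pred = np.array([[0, 0, 1, 1], [0, 2, 2, 1], [2, 2, 0, 0], [0, 0, 0, 1]])
print(round(mean_iou(pred, gt), 3))
```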
Leveraging modality-specific and shared features for RGB-T salient object detection
IF 1.5 | Zone 4 (Computer Science)
IET Computer Vision | Pub Date: 2024-09-25 | DOI: 10.1049/cvi2.12307
Shuo Wang, Gang Yang, Qiqi Xu, Xun Dai
Abstract: Most existing RGB-T salient object detection methods are based on a dual-stream encoding, single-stream decoding architecture. These models rely on the quality of the fused features, which tend to focus on modality-shared features while overlooking modality-specific ones, and thus fail to fully exploit the rich information in multi-modality data. To this end, a modality separate tri-stream net (MSTNet), consisting of a tri-stream encoding (TSE) structure and a tri-stream decoding (TSD) structure, is proposed. The TSE explicitly separates and extracts modality-shared and modality-specific features to improve the utilisation of multi-modality data. In addition, based on hybrid-attention and cross-attention mechanisms, an enhanced complementary fusion module (ECF) is designed that fully considers the complementarity between the features to be fused and realises high-quality feature fusion. Furthermore, in the TSD, the quality of uni-modality features is ensured under the constraint of supervision. Finally, to make full use of the rich multi-level and multi-scale decoding features in the TSD, the authors design an adaptive multi-scale decoding module and a multi-stream feature aggregation module to improve decoding capability. Extensive experiments on three public datasets show that MSTNet outperforms 14 state-of-the-art methods, demonstrating that it extracts and utilises multi-modality information more adequately and yields more complete and richer features, thereby improving performance. The code will be released at https://github.com/JOOOOKII/MSTNet.
Volume 18, Issue 8, pp. 1285-1299 | Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cvi2.12307
Citations: 0
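The cross-attention half of such a fusion module can be sketched with a standard attention layer in which one modality supplies the queries and the other the keys and values. The module below is a generic, hypothetical illustration of that idea, not the paper's ECF module:

```python
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    """Fuse RGB and thermal feature tokens with cross-attention (illustrative only)."""

    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, rgb_tokens, thermal_tokens):
        # RGB tokens query the thermal tokens; the result refines the RGB stream.
        fused, _ = self.attn(rgb_tokens, thermal_tokens, thermal_tokens)
        return self.norm(rgb_tokens + fused)

# Toy usage: batch of 2, 196 tokens (a 14x14 feature map), 64 channels.
rgb = torch.randn(2, 196, 64)
thermal = torch.randn(2, 196, 64)
print(CrossModalAttention()(rgb, thermal).shape)   # torch.Size([2, 196, 64])
```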
SPANet: Spatial perceptual activation network for camouflaged object detection
IF 1.5 | Zone 4 (Computer Science)
IET Computer Vision | Pub Date: 2024-09-18 | DOI: 10.1049/cvi2.12310
Jianhao Zhang, Gang Yang, Xun Dai, Pengyu Yang
Abstract: Camouflaged object detection (COD) aims to segment objects embedded in the environment from the background. Most existing methods are easily affected by background interference in cluttered environments and cannot accurately locate camouflaged areas, resulting in over-segmentation or incomplete segmentation structures. To improve COD performance, the authors propose a spatial perceptual activation network (SPANet). SPANet extracts the spatial positional relationships between objects in the scene by activating spatial perception and uses them as global information to guide segmentation. It consists of three modules: a perceptual activation module (PAM), a feature inference module (FIM) and an interaction recovery module (IRM). Specifically, the PAM models the positional relationship between the camouflaged object and its surrounding environment to obtain semantic correlation information. The FIM then effectively combines this correlation information to suppress background interference and re-encodes it to generate multi-scale features. In addition, to further fuse the multi-scale features, the IRM mines the complementary information and differences between features at different scales. Extensive experimental results on four widely used benchmark datasets (CAMO, CHAMELEON, COD10K and NC4K) show that the method outperforms 13 state-of-the-art methods.
Volume 18, Issue 8, pp. 1300-1312 | Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cvi2.12310
Citations: 0
SRL-ProtoNet: Self-supervised representation learning for few-shot remote sensing scene classification
IF 1.5 | Zone 4 (Computer Science)
IET Computer Vision | Pub Date: 2024-09-02 | DOI: 10.1049/cvi2.12304
Bing Liu, Hongwei Zhao, Jiao Li, Yansheng Gao, Jianrong Zhang
Abstract: Deep learning methods classify large amounts of labelled remote sensing scene data with good performance, but they struggle to generalise to classification tasks with limited data. Few-shot learning allows neural networks to classify unseen categories from only a handful of labelled samples. Currently, episodic tasks based on meta-learning can effectively perform few-shot classification, and training an encoder capable of representation learning has become an important component of few-shot learning. An end-to-end few-shot remote sensing scene classification model based on ProtoNet and self-supervised learning is proposed. The authors design a Pre-prototype for a more discrete feature space and better integration with self-supervised learning, and propose a ProtoMixer that produces higher-quality prototypes with a global receptive field. The method outperforms existing state-of-the-art self-supervised methods on three widely used benchmark datasets: UC-Merced, NWPU-RESISC45 and AID. Compared with the previous state of the art, it improves one-shot accuracy by 1.21%, 2.36% and 0.84% on AID, UC-Merced and NWPU-RESISC45, respectively, and five-shot accuracy by 0.85%, 2.79% and 0.74% on the same datasets.
Volume 18, Issue 7, pp. 1034-1042 | Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cvi2.12304
Citations: 0
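ProtoNet-style episodes classify a query embedding by its distance to class prototypes, each prototype being the mean of that class's support embeddings. A minimal sketch of that core step (random embeddings stand in for the encoder output; this is the standard ProtoNet mechanism, not the paper's Pre-prototype or ProtoMixer):

```python
import torch

def prototype_logits(support, support_labels, query, n_classes):
    """Negative squared Euclidean distance from each query to each class prototype."""
    protos = torch.stack([support[support_labels == c].mean(dim=0)
                          for c in range(n_classes)])          # (C, D)
    dists = torch.cdist(query, protos) ** 2                     # (Q, C)
    return -dists                                               # higher = closer

# Toy 3-way 5-shot episode with 64-d embeddings.
torch.manual_seed(0)
support = torch.randn(15, 64)
labels = torch.arange(3).repeat_interleave(5)
query = torch.randn(6, 64)
pred = prototype_logits(support, labels, query, n_classes=3).argmax(dim=1)
print(pred)
```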
Improving unsupervised pedestrian re-identification with enhanced feature representation and robust clustering
IF 1.5 | Zone 4 (Computer Science)
IET Computer Vision | Pub Date: 2024-08-26 | DOI: 10.1049/cvi2.12309
Jiang Luo, Lingjun Liu
Abstract: Pedestrian re-identification (re-ID) is an important research direction in computer vision, with extensive applications in pattern recognition and monitoring systems. Owing to uneven data distributions and unresolved questions of clustering criteria and similarity evaluation, the performance of unsupervised methods is limited. To address these issues, an improved unsupervised re-ID method, Enhanced Feature Representation and Robust Clustering (EFRRC), is proposed. First, a relation network that considers the relations between each part of the pedestrian's body and the other parts is introduced, yielding more discriminative feature representations: features at the single-part level also carry partial information about other body parts. A global contrastive pooling (GCP) module is introduced to obtain global image features. Second, a dispersion-based clustering method is designed that can effectively evaluate clustering quality and discover latent patterns in the data; it considers a wider context of sample-level pairwise relationships for robust cluster-affinity assessment and effectively addresses the challenges posed by imbalanced data distributions in complex situations. These components are connected through a clustering contrastive learning framework, which not only improves the discriminative power of the features and the accuracy of clustering, but also resolves the problem of inconsistent clustering updates. Experimental results on three public datasets demonstrate the superiority of the method over existing unsupervised re-ID approaches.
Volume 18, Issue 8, pp. 1097-1111 | Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cvi2.12309
Citations: 0
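Clustering contrastive frameworks of this kind are commonly implemented by contrasting each instance embedding against cluster centroids kept in a memory bank, using pseudo-labels from the clustering step. The sketch below shows that generic idea under assumed shapes; it is not the authors' EFRRC code:

```python
import torch
import torch.nn.functional as F

def cluster_contrastive_loss(features, cluster_ids, centroids, temperature=0.05):
    """InfoNCE-style loss pulling each embedding toward its assigned cluster centroid.

    features: (B, D) L2-normalised instance embeddings.
    cluster_ids: (B,) pseudo-label assigned to each instance by clustering.
    centroids: (K, D) L2-normalised cluster centroids (memory bank).
    """
    logits = features @ centroids.t() / temperature     # (B, K) similarities
    return F.cross_entropy(logits, cluster_ids)

# Toy usage: 8 instances, 4 clusters, 32-d embeddings.
torch.manual_seed(0)
feats = F.normalize(torch.randn(8, 32), dim=1)
cents = F.normalize(torch.randn(4, 32), dim=1)
ids = torch.randint(0, 4, (8,))
print(cluster_contrastive_loss(feats, ids, cents).item())
```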
Enhancing semi-supervised contrastive learning through saliency map for diabetic retinopathy grading
IF 1.5 | Zone 4 (Computer Science)
IET Computer Vision | Pub Date: 2024-08-26 | DOI: 10.1049/cvi2.12308
Jiacheng Zhang, Rong Jin, Wenqiang Liu
Abstract: Diabetic retinopathy (DR) is a severe ophthalmic condition that can lead to blindness if not diagnosed and treated in time, so efficient automated DR grading systems are crucial for early screening and treatment. Although progress has been made in DR detection with deep learning, these methods still struggle with the complexity of DR lesion characteristics and the nuances of the grading criteria, and their performance is hampered by the scarcity of large-scale, high-quality annotated data. An innovative semi-supervised DR grading framework for fundus images is proposed, which employs a saliency estimation map to strengthen the model's perception of fundus structures and thereby improve the separation of lesions from healthy regions. By integrating semi-supervised and contrastive learning, the model's ability to recognise inter-class and intra-class variations in DR grading is enhanced, allowing precise discrimination of various lesion features. Experiments on publicly available DR grading datasets, such as EyePACS and Messidor, validate the effectiveness of the proposed method: it outperforms the state of the art on the kappa metric by 0.8% on the full EyePACS dataset and by 3.2% on a 10% subset of EyePACS. The authors' code is publicly available at https://github.com/500ZhangJC/SCL-SEM-framework-for-DR-Grading.
Volume 18, Issue 8, pp. 1127-1137 | Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cvi2.12308
Citations: 0
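The contrastive component of semi-supervised pipelines like this is typically an NT-Xent loss between two augmented views of the same image, with the saliency map in this work additionally guiding attention to lesion regions. A generic sketch of the contrastive core only (assumed shapes, not the authors' released code):

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.5):
    """NT-Xent loss between two batches of projections of the same images (two views)."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)          # (2B, D)
    sim = z @ z.t() / temperature                                # (2B, 2B)
    sim.fill_diagonal_(float("-inf"))                            # exclude self-pairs
    n = z1.size(0)
    # The positive for row i is the other view of the same image.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

# Toy usage: batch of 4 fundus images, 128-d projections of two augmented views.
torch.manual_seed(0)
view1, view2 = torch.randn(4, 128), torch.randn(4, 128)
print(nt_xent(view1, view2).item())
```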
Balanced parametric body prior for implicit clothed human reconstruction from a monocular RGB
IF 1.5 | Zone 4 (Computer Science)
IET Computer Vision | Pub Date: 2024-08-25 | DOI: 10.1049/cvi2.12306
Rong Xue, Jiefeng Li, Cewu Lu
Abstract: The authors study the problem of reconstructing detailed 3D human surfaces in various poses and clothing from images. A parametric human body prior enables accurate 3D clothed human reconstruction, but the offset of large, loose clothing from the inferred parametric body mesh limits the generalisation of existing parametric-body-based methods. A method that generalises well to both unseen poses and unseen clothing is proposed. The authors first identify the unbalanced nature of existing implicit-function-based methods. To address this issue, they synthesise balanced training samples with a new dependency coefficient during training; the coefficient tells the network whether the prior from the parametric body model is reliable. They then design a novel positional embedding-based attenuation strategy to incorporate the dependency coefficient into the implicit function (IF) network. Comprehensive experiments on the CAPE dataset study the effectiveness of the approach. The proposed method significantly surpasses state-of-the-art approaches and generalises well to unseen poses and clothing; as an illustrative example, it improves the Chamfer Distance Error and the Normal Error by 38.2% and 57.6%, respectively.
Volume 18, Issue 7, pp. 1057-1067 | Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cvi2.12306
Citations: 0
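Implicit-function networks of this kind usually feed 3D query points through a sinusoidal positional embedding before the occupancy decoder; the paper's attenuation strategy builds on such an embedding. A sketch of the standard embedding only, under assumed frequency counts (the dependency-coefficient attenuation itself is not reproduced here):

```python
import torch

def positional_encoding(x, num_freqs=6):
    """Map 3D query points to sinusoidal features, as commonly used in IF networks.

    x: (N, 3) query point coordinates.
    Returns (N, 3 + 3 * 2 * num_freqs) encoded coordinates.
    """
    freqs = 2.0 ** torch.arange(num_freqs) * torch.pi            # (F,)
    scaled = x.unsqueeze(-1) * freqs                              # (N, 3, F)
    enc = torch.cat([torch.sin(scaled), torch.cos(scaled)], dim=-1)
    return torch.cat([x, enc.flatten(start_dim=1)], dim=1)

# Toy usage: 5 query points near the body surface, coordinates in [-1, 1].
pts = torch.rand(5, 3) * 2 - 1
print(positional_encoding(pts).shape)   # torch.Size([5, 39])
```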
Social-ATPGNN: Prediction of multi-modal pedestrian trajectory of non-homogeneous social interaction
IF 1.5 | Zone 4 (Computer Science)
IET Computer Vision | Pub Date: 2024-08-21 | DOI: 10.1049/cvi2.12286
Kehao Wang, Han Zou
Abstract: With the development of autonomous driving and path planning technology, predicting the trajectories of pedestrians in dynamic scenes has become a key and urgent technical problem. However, most existing techniques treat all pedestrians in the scene as having an equally important influence on the predicted pedestrian's trajectory, and methods that use sequence-based time-series generative models to obtain the predicted trajectories do not allow parallel computation, which introduces significant computational overhead. A new social trajectory prediction network, Social-ATPGNN, which integrates both temporal and spatial information on the basis of ATPGNN, is proposed. In the spatial domain, the pedestrians in the predicted scene form an undirected, not fully connected graph, which avoids homogenising pedestrian relationships; the spatial interactions between pedestrians are then encoded to model pedestrian social awareness more accurately. After acquiring high-level spatial features, the method uses a Temporal Convolutional Network, which supports parallel computation, to capture correlations in the time series of pedestrian trajectories. Extensive experiments show that the proposed model outperforms the latest models on various pedestrian trajectory datasets.
Volume 18, Issue 7, pp. 907-921 | Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cvi2.12286
Citations: 0
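The Temporal Convolutional Network mentioned above gains its parallelism from causal, dilated 1-D convolutions applied over the whole observed trajectory at once, rather than stepping through time like an RNN. A minimal sketch of one such causal block, with illustrative channel and sequence sizes (not the paper's architecture):

```python
import torch
import torch.nn as nn

class CausalTemporalBlock(nn.Module):
    """One dilated causal Conv1d block over a pedestrian trajectory sequence."""

    def __init__(self, channels=64, kernel_size=3, dilation=2):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation        # left padding keeps causality
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)
        self.act = nn.ReLU()

    def forward(self, x):                              # x: (batch, channels, time)
        out = self.conv(nn.functional.pad(x, (self.pad, 0)))
        return self.act(out) + x                       # residual connection

# Toy usage: 8 trajectories, 64 feature channels, 12 observed time steps.
x = torch.randn(8, 64, 12)
print(CausalTemporalBlock()(x).shape)                  # torch.Size([8, 64, 12])
```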