改进目标跟踪的多域卷积神经网络自注意机制

Int. J. Intell. Comput. Cybern. Pub Date : 2021-08-30 DOI:10.1108/ijicc-04-2021-0067

Jinchao Huang

{"title":"改进目标跟踪的多域卷积神经网络自注意机制","authors":"Jinchao Huang","doi":"10.1108/ijicc-04-2021-0067","DOIUrl":null,"url":null,"abstract":"PurposeMulti-domain convolutional neural network (MDCNN) model has been widely used in object recognition and tracking in the field of computer vision. However, if the objects to be tracked move rapid or the appearances of moving objects vary dramatically, the conventional MDCNN model will suffer from the model drift problem. To solve such problem in tracking rapid objects under limiting environment for MDCNN model, this paper proposed an auto-attentional mechanism-based MDCNN (AA-MDCNN) model for the rapid moving and changing objects tracking under limiting environment.Design/methodology/approachFirst, to distinguish the foreground object between background and other similar objects, the auto-attentional mechanism is used to selectively aggregate the weighted summation of all feature maps to make the similar features related to each other. Then, the bidirectional gated recurrent unit (Bi-GRU) architecture is used to integrate all the feature maps to selectively emphasize the importance of the correlated feature maps. Finally, the final feature map is obtained by fusion the above two feature maps for object tracking. In addition, a composite loss function is constructed to solve the similar but different attribute sequences tracking using conventional MDCNN model.FindingsIn order to validate the effectiveness and feasibility of the proposed AA-MDCNN model, this paper used ImageNet-Vid dataset to train the object tracking model, and the OTB-50 dataset is used to validate the AA-MDCNN tracking model. Experimental results have shown that the augmentation of auto-attentional mechanism will improve the accuracy rate 2.75% and success rate 2.41%, respectively. In addition, the authors also selected six complex tracking scenarios in OTB-50 dataset; over eleven attributes have been validated that the proposed AA-MDCNN model outperformed than the comparative models over nine attributes. In addition, except for the scenario of multi-objects moving with each other, the proposed AA-MDCNN model solved the majority rapid moving objects tracking scenarios and outperformed than the comparative models on such complex scenarios.Originality/valueThis paper introduced the auto-attentional mechanism into MDCNN model and adopted Bi-GRU architecture to extract key features. By using the proposed AA-MDCNN model, rapid object tracking under complex background, motion blur and occlusion objects has better effect, and such model is expected to be further applied to the rapid object tracking in the real world.","PeriodicalId":352072,"journal":{"name":"Int. J. Intell. Comput. Cybern.","volume":"75 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Auto-attentional mechanism in multi-domain convolutional neural networks for improving object tracking\",\"authors\":\"Jinchao Huang\",\"doi\":\"10.1108/ijicc-04-2021-0067\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"PurposeMulti-domain convolutional neural network (MDCNN) model has been widely used in object recognition and tracking in the field of computer vision. However, if the objects to be tracked move rapid or the appearances of moving objects vary dramatically, the conventional MDCNN model will suffer from the model drift problem. To solve such problem in tracking rapid objects under limiting environment for MDCNN model, this paper proposed an auto-attentional mechanism-based MDCNN (AA-MDCNN) model for the rapid moving and changing objects tracking under limiting environment.Design/methodology/approachFirst, to distinguish the foreground object between background and other similar objects, the auto-attentional mechanism is used to selectively aggregate the weighted summation of all feature maps to make the similar features related to each other. Then, the bidirectional gated recurrent unit (Bi-GRU) architecture is used to integrate all the feature maps to selectively emphasize the importance of the correlated feature maps. Finally, the final feature map is obtained by fusion the above two feature maps for object tracking. In addition, a composite loss function is constructed to solve the similar but different attribute sequences tracking using conventional MDCNN model.FindingsIn order to validate the effectiveness and feasibility of the proposed AA-MDCNN model, this paper used ImageNet-Vid dataset to train the object tracking model, and the OTB-50 dataset is used to validate the AA-MDCNN tracking model. Experimental results have shown that the augmentation of auto-attentional mechanism will improve the accuracy rate 2.75% and success rate 2.41%, respectively. In addition, the authors also selected six complex tracking scenarios in OTB-50 dataset; over eleven attributes have been validated that the proposed AA-MDCNN model outperformed than the comparative models over nine attributes. In addition, except for the scenario of multi-objects moving with each other, the proposed AA-MDCNN model solved the majority rapid moving objects tracking scenarios and outperformed than the comparative models on such complex scenarios.Originality/valueThis paper introduced the auto-attentional mechanism into MDCNN model and adopted Bi-GRU architecture to extract key features. By using the proposed AA-MDCNN model, rapid object tracking under complex background, motion blur and occlusion objects has better effect, and such model is expected to be further applied to the rapid object tracking in the real world.\",\"PeriodicalId\":352072,\"journal\":{\"name\":\"Int. J. Intell. Comput. Cybern.\",\"volume\":\"75 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-08-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Int. J. Intell. Comput. Cybern.\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1108/ijicc-04-2021-0067\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Int. J. Intell. Comput. Cybern.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1108/ijicc-04-2021-0067","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 2

摘要

目的多域卷积神经网络(MDCNN)模型在计算机视觉领域的目标识别和跟踪中得到了广泛的应用。但是，如果被跟踪对象移动迅速或运动对象的外观变化很大，传统的MDCNN模型就会出现模型漂移问题。为了解决MDCNN模型在受限环境下快速目标跟踪的问题，本文提出了一种基于自注意机制的MDCNN (AA-MDCNN)模型，用于受限环境下快速运动和变化目标跟踪。设计/方法/方法首先，为了区分前景目标与背景和其他相似目标，利用自注意机制对所有特征映射的加权求和进行选择性聚合，使相似特征相互关联;然后，采用双向门控循环单元(Bi-GRU)架构对所有特征映射进行整合，选择性地强调相关特征映射的重要性;最后，将上述两个特征图融合得到最终的特征图，用于目标跟踪。此外，构造了一个复合损失函数来解决传统MDCNN模型中相似但不同属性序列的跟踪问题。为了验证本文提出的AA-MDCNN模型的有效性和可行性，本文使用ImageNet-Vid数据集训练目标跟踪模型，并使用OTB-50数据集对AA-MDCNN跟踪模型进行验证。实验结果表明，增强自注意机制后，识别正确率和成功率分别提高了2.75%和2.41%。此外，作者还在OTB-50数据集中选择了6个复杂的跟踪场景;超过11个属性的验证表明，所提出的AA-MDCNN模型优于超过9个属性的比较模型。此外，除了多目标相互运动的场景外，本文提出的AA-MDCNN模型解决了大多数快速运动目标跟踪场景，在此类复杂场景上优于对比模型。本文将自注意机制引入MDCNN模型，采用Bi-GRU架构提取关键特征。采用本文提出的AA-MDCNN模型，在复杂背景、运动模糊和遮挡目标下的快速目标跟踪效果较好，该模型有望进一步应用于现实世界中的快速目标跟踪。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Auto-attentional mechanism in multi-domain convolutional neural networks for improving object tracking

PurposeMulti-domain convolutional neural network (MDCNN) model has been widely used in object recognition and tracking in the field of computer vision. However, if the objects to be tracked move rapid or the appearances of moving objects vary dramatically, the conventional MDCNN model will suffer from the model drift problem. To solve such problem in tracking rapid objects under limiting environment for MDCNN model, this paper proposed an auto-attentional mechanism-based MDCNN (AA-MDCNN) model for the rapid moving and changing objects tracking under limiting environment.Design/methodology/approachFirst, to distinguish the foreground object between background and other similar objects, the auto-attentional mechanism is used to selectively aggregate the weighted summation of all feature maps to make the similar features related to each other. Then, the bidirectional gated recurrent unit (Bi-GRU) architecture is used to integrate all the feature maps to selectively emphasize the importance of the correlated feature maps. Finally, the final feature map is obtained by fusion the above two feature maps for object tracking. In addition, a composite loss function is constructed to solve the similar but different attribute sequences tracking using conventional MDCNN model.FindingsIn order to validate the effectiveness and feasibility of the proposed AA-MDCNN model, this paper used ImageNet-Vid dataset to train the object tracking model, and the OTB-50 dataset is used to validate the AA-MDCNN tracking model. Experimental results have shown that the augmentation of auto-attentional mechanism will improve the accuracy rate 2.75% and success rate 2.41%, respectively. In addition, the authors also selected six complex tracking scenarios in OTB-50 dataset; over eleven attributes have been validated that the proposed AA-MDCNN model outperformed than the comparative models over nine attributes. In addition, except for the scenario of multi-objects moving with each other, the proposed AA-MDCNN model solved the majority rapid moving objects tracking scenarios and outperformed than the comparative models on such complex scenarios.Originality/valueThis paper introduced the auto-attentional mechanism into MDCNN model and adopted Bi-GRU architecture to extract key features. By using the proposed AA-MDCNN model, rapid object tracking under complex background, motion blur and occlusion objects has better effect, and such model is expected to be further applied to the rapid object tracking in the real world.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Int. J. Intell. Comput. Cybern.

自引率

0.00%

发文量