Person Attribute Recognition using Hybrid Transformers for Surveillance Scenarios

S. Abhilash, Venu Madhav Nookala
Published in: 2022 International Conference on Distributed Computing, VLSI, Electrical Circuits and Robotics (DISCOVER)
Publication date: 2022-10-14
DOI: https://doi.org/10.1109/DISCOVER55800.2022.9974664
Citations: 1

Abstract

Recognition of person attributes is an emerging research topic that has drawn extensive attention in video surveillance. Localizing the regions associated with a person's attributes is an important and challenging task. Existing methods primarily apply convolutional neural networks to localize the region related to each attribute. In this paper we adopt a co-scale conv-attentional image transformer to identify the most discriminative attributes and regions at multiple levels. Serial and parallel building blocks are introduced: serial blocks consist of conv-attention and a feed-forward network, and parallel blocks follow one of two strategies, attention with feature interpolation or direct cross-layer attention. Our results indicate that hybrid transformers perform better than pure transformers. Extensive experiments show that the proposed hybrid method outperforms existing methods on four person attribute datasets: RapV2, RapV1, PETA, and PA100K.
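The abstract names the two components of a serial block (conv-attention followed by a feed-forward network) but gives no further detail. The following is a minimal sketch of how such a block could be wired, not the authors' implementation. It assumes a factorized attention with a convolutional position term, in the spirit of the co-scale conv-attentional transformer the abstract refers to. Every function name, the 1-D token layout, the ReLU activation, and the omission of normalization and multiple heads are illustrative assumptions.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def add(a, b):
    """Element-wise sum, used for the residual connections."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def softmax_over_tokens(k):
    """Softmax each channel (column) of k across the token axis."""
    cols = []
    for col in zip(*k):
        peak = max(col)
        exps = [math.exp(v - peak) for v in col]
        total = sum(exps)
        cols.append([e / total for e in exps])
    return transpose(cols)


def depthwise_conv1d(v, kernel):
    """Convolve every channel along the token axis with zero padding."""
    n, half = len(v), len(kernel) // 2
    out = []
    for i in range(n):
        row = []
        for c in range(len(v[0])):
            acc = 0.0
            for j, w in enumerate(kernel):
                t = i + j - half
                if 0 <= t < n:
                    acc += w * v[t][c]
            row.append(acc)
        out.append(row)
    return out


def conv_attention(x, wq, wk, wv, kernel):
    """Factorized attention plus a convolutional position term (assumed form)."""
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    scale = 1.0 / math.sqrt(len(q[0]))
    context = matmul(transpose(softmax_over_tokens(k)), v)  # channels x channels
    attended = [[scale * e for e in row] for row in matmul(q, context)]
    position = depthwise_conv1d(v, kernel)
    gated = [[a * b for a, b in zip(rq, rp)] for rq, rp in zip(q, position)]
    return add(attended, gated)


def feed_forward(x, w1, w2):
    """Two linear maps with a ReLU in between (activation is an assumption)."""
    hidden = [[max(0.0, h) for h in row] for row in matmul(x, w1)]
    return matmul(hidden, w2)


def serial_block(x, wq, wk, wv, kernel, w1, w2):
    """Conv-attention then feed-forward, each wrapped in a residual connection."""
    x = add(x, conv_attention(x, wq, wk, wv, kernel))
    return add(x, feed_forward(x, w1, w2))


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
tokens = random_matrix(6, 4, rng)  # 6 tokens, 4 channels
weights = [random_matrix(4, 4, rng) for _ in range(3)]
out = serial_block(tokens, *weights, [0.25, 0.5, 0.25],
                   random_matrix(4, 8, rng), random_matrix(8, 4, rng))
print(len(out), len(out[0]))  # the block preserves the token and channel counts
```

Factorizing the attention this way builds a channels-by-channels context matrix instead of a tokens-by-tokens one, so the cost grows linearly with the number of tokens. The parallel blocks mentioned in the abstract are not sketched here because the abstract does not describe how scales are combined.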