LCNet: Location Combination for Object Detection

Xin Yi, Bo Ma
Proceedings of the 6th International Conference on Digital Signal Processing. Published 2022-02-25.
DOI: 10.1145/3529570.3529596 (https://doi.org/10.1145/3529570.3529596)
Citations: 0

Abstract

Object detection is a widely studied task in computer vision. In recent years, milestone approaches and solid benchmarks have been proposed, which have significantly advanced related research. Previous object detection methods follow a common paradigm: the classification head and the regression head share the same feature extracted by the backbone network. In this paper, we revisit this paradigm for two-stage detectors and show that the regression head can achieve better results by using local features. In our proposed Location Combination Networks (LCNet), we extract the effective region of the feature in a Laplace way, and we introduce an auxiliary confidence gain loss, an Intersection over Union (IoU) gain loss, and a distribution loss to guide its convergence. In the classification head, we combine these local features into the global feature for better classification. In the regression head, by ranking these effective regions in the spatial dimension, we select the local features closest to each foreground boundary and use the selected features to predict the offset of that boundary. Finally, we combine the locations of the four boundaries to obtain the final bounding box prediction. Extensive experimental results on the MS COCO benchmark validate the effectiveness of our proposed method.
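The regression-head idea described above (rank the effective regions along each spatial axis, keep those nearest each boundary, predict one offset per boundary, then combine the four boundaries) can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no architectural details, so the `Region` layout, the parameter `k`, the use of plain coordinate ranking as the notion of "closest", and the `toy_offset_head` stand-in for a learned regression head are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class Region:
    """One effective region: its centre and a local feature vector (hypothetical layout)."""
    cx: float
    cy: float
    feature: Sequence[float]


def select_nearest(regions: List[Region], key: Callable[[Region], float],
                   k: int, reverse: bool = False) -> List[Region]:
    """Rank regions along one spatial axis and keep the first k in that order."""
    return sorted(regions, key=key, reverse=reverse)[:k]


def toy_offset_head(selected: List[Region]) -> float:
    """Stand-in for a learned regression head (assumption): mean of all feature values."""
    values = [v for r in selected for v in r.feature]
    return sum(values) / len(values) if values else 0.0


def combine_boundaries(proposal: Box, regions: List[Region], k: int = 2,
                       head: Callable[[List[Region]], float] = toy_offset_head) -> Box:
    """Predict one offset per boundary from nearby regions, then assemble the box."""
    x1, y1, x2, y2 = proposal
    left = select_nearest(regions, key=lambda r: r.cx, k=k)                   # smallest x
    right = select_nearest(regions, key=lambda r: r.cx, k=k, reverse=True)    # largest x
    top = select_nearest(regions, key=lambda r: r.cy, k=k)                    # smallest y
    bottom = select_nearest(regions, key=lambda r: r.cy, k=k, reverse=True)   # largest y
    # Each boundary is shifted independently; the four results form the final box.
    return (x1 + head(left), y1 + head(top), x2 + head(right), y2 + head(bottom))


regions = [
    Region(1, 1, [0.5]),
    Region(2, 5, [0.25]),
    Region(8, 2, [-0.25]),
    Region(9, 9, [-0.5]),
]
refined = combine_boundaries((0.0, 0.0, 10.0, 10.0), regions, k=1)
```

The point of the sketch is only that each boundary draws on a different subset of local features, in contrast to a single shared feature predicting all four offsets at once. In a real detector the head would be a trained network and the selection would operate on feature-map locations inside each proposal.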