LCNet: Location Combination for Object Detection

Proceedings of the 6th International Conference on Digital Signal Processing. Pub Date: 2022-02-25. DOI: 10.1145/3529570.3529596

Xin Yi, Bo Ma



Abstract

Object detection is a widely studied task in computer vision. In recent years, milestone approaches and solid benchmarks have been proposed, which have significantly advanced related research. Previous object detection methods follow a common paradigm: the classification head and the regression head share the same feature extracted by the backbone network. In this paper, we revisit this paradigm for two-stage detectors and show that the regression head can achieve better results by using local features. In our proposed Location Combination Networks (LCNet), we extract the effective region of the feature in a Laplace way, and we introduce an auxiliary confidence gain loss, an Intersection over Union (IoU) gain loss, and a distribution loss to guide its convergence. In the classification head, we combine these local features into the global feature for better classification. In the regression head, by ranking these effective regions in the spatial dimension, we select the local features closest to each foreground boundary and use the selected features to predict the offset of each foreground boundary. Finally, we combine the locations of the four boundaries to obtain the final bounding box prediction. Extensive experimental results on the MS COCO benchmark validate the effectiveness of our proposed method.
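
The regression step described in the abstract (pick the local feature nearest each boundary, predict one offset per boundary, then combine the four boundaries into a box) can be illustrated with a small sketch. This is not the authors' implementation: the `LocalRegion` type, the nearest-distance selection rule, the additive offsets, and the function names are all assumptions made for illustration, and the learned feature extraction and losses are omitted entirely. The `iou` helper is the standard Intersection over Union for axis-aligned boxes.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class LocalRegion:
    """Hypothetical 'effective region': a location inside the proposal plus
    one predicted offset per boundary, ordered (left, top, right, bottom)."""
    x: float
    y: float
    offsets: Tuple[float, float, float, float]


def combine_locations(proposal: Box, regions: List[LocalRegion]) -> Box:
    """Refine each boundary with the region nearest to it (assumed rule)."""
    x1, y1, x2, y2 = proposal
    # Rank regions along each spatial axis; keep the closest one per boundary.
    left = min(regions, key=lambda r: abs(r.x - x1))
    top = min(regions, key=lambda r: abs(r.y - y1))
    right = min(regions, key=lambda r: abs(r.x - x2))
    bottom = min(regions, key=lambda r: abs(r.y - y2))
    # Combine the four independently refined boundaries into one box.
    return (
        x1 + left.offsets[0],
        y1 + top.offsets[1],
        x2 + right.offsets[2],
        y2 + bottom.offsets[3],
    )


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    proposal = (10.0, 10.0, 50.0, 50.0)
    regions = [
        LocalRegion(12.0, 30.0, (-2.0, 0.0, 0.0, 0.0)),  # nearest the left edge
        LocalRegion(30.0, 12.0, (0.0, -1.0, 0.0, 0.0)),  # nearest the top edge
        LocalRegion(48.0, 30.0, (0.0, 0.0, 3.0, 0.0)),   # nearest the right edge
        LocalRegion(30.0, 49.0, (0.0, 0.0, 0.0, 2.0)),   # nearest the bottom edge
    ]
    refined = combine_locations(proposal, regions)
    print(refined, iou(proposal, refined))
```

In this toy setup each boundary is moved only by the offset predicted at its nearest region, so an error in one region's prediction affects a single side of the box rather than all four coordinates.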


