Authors: Yiting Li; Haiyue Zhu; Sichao Tian; Jun Ma; Cheng Xiang; Prahlad Vadakkepat
Journal: IEEE Transactions on Artificial Intelligence, vol. 5, no. 9, pp. 4376-4390
Publication date: 2024-03-26
Publication type: Journal Article
DOI: 10.1109/TAI.2024.3381919
URL: https://ieeexplore.ieee.org/document/10480289/
Bilateral-Head Region-Based Convolutional Neural Networks: A Unified Approach for Incremental Few-Shot Object Detection
Practical object detection systems should be open-ended, able to keep learning as datasets evolve. Learning with little supervision adds further flexibility for real-world applications such as autonomous driving and robotics, where large-scale datasets can be prohibitively expensive or impractical to obtain. However, continual adaptation with few training examples often results in catastrophic forgetting and severe overfitting. To address these issues, a compositional learning system is proposed to enable effective incremental object detection from nonstationary, few-shot data streams. First, a novel bilateral-head framework decouples the representation learning of base (pretrained) and novel (few-shot) classes into separate embedding spaces, handling novel concept integration and base knowledge retention simultaneously. Second, to enhance learning stability, a robust parameter updating rule, the recall and progress mechanism, constrains the optimization trajectory of sequential model adaptation. Third, to enforce intertask class discrimination with little memory burden, we present a between-class regularization method that expands the decision space of few-shot classes to construct unbiased feature representations. Finally, we investigate the incomplete annotation issue that arises in realistic incremental few-shot object detection (iFSOD) scenarios and propose a semisupervised object labeling mechanism to accurately recover the missing annotations for previously encountered classes, which further strengthens the detector against catastrophic forgetting. Extensive experiments on the Pascal Visual Object Classes (VOC) and Microsoft Common Objects in Context (MS-COCO) datasets demonstrate the effectiveness of our method.
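The abstract gives no architectural details, so the following is only a toy sketch of the general idea of decoupling base and novel classes into separate heads, not the authors' implementation. It assumes a frozen linear base head and a trainable linear novel head that share one feature vector, with their logits concatenated for a joint softmax. The class name `BilateralHead`, the method `novel_step`, the learning rate, and the two-dimensional features are all hypothetical. The recall and progress mechanism, the between-class regularization, and the semisupervised labeling step are not modeled here.

```python
import math
import random


def linear(weights, bias, features):
    """Return one logit per class: dot(w, x) + b."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class BilateralHead:
    """Hypothetical two-branch classifier (an assumption, not the paper's design)."""

    def __init__(self, base_weights, base_bias, num_novel, seed=0):
        rng = random.Random(seed)
        dim = len(base_weights[0])
        # Base branch: copied once and never updated afterwards.
        self.base_weights = [list(row) for row in base_weights]
        self.base_bias = list(base_bias)
        # Novel branch: small random initialization, trained on few-shot data.
        self.novel_weights = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                              for _ in range(num_novel)]
        self.novel_bias = [0.0] * num_novel

    def logits(self, features):
        """Concatenate base logits and novel logits into one joint vector."""
        return (linear(self.base_weights, self.base_bias, features)
                + linear(self.novel_weights, self.novel_bias, features))

    def novel_step(self, features, target, lr=0.5):
        """One cross-entropy SGD step that changes only the novel branch."""
        probs = softmax(self.logits(features))
        loss = -math.log(probs[target])
        offset = len(self.base_bias)  # joint index of the first novel class
        for k in range(len(self.novel_bias)):
            # Gradient of cross-entropy w.r.t. a logit is (probability - one_hot).
            grad = probs[offset + k] - (1.0 if offset + k == target else 0.0)
            for j, x in enumerate(features):
                self.novel_weights[k][j] -= lr * grad * x
            self.novel_bias[k] -= lr * grad
        return loss


# Two base classes, one novel class, and a single few-shot sample.
base_w = [[1.0, 0.0], [0.0, 1.0]]
head = BilateralHead(base_w, [0.0, 0.0], num_novel=1)
sample, label = [0.5, 0.5], 2  # joint index 2 is the novel class
losses = [head.novel_step(sample, label) for _ in range(20)]
```

In this sketch the base parameters are untouched by training on the novel class, which is the simplest possible way to retain base knowledge; the paper's actual update rule may differ.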