Haifeng Sima, Manyang Wang, Lanlan Liu, Yu-dong Zhang, Junding Sun
MSO‐DETR: Metric space optimization for few‐shot object detection
DOI: 10.1049/cit2.12342 (https://doi.org/10.1049/cit2.12342)
Journal: CAAI Transactions on Intelligence Technology (Impact Factor 8.4; JCR Q1, Computer Science, Artificial Intelligence)
Published: 2024-05-02
Citations: 0
Abstract
In metric‐based meta‐learning detection models, the distribution of training samples in the metric space strongly influences detection performance, yet traditional meta‐detectors usually ignore this influence. In addition, background noise in the training samples can interfere with the design of the metric space. To tackle these issues, we propose a metric space optimisation method based on hyperbolic geometry attention and class‐agnostic activation maps. First, the geometric properties of hyperbolic space are used to establish a structured metric space. Feature samples of different classes are embedded into the hyperbolic space with extremely low distortion. This metric space is better suited to representing tree‐like structures between categories for image scene analysis. Meanwhile, a novel similarity measure based on the Poincaré distance is proposed to evaluate the distance between objects of different types in the feature space. In addition, class‐agnostic activation maps (CCAMs) are employed to re‐calibrate the weight of foreground feature information and suppress background information. Finally, the decoder processes the high‐level feature information as the decoding of the query object and detects objects by predicting their locations and corresponding task encodings. Experiments are conducted on the Pascal VOC and MS COCO datasets. The results show that the proposed method outperforms strong baseline few‐shot detection models.
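As an illustration of the Poincaré distance mentioned in the abstract, the following minimal sketch computes the standard distance on the Poincaré ball, d(u, v) = arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))). The abstract does not give the exact form of the paper's similarity measure, so the function names `poincare_distance` and `poincare_similarity`, the `eps` guard, and the choice of negative distance as a similarity score are assumptions made for illustration only.

```python
import math


def poincare_distance(u, v, eps=1e-9):
    """Standard geodesic distance between two points inside the unit Poincare ball.

    This is the textbook formula, not necessarily the measure used in the paper.
    """
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # eps guards against division by a vanishing denominator near the boundary.
    denom = max((1.0 - sq_norm_u) * (1.0 - sq_norm_v), eps)
    return math.acosh(1.0 + 2.0 * sq_diff / denom)


def poincare_similarity(u, v):
    """Hypothetical similarity score: larger means closer (assumed, not from the paper)."""
    return -poincare_distance(u, v)


if __name__ == "__main__":
    origin = [0.0, 0.0]
    near_centre = ([0.1, 0.0], [-0.1, 0.0])
    near_boundary = ([0.9, 0.0], [-0.9, 0.0])
    print(poincare_distance(origin, [0.5, 0.0]))
    print(poincare_distance(*near_centre))
    print(poincare_distance(*near_boundary))
```

Distances grow rapidly as points approach the boundary of the ball, which is the property that lets tree‐like hierarchies be embedded with low distortion: general concepts can sit near the centre and fine‐grained ones near the edge.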
About the journal:
CAAI Transactions on Intelligence Technology is a leading venue for original research on the theoretical and experimental aspects of artificial intelligence technology. We are a fully open access journal co-published by the Institution of Engineering and Technology (IET) and the Chinese Association for Artificial Intelligence (CAAI) providing research which is openly accessible to read and share worldwide.