From Global to Hybrid: A Review of Supervised Deep Learning for 2-D Image Feature Representation
Authors: Xinyu Dong; Qi Wang; Hongyu Deng; Zhenguo Yang; Weijian Ruan; Wu Liu; Liang Lei; Xue Wu; Youliang Tian
Journal: IEEE Transactions on Artificial Intelligence, vol. 6, no. 6, pp. 1540-1560
Published: 2025-01-07
DOI: 10.1109/TAI.2025.3526138
URL: https://ieeexplore.ieee.org/document/10830500/
Citations: 0
Abstract
Computer vision aims to enable computers to emulate human visual perception, and it encompasses various techniques for extracting and interpreting information from two-dimensional images. Supervised deep 2-D image feature representation is a fundamental problem in computer vision: it applies deep learning techniques to extract and process information from a given 2-D image under supervised settings. The goal is to obtain a feature vector that can be used for various downstream computer vision applications, and the quality of the representation algorithm directly affects the performance of those applications. However, most existing vision research explores supervised deep 2-D image feature representation only for specific subtasks, so a comprehensive discussion of the topic is needed. In this article, we propose a taxonomy of supervised deep 2-D image feature representation methods with four categories: global representation, region representation, hash representation, and hybrid representation, and we introduce typical approaches for each. Furthermore, we perform a comparative analysis of representative methods on three fundamental tasks (image classification, object detection, and semantic segmentation) as well as other common tasks. We also discuss the limitations of supervised deep 2-D image feature representation and investigate future directions in image representation that could advance computer vision.