A review of few-shot image recognition using semantic information

Review of computer engineering research · Pub Date: 2023-09-15 · DOI: 10.18488/76.v10i2.3472

Liyong Guo, Erzam Marlisah, Hamidah Ibrahim, Noridayu Manshor


Citations: 0

Abstract


A review of few-shot image recognition using semantic information

In recent years, deep learning techniques have been applied to image recognition to improve performance. However, deep learning requires a large amount of labeled data for model training, and collecting that data is both expensive and time-consuming. Few-shot learning (FSL) has emerged as a viable alternative that addresses this difficulty. FSL is a computational approach that aims to replicate human cognitive processes: it enables new concepts to be acquired from a small set of examples and prior experience. Research on FSL has investigated many ways to extract as much information as possible from limited data, or to make use of affordable and easily accessible sources of information, and researchers increasingly incorporate external data into FSL techniques. This paper explores in depth how semantic information can be leveraged to enhance few-shot learning. By reviewing papers from the last five years indexed in Web of Science (WOS), IEEE, and ScienceDirect, along with some papers from arXiv, the study examines the strategies used to bridge the gap between visual and semantic information. The review also covers zero-shot learning, which is considered a subcategory of FSL, to enrich the analysis. Moreover, the paper identifies the potential of semantic information for enhancing fine-grained few-shot (FGFS) learning, where techniques such as direct projection and generative adversarial networks (GANs) emerge as promising avenues.


Source journal

Review of computer engineering research

CiteScore

1.90

Self-citation rate

0.00%

Number of publications