A Comparative Analysis of Image Captioning Techniques
Diya Theresa Sunil, Seema Safar, Abhijit Das, Amijith M M, Devika M Joshy
2023 4th International Conference for Emerging Technology (INCET), published 2023-05-26
DOI: 10.1109/INCET57972.2023.10170043
Citations: 0
Abstract
Image captioning is the task of generating a textual description that accurately represents the content of an image. This task involves combining computer vision techniques, such as object recognition and scene understanding, with natural language processing to produce a human-like description of an image. Over time, various models have been introduced to perform image captioning, all aiming to accurately describe the content of an image. These models have practical applications such as improving the accessibility of multimedia content, assisting individuals with visual impairments, medical image captioning, and enhancing image search and retrieval. This paper explores some of the models and studies their efficiency using different evaluation metrics.