Diya Theresa Sunil, Seema Safar, Abhijit Das, Amijith M M, Devika M Joshy
2023 4th International Conference for Emerging Technology (INCET), published 2023-05-26. DOI: 10.1109/INCET57972.2023.10170043 (https://doi.org/10.1109/INCET57972.2023.10170043)
A Comparative Analysis of Image Captioning Techniques
Image captioning is the task of generating a textual description that accurately represents the content of an image. It combines computer vision techniques, such as object recognition and scene understanding, with natural language processing to produce a human-like description. Over time, various models have been introduced for this task. These models have practical applications such as improving the accessibility of multimedia content, assisting individuals with visual impairments, captioning medical images, and enhancing image search and retrieval. This paper explores some of these models and studies their efficiency using different evaluation metrics.
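The abstract does not name the evaluation metrics it uses. As a rough illustration of how a generated caption can be scored against human-written references, the sketch below computes clipped n-gram precision, the building block of BLEU-style metrics. The choice of metric, the function names `ngrams` and `clipped_precision`, the whitespace tokenization, and the example captions are all assumptions made for illustration, not details taken from the paper.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, references, n=1):
    """Fraction of candidate n-grams that also occur in a reference.

    Each candidate n-gram is credited at most as many times as it
    appears in any single reference (the "clipping" step), so a caption
    cannot score well by repeating one matching word.
    Tokenization is a plain lowercase whitespace split (an assumption).
    """
    cand_counts = Counter(ngrams(candidate.lower().split(), n))
    if not cand_counts:
        return 0.0

    # For every n-gram, record its highest count across the references.
    max_ref = Counter()
    for ref in references:
        ref_counts = Counter(ngrams(ref.lower().split(), n))
        for gram, count in ref_counts.items():
            max_ref[gram] = max(max_ref[gram], count)

    matched = sum(min(count, max_ref[gram])
                  for gram, count in cand_counts.items())
    return matched / sum(cand_counts.values())


# Hypothetical captions, for illustration only.
references = [
    "a dog is running on the grass",
    "the dog runs across a field",
]
candidate = "a dog runs on the grass"

unigram_score = clipped_precision(candidate, references, n=1)
bigram_score = clipped_precision(candidate, references, n=2)
print(unigram_score, bigram_score)
```

A full BLEU score would additionally combine several n-gram orders and apply a brevity penalty; this sketch covers only the precision component.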