Liyana Sahir Kallooriyakath, J. V., B. V, Adith P P
Visual Question Answering: Methodologies and Challenges
2020 International Conference on Smart Technologies in Computing, Electrical and Electronics (ICSTCEE), published 2020-10-09
DOI: 10.1109/ICSTCEE49637.2020.9277374
Citations: 1
Abstract
Given an image and a natural-language question about its contents, a Visual Question Answering (VQA) system should produce a natural-language answer inferred from the information in the image. VQA is a problem of increasing significance in Artificial Intelligence because it lies at the intersection of computer vision and natural language processing. Various methodologies have been proposed for answering a user-supplied question about a given image. This paper reviews contemporary techniques for visual question answering, compares their advantages and limitations, and discusses areas for improvement within these approaches.