Natural language guided object retrieval in images

Impact factor 0.4; Region 4, Computer Science; JCR Q4, COMPUTER SCIENCE, INFORMATION SYSTEMS
Ahmad Ostovar, Suna Bensch, Thomas Hellström
Acta Informatica. Published 2021-07-19. DOI: 10.1007/s00236-021-00400-2
https://link.springer.com/article/10.1007/s00236-021-00400-2
Citations: 1

Abstract

Understanding the surrounding environment and communicating with interacting humans are important capabilities for many automated systems in which visual input (e.g., images, video) and natural language input (speech or text) have to be related to each other. Possible applications are automatic image caption generation, interactive surveillance systems, and human-robot interaction. In this paper, we propose algorithms for automatically responding to natural language queries about an image. Our approach uses a predefined neural net to detect bounding boxes and objects in images, models spatial relations between bounding boxes with a neural net, analyzes the queries with a syntactic parser, and introduces algorithms that map natural language to properties in the images. The algorithms make use of semantic similarity and antonyms. We evaluate the performance of our approach with test users who assess the quality of the answers our system generates.
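
To make the pipeline in the abstract more concrete, the following is a minimal sketch of the retrieval step only. It is not the authors' implementation: the paper models spatial relations with a neural net, whereas this sketch swaps in a simple geometric rule on box centers, and it assumes the query has already been parsed into a (target, relation, anchor) triple. The names `Box`, `relation`, `retrieve`, and the `ANTONYMS` table are hypothetical and chosen for illustration; the abstract does not specify how antonyms are used, so the table here only shows how a relation can be inverted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """A detected object: class label plus bounding box in pixel coordinates."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def center(self):
        return ((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2)


# Hypothetical antonym table (assumption): maps each relation to its inverse.
ANTONYMS = {
    "left of": "right of",
    "right of": "left of",
    "above": "below",
    "below": "above",
}


def relation(a, b):
    """Return how box a lies relative to box b.

    Geometric stand-in for the learned relation model (assumption).
    Image coordinates are assumed: y grows downward.
    """
    (ax, ay), (bx, by) = a.center, b.center
    dx, dy = ax - bx, ay - by
    if abs(dx) >= abs(dy):
        return "left of" if dx < 0 else "right of"
    return "above" if dy < 0 else "below"


def retrieve(boxes, target, rel, anchor):
    """Return boxes labelled `target` that stand in relation `rel`
    to at least one other box labelled `anchor`."""
    anchors = [b for b in boxes if b.label == anchor]
    return [
        t for t in boxes
        if t.label == target
        and any(a is not t and relation(t, a) == rel for a in anchors)
    ]


# Example detections (made-up coordinates) and the parsed query
# "the cup left of the laptop".
boxes = [
    Box("cup", 10, 50, 30, 70),
    Box("laptop", 40, 40, 100, 90),
    Box("cup", 120, 55, 140, 75),
]
hits = retrieve(boxes, "cup", "left of", "laptop")
print([b.x_min for b in hits])  # prints [10]
```

With an antonym table of this kind, a query phrased from the other object's point of view ("the laptop right of the cup") can be checked against the same relation by swapping the arguments and looking up the inverse relation.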

Source journal
Acta Informatica (Engineering and Technology, Computer Science: Information Systems)
CiteScore: 2.40
Self-citation rate: 16.70%
Articles published: 24
Review time: >12 weeks
About the journal: Acta Informatica provides international dissemination of articles on formal methods for the design and analysis of programs, computing systems and information structures, as well as related fields of Theoretical Computer Science such as Automata Theory, Logic in Computer Science, and Algorithmics. Topics of interest include:
• semantics of programming languages
• models and modeling languages for concurrent, distributed, reactive and mobile systems
• models and modeling languages for timed, hybrid and probabilistic systems
• specification, program analysis and verification
• model checking and theorem proving
• modal, temporal, first- and higher-order logics, and their variants
• constraint logic, SAT/SMT-solving techniques
• theoretical aspects of databases, semi-structured data and finite model theory
• theoretical aspects of artificial intelligence, knowledge representation, description logic
• automata theory, formal languages, term and graph rewriting
• game-based models, synthesis
• type theory, typed calculi
• algebraic, coalgebraic and categorical methods
• formal aspects of performance, dependability and reliability analysis
• foundations of information and network security
• parallel, distributed and randomized algorithms
• design and analysis of algorithms
• foundations of network and communication protocols