{"title":"From images to sentences via spatial relations","authors":"A. Abella, J. Kender","doi":"10.1109/ISIU.1999.824875","DOIUrl":null,"url":null,"abstract":"This work presents a conceptual framework for representing, manipulating, measuring, and communicating in natural language several ideas about topological (non-metric) spatial locations, object spatial contexts, and user expectations of spatial relationships. It articulates a theory of spatial relations, how they can be represented as fuzzy predicates internally, and how they can be appropriately derived from, imagery; then, how they can be augmented or filtered using prior knowledge, and lastly, how they can produce natural language statements about location and space. This framework quantifies the notions of context and vagueness, so that all spatial relations are measurably accurate, provably efficient, and matched to users' expectations. The work makes explicit two critical heuristics for reducing the complexity of the relationships implicit in imagery, one a general rule for single object descriptions, and the other a general rule for rank ordering object relationships. A derived working system combines variable aspects of computer science and linguistics in such a way so as to be extensible to many environments. The system has been demonstrated both in, a landmark navigation task and in a medical task, two very separate domains, and has been evaluated in both.","PeriodicalId":227256,"journal":{"name":"Proceedings Integration of Speech and Image Understanding","volume":"21 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1999-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"22","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings Integration of Speech and Image Understanding","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISIU.1999.824875","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 22
Abstract
This work presents a conceptual framework for representing, manipulating, measuring, and communicating in natural language several ideas about topological (non-metric) spatial locations, object spatial contexts, and user expectations of spatial relationships. It articulates a theory of spatial relations, how they can be represented as fuzzy predicates internally, and how they can be appropriately derived from, imagery; then, how they can be augmented or filtered using prior knowledge, and lastly, how they can produce natural language statements about location and space. This framework quantifies the notions of context and vagueness, so that all spatial relations are measurably accurate, provably efficient, and matched to users' expectations. The work makes explicit two critical heuristics for reducing the complexity of the relationships implicit in imagery, one a general rule for single object descriptions, and the other a general rule for rank ordering object relationships. A derived working system combines variable aspects of computer science and linguistics in such a way so as to be extensible to many environments. The system has been demonstrated both in, a landmark navigation task and in a medical task, two very separate domains, and has been evaluated in both.