Exploring Coherence in Visual Explanations

Malihe Alikhani, Matthew Stone
Published in: 2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR)
Publication date: 2018-04-10
DOI: 10.1109/MIPR.2018.00063 (https://doi.org/10.1109/MIPR.2018.00063)
Citations: 10

Abstract

A wide range of communicative artifacts—perhaps the majority—involve the coordinated presentation of visual and linguistic information. We envisage computer systems that support access to information by using rich representations of the interpretation of such multimodal presentations. This paper advocates organizing such representations in terms of coherence relations [2, 19], a fundamental construct from the theory of natural language discourse that is often invoked to explain the integrated interpretation of the diverse communicative actions in face-to-face conversation [9, 25, 35]. Coherence relations come in constrained classes, such as the Explanation, Narration and Parallel relations, each of which establishes specific kinds of structural, logical, and intentional relationships among communicative actions. Representing these relationships can therefore provide a scaffold for organizing, disambiguating and integrating the interpretation of communication across modalities. This paper uses a case study of instructions presented using text and pictures to motivate and describe an analysis of multimodal discourse interpretation in terms of coherence relations and to sketch a roadmap for operationalizing the approach in computer systems.