Generalization or Instantiation?: Estimating the Relative Abstractness between Images and Text

Qibin Zheng, Xiaoguang Ren, Yi Liu, Wei Qin
{"title":"Generalization or Instantiation?: Estimating the Relative Abstractness between Images and Text","authors":"Qibin Zheng, Xiaoguang Ren, Yi Liu, Wei Qin","doi":"10.1145/3404555.3404610","DOIUrl":null,"url":null,"abstract":"Learning from multi-modal data is very often in current data mining and knowledge management applications. However, the information imbalance between modalities brings challenges for many multi-modal learning tasks, such as cross-modal retrieval, image captioning, and image synthesis. Understanding the cross-modal information gap is an important foundation for designing models and choosing the evaluating criteria of those applications. Especially for text and image data, existing researches have proposed the abstractness to measure the information imbalance. They evaluate the abstractness disparity by training a classifier using the manually annotated multi-modal sample pairs. However, these methods ignore the impact of the intra-modal relationship on the inter-modal abstractness; besides, the annotating process is very labor-intensive, and the quality cannot be guaranteed. In order to evaluate the text-image relationship more comprehensively and reduce the cost of evaluating, we propose the relative abstractness index (RAI) to measure the abstractness between multi-modal items, which measures the abstractness of a sample according to its certainty of differentiating the items of another modality. Besides, we proposed a cycled generating model to compute the RAI values between images and text. In contrast to existing works, the proposed index can better describe the image-text information disparity, and its computing process needs no annotated training samples.","PeriodicalId":220526,"journal":{"name":"Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence","volume":"6 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3404555.3404610","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Learning from multi-modal data is very common in current data mining and knowledge management applications. However, the information imbalance between modalities poses challenges for many multi-modal learning tasks, such as cross-modal retrieval, image captioning, and image synthesis. Understanding the cross-modal information gap is therefore an important foundation for designing models and choosing evaluation criteria for these applications. For text and image data in particular, existing research has proposed abstractness as a measure of this information imbalance, evaluating the abstractness disparity by training a classifier on manually annotated multi-modal sample pairs. However, these methods ignore the impact of intra-modal relationships on inter-modal abstractness; moreover, the annotation process is labor-intensive and its quality cannot be guaranteed. To evaluate the text-image relationship more comprehensively and reduce the cost of evaluation, we propose the relative abstractness index (RAI), which measures the abstractness of a sample according to how certainly it differentiates the items of the other modality. We also propose a cycled generating model to compute RAI values between images and text. In contrast to existing work, the proposed index better describes the image-text information disparity, and its computation requires no annotated training samples.
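The abstract leaves the actual RAI computation to the body of the paper. The snippet below is only a minimal sketch of one way the "certainty of differentiating the items of another modality" could be scored, assuming a softmax-entropy proxy over cross-modal similarity scores; the function name `relative_abstractness`, the entropy formula, and the toy similarity values are illustrative assumptions, not the paper's definition or its cycled generating model.

```python
import numpy as np

def relative_abstractness(similarities):
    """Illustrative relative-abstractness score for one sample.

    `similarities` holds cross-modal matching scores between one item
    (e.g., a caption) and a set of candidate items from the other
    modality (e.g., images). The entropy of the softmax over these
    scores is used here as an assumed proxy for how uncertainly the
    sample differentiates the candidates: higher entropy means less
    certainty, i.e., a more abstract sample.
    """
    s = np.asarray(similarities, dtype=float)
    p = np.exp(s - s.max())          # numerically stable softmax
    p /= p.sum()
    entropy = -np.sum(p * np.log(p + 1e-12))
    return entropy / np.log(len(p))  # normalize to [0, 1]

# A vague caption matches many images about equally (abstract),
# while a specific caption matches one image strongly (concrete).
print(relative_abstractness([0.5, 0.5, 0.5, 0.5]))  # ~1.0, highly abstract
print(relative_abstractness([9.0, 0.1, 0.2, 0.1]))  # ~0.0, concrete
```

Under this reading, an image or sentence is relatively abstract when it fails to single out a specific counterpart in the other modality, which matches the abstract's description of certainty-based differentiation.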