Audio-visual saliency map: Overview, basic models and hardware implementation

2013 47th Annual Conference on Information Sciences and Systems (CISS). Pub Date: 2013-03-20. DOI: 10.1109/CISS.2013.6552285

Sudarshan Ramenahalli, Daniel R. Mendat, S. Dura-Bernal, E. Culurciello, E. Niebur, A. Andreou

Citations: 15

Abstract

In this paper we provide an overview of audio-visual saliency map models. In the simplest model, the location of the auditory source is modeled as a Gaussian, and we use different methods of combining the auditory and visual information. We then provide experimental results from applying simple audio-visual integration models to cognitive scene analysis. We validate the simple audio-visual saliency models with a hardware convolutional network architecture and real data recorded from moving audio-visual objects. The latter system was developed in Torch by extending the attention.lua (code) and attention.ui (GUI) files that implement Culurciello's visual attention model.
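
The simplest model described in the abstract can be illustrated with a small sketch. The abstract states only that the auditory source location is modeled as a Gaussian and that several combination methods are used; it does not name them. The two rules below (averaged sum and pointwise product), the function names gaussian_map, combine and most_salient, the map size, and the value of sigma are all assumptions made for illustration, not the authors' implementation.

```python
import math


def gaussian_map(height, width, center, sigma):
    """Auditory saliency: an isotropic 2-D Gaussian peaked at the
    estimated source location `center` = (row, col). Hypothetical helper."""
    cy, cx = center
    return [
        [math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2.0 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]


def combine(visual, auditory, mode="sum"):
    """Fuse two equally sized maps pixel by pixel.
    The available rules are assumed examples, not taken from the paper."""
    rules = {
        "sum": lambda v, a: 0.5 * (v + a),   # averaged linear combination
        "product": lambda v, a: v * a,       # multiplicative combination
    }
    rule = rules[mode]
    return [
        [rule(v, a) for v, a in zip(v_row, a_row)]
        for v_row, a_row in zip(visual, auditory)
    ]


def most_salient(saliency):
    """Return the (row, col) of the largest value in the map."""
    return max(
        ((y, x) for y, row in enumerate(saliency) for x in range(len(row))),
        key=lambda p: saliency[p[0]][p[1]],
    )


# Toy visual saliency map: a strong silent object and a weaker sounding one.
H, W = 5, 6
visual = [[0.0] * W for _ in range(H)]
visual[1][1] = 1.0   # visually strongest location
visual[3][4] = 0.6   # weaker location that coincides with the sound source

# Auditory map centered on the (assumed known) source location.
auditory = gaussian_map(H, W, center=(3, 4), sigma=1.0)

for mode in ("sum", "product"):
    print(mode, most_salient(combine(visual, auditory, mode)))
```

In this toy scene the visual map alone peaks at (1, 1), while both combination rules move the most salient location to (3, 4), where the visual and auditory evidence coincide.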

