A Survey of Recent Advances and Challenges in Deep Audio-Visual Correlation Learning

IF 23.8, CAS Zone 1 (Computer Science), Q1 COMPUTER SCIENCE, THEORY & METHODS

ACM Computing Surveys, Pub Date: 2025-05-14, DOI: 10.1145/3696445

Luís Vilaça, Yi Yu, Paula Viana

Citations: 0

Abstract

Audio-visual correlation learning aims to capture and understand natural phenomena between audio and visual data. The rapid growth of deep learning has propelled the development of methods that process audio-visual data, as reflected in the number of proposals published in recent years, which motivates a comprehensive survey. Besides analyzing the models used in this context, we also discuss task definitions and paradigms applied in AI multimedia. In addition, we investigate frequently used objective functions and discuss how audio-visual data is exploited in the optimization process, i.e., the different methodologies for representing knowledge in the audio-visual domain. In particular, we focus on how human-understandable mechanisms, i.e., structured knowledge that reflects comprehensible knowledge, can guide the learning process. Most importantly, we summarize the recent progress of Audio-Visual Correlation Learning (AVCL) and discuss future research directions.

Source journal

ACM Computing Surveys (Engineering & Technology, Computer Science: Theory & Methods)

CiteScore

33.20

Self-citation rate

0.60%

Articles published

372

Review time

12 months

About the journal: ACM Computing Surveys (CSUR) is an academic journal that publishes surveys and tutorials on various areas of computing research and practice. The journal aims to provide comprehensive and easily understandable articles that guide readers through the literature and help them understand topics outside their specialties. In terms of impact, CSUR has a high reputation, with a 2022 Impact Factor of 16.6. It is ranked 3rd out of 111 journals in the field of Computer Science, Theory & Methods. ACM Computing Surveys is indexed and abstracted in various services, including AI2 Semantic Scholar, Baidu, Clarivate/ISI: JCR, CNKI, DeepDyve, DTU, EBSCO: EDS/HOST, and IET Inspec, among others.