Coherence-aware and snap-triggered: A novel mechanism for audio-visual cooperative tasks

IF 7.5 | Zone 1, Computer Science | JCR Q1, Computer Science, Artificial Intelligence
Expert Systems with Applications Pub Date : 2026-06-01 Epub Date: 2026-02-07 DOI:10.1016/j.eswa.2026.131559
Cunhan Guo, Heyan Huang, Ruiqi Hu, Danjie Han
Volume 313, Article 131559. Full text: https://www.sciencedirect.com/science/article/pii/S0957417426004720
Citations: 0

Abstract

Audio-Visual Cooperative (AVC) tasks underpin multimodal scene understanding and compel models to reconcile continuous temporal evolution with abrupt sensory transitions. We propose the Coherence-Aware and Snap-Triggered (CAST) mechanism, a plug-in temporal refinement layer that neither perturbs backbone parameters nor demands additional modalities. The Exponential-Memory-based Coherence-Aware module attenuates distant frame contributions through an exponentially decaying weight envelope, thereby preventing the persistent influence of obsolete disruptions. Complementarily, the Optical-Flow-based Snap-Triggered module registers instantaneous motion discontinuities and reallocates attention toward nascent events. Operating in concert, these modules yield a representation that remains coherent across smooth transitions yet responsive to sudden perturbations. Empirical evaluation across multiple AVC benchmarks demonstrates consistent superiority over established baselines, corroborating that CAST enhances temporal fidelity and, by extension, the reliability of downstream multimodal decisions.
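The abstract gives no equations, so the following is only an illustrative sketch of the two ideas it names, not the authors' implementation. It assumes per-frame feature vectors and a per-frame motion magnitude (for example, a mean optical-flow norm computed elsewhere). The function name refine_sequence, the parameters decay and threshold, and the hard memory reset on a detected jump are all hypothetical choices made for this example.

```python
from typing import List, Sequence


def refine_sequence(
    features: Sequence[Sequence[float]],
    motion: Sequence[float],
    decay: float = 0.5,       # hypothetical: weight multiplier per frame of age
    threshold: float = 1.0,   # hypothetical: motion jump that counts as a "snap"
) -> List[List[float]]:
    """Causal exponentially weighted average with a motion-triggered reset.

    Frame s contributes to the output at time t with weight decay**(t - s),
    so distant frames fade out. When the motion magnitude jumps by more than
    `threshold` between consecutive frames, the running memory is cleared so
    the output follows the new event immediately.
    """
    if len(features) != len(motion):
        raise ValueError("features and motion must have the same length")
    if not features:
        return []

    dim = len(features[0])
    num = [0.0] * dim   # running weighted sum of features
    den = 0.0           # running sum of weights
    prev_motion = motion[0]
    refined: List[List[float]] = []

    for frame, m in zip(features, motion):
        # Snap trigger (assumed form): an abrupt change in motion magnitude.
        if abs(m - prev_motion) > threshold:
            num = [0.0] * dim
            den = 0.0
        prev_motion = m

        # Exponential memory: older contributions shrink by `decay` each step.
        num = [decay * n + x for n, x in zip(num, frame)]
        den = decay * den + 1.0
        refined.append([n / den for n in num])

    return refined


if __name__ == "__main__":
    feats = [[0.0], [0.0], [0.0], [10.0], [10.0]]
    flow = [0.0, 0.0, 0.0, 5.0, 5.0]
    print(refine_sequence(feats, flow))                   # with snap trigger
    print(refine_sequence(feats, flow, threshold=100.0))  # trigger disabled
```

With the trigger active, the output switches to the new feature value on the frame where motion jumps. With the trigger disabled, the exponential memory alone lags behind the change. That lag is the trade-off the abstract says the two modules are meant to balance.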
Source journal: Expert Systems with Applications
Category: Engineering & Technology, Engineering: Electrical & Electronic
CiteScore: 13.80
Self-citation rate: 10.60%
Articles published: 2045
Review time: 8.7 months
Journal description: Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.