A deep low-rank semantic factorization method for micro-video multi-label classification

Impact factor: 3.5 · Zone 3 (Computer Science) · JCR Q2: Computer Science, Information Systems
Fugui Fan, Yuting Su, Yun Liu, Peiguang Jing, Kaihua Qu
DOI: https://doi.org/10.1007/s00530-024-01428-3
Journal: Multimedia Systems
Publication date: 2024-08-05
Publication type: Journal Article
Citations: 0

Abstract

As a prominent form of user-generated content (UGC), micro-video has become a pivotal medium for individuals to document and share their daily experiences. Micro-videos generally contain abundant content elements that are abstractly described by a group of annotated labels. However, previous methods focus primarily on the discriminability of explicit labels while neglecting the corresponding implicit semantics, which are particularly relevant to the diverse characteristics of micro-videos. To address this problem, we develop a deep low-rank semantic factorization (DLRSF) method for multi-label classification of micro-videos. Specifically, we introduce a semantic embedding matrix to bridge explicit labels and implicit semantics, and present a low-rank-regularized semantic learning module to explore the intrinsic lowest-rank semantic attributes. A correlation-driven deep semantic interaction module is designed within a deep factorization framework to enhance interactions among instance features, explicit labels, and semantic embeddings. Additionally, inverse covariance analysis is employed to uncover the underlying correlation structures between labels and features, making the semantic embeddings more discriminative while improving the model's generalization ability. Extensive experiments on three available datasets demonstrate the superiority of DLRSF over state-of-the-art methods.
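The abstract does not say how the low-rank regularization is implemented, so the following is only a generic illustration of the idea, not the DLRSF algorithm. It assumes the common choice of a nuclear-norm penalty, whose proximal step is singular value thresholding. The function name, the toy matrix sizes, and the threshold `tau` are all hypothetical.

```python
import numpy as np


def singular_value_threshold(matrix, tau):
    """Proximal operator of tau * nuclear norm: shrink each singular value by tau."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)  # singular values below tau become zero
    return (u * s_shrunk) @ vt


# Hypothetical toy "semantic embedding" matrix: exact rank 2 plus small noise.
rng = np.random.default_rng(0)
left, _ = np.linalg.qr(rng.standard_normal((20, 2)))  # 20x2, orthonormal columns
right, _ = np.linalg.qr(rng.standard_normal((8, 2)))  # 8x2, orthonormal columns
low_rank = left @ np.diag([5.0, 3.0]) @ right.T
noisy = low_rank + 0.01 * rng.standard_normal((20, 8))

# One thresholding step removes the small, noise-driven singular values.
denoised = singular_value_threshold(noisy, tau=0.5)
rank_before = int(np.linalg.matrix_rank(noisy))
rank_after = int(np.linalg.matrix_rank(denoised))
print(rank_before, rank_after)
```

In a full model, a step like this would typically alternate with updates of the other factors. How DLRSF couples it with the deep factorization and the inverse covariance terms is not recoverable from the abstract alone.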


Source journal
Multimedia Systems (Engineering & Technology: Computer Science, Theory & Methods)
CiteScore: 5.40
Self-citation rate: 7.70%
Articles published: 148
Review time: 4.5 months
Journal description: This journal details innovative research ideas, emerging technologies, state-of-the-art methods and tools in all aspects of multimedia computing, communication, storage, and applications. It features theoretical, experimental, and survey articles.