Fugui Fan, Yuting Su, Yun Liu, Peiguang Jing, Kaihua Qu
Multimedia Systems, published 2024-08-05. DOI: https://doi.org/10.1007/s00530-024-01428-3
A deep low-rank semantic factorization method for micro-video multi-label classification
As a prominent manifestation of user-generated content (UGC), micro-video has emerged as a pivotal medium for individuals to document and disseminate their daily experiences. In particular, micro-videos generally encompass abundant content elements that are abstractly described by a group of annotated labels. However, previous methods primarily focus on the discriminability of explicit labels while neglecting corresponding implicit semantics, which are particularly relevant for diverse micro-video characteristics. To address this problem, we develop a deep low-rank semantic factorization (DLRSF) method to perform multi-label classification of micro-videos. Specifically, we introduce a semantic embedding matrix to bridge explicit labels and implicit semantics, and further present a low-rank-regularized semantic learning module to explore the intrinsic lowest-rank semantic attributes. A correlation-driven deep semantic interaction module is designed within a deep factorization framework to enhance interactions among instance features, explicit labels and semantic embeddings. Additionally, inverse covariance analysis is employed to unveil underlying correlation structures between labels and features, thereby making the semantic embeddings more discriminative and improving model generalization ability simultaneously. Extensive experimental results on three available datasets have showcased the superiority of our DLRSF compared with the state-of-the-art methods.
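The abstract mentions inverse covariance analysis as a way to expose correlation structure. As a rough illustration of that general idea only (not the DLRSF model, whose formulation is not given here), the sketch below contrasts ordinary correlation with the partial correlations read off an inverse covariance (precision) matrix. The toy data, the function names `covariance`, `invert`, and `partial_correlations`, and the small `ridge` term are all assumptions made for this example.

```python
from math import sqrt
from typing import List

Matrix = List[List[float]]


def covariance(rows: Matrix) -> Matrix:
    """Sample covariance of the columns of `rows` (n samples x d variables)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
         for j in range(d)]
        for i in range(d)
    ]


def invert(a: Matrix, ridge: float = 1e-8) -> Matrix:
    """Gauss-Jordan inverse of (a + ridge * I) with partial pivoting.

    The ridge term is an assumption of this sketch; it only guards
    against a singular covariance matrix.
    """
    d = len(a)
    aug = [
        [a[i][j] + (ridge if i == j else 0.0) for j in range(d)]
        + [1.0 if i == j else 0.0 for j in range(d)]
        for i in range(d)
    ]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(d):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[d:] for row in aug]


def partial_correlations(rows: Matrix) -> Matrix:
    """Partial correlation of each pair given all other variables."""
    theta = invert(covariance(rows))  # precision matrix
    d = len(theta)
    return [
        [1.0 if i == j else -theta[i][j] / sqrt(theta[i][i] * theta[j][j])
         for j in range(d)]
        for i in range(d)
    ]


# Toy chain a -> b -> c built from mutually orthogonal, zero-mean vectors.
e1 = [1, -1, 1, -1, 1, -1, 1, -1]
e2 = [1, 1, -1, -1, 1, 1, -1, -1]
e3 = [1, 1, 1, 1, -1, -1, -1, -1]
rows = [[x, x + y, x + y + z] for x, y, z in zip(e1, e2, e3)]

cov = covariance(rows)
marginal_ac = cov[0][2] / sqrt(cov[0][0] * cov[2][2])
pcorr = partial_correlations(rows)
print("marginal corr(a, c):", round(marginal_ac, 3))
print("partial corr(a, c | b):", round(abs(pcorr[0][2]), 3))
```

In this toy chain, a and c are clearly correlated (about 0.58), yet their partial correlation given b is numerically close to zero. The precision matrix therefore separates direct dependencies from ones that are only mediated by another variable. The same reading applies when the columns are labels or features.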
About the journal:
This journal details innovative research ideas, emerging technologies, state-of-the-art methods and tools in all aspects of multimedia computing, communication, storage, and applications. It features theoretical, experimental, and survey articles.