IEEE Transactions on Multimedia: Latest Articles

IBFusion: An Infrared and Visible Image Fusion Method Based on Infrared Target Mask and Bimodal Feature Extraction Strategy
IF 8.4, CAS Q1, Computer Science
IEEE Transactions on Multimedia, Pub Date: 2024-06-05, DOI: 10.1109/TMM.2024.3410113
Yang Bai;Meijing Gao;Shiyu Li;Ping Wang;Ning Guan;Haozheng Yin;Yonghao Yan
{"title":"IBFusion: An Infrared and Visible Image Fusion Method Based on Infrared Target Mask and Bimodal Feature Extraction Strategy","authors":"Yang Bai;Meijing Gao;Shiyu Li;Ping Wang;Ning Guan;Haozheng Yin;Yonghao Yan","doi":"10.1109/TMM.2024.3410113","DOIUrl":"10.1109/TMM.2024.3410113","url":null,"abstract":"The fusion of infrared (IR) and visible (VIS) images aims to capture complementary information from diverse sensors, resulting in a fused image that enhances the overall human perception of the scene. However, existing fusion methods face challenges preserving diverse feature information, leading to cross-modal interference, feature degradation, and detail loss in the fused image. To solve the above problems, this paper proposes an image fusion method based on the infrared target mask and bimodal feature extraction strategy, termed IBFusion. Firstly, we define an infrared target mask, employing it to retain crucial information from the source images in the fused result. Additionally, we devise a mixed loss function, encompassing content loss, gradient loss, and structure loss, to ensure the coherence of the fused image with the IR and VIS images. Then, the mask is introduced into the mixed loss function to guide feature extraction and unsupervised network optimization. Secondly, we create a bimodal feature extraction strategy and construct a Dual-channel Multi-scale Feature Extraction Module (DMFEM) to extract thermal target information from the IR image and background texture information from the VIS image. This module retains the complementary information of the two source images. Finally, we use the Feature Fusion Module (FFM) to fuse the features effectively, generating the fusion result. Experiments on three public datasets demonstrate that the fusion results of our method have prominent infrared targets and clear texture details. Both subjective and objective assessments are better than the other twelve advanced algorithms, proving our method's effectiveness.","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"26 ","pages":"10610-10622"},"PeriodicalIF":8.4,"publicationDate":"2024-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141968861","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
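To make the mask-guided mixed loss above more concrete, here is a minimal PyTorch sketch of a content, gradient, and structure loss steered by a binary infrared target mask. The weighting of the three terms, the finite-difference gradient, and the crude structure term are assumptions for illustration only, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def gradient(img):
    # Simple finite-difference gradient magnitude (illustrative choice).
    dx = img[..., :, 1:] - img[..., :, :-1]
    dy = img[..., 1:, :] - img[..., :-1, :]
    return F.pad(dx.abs(), (0, 1, 0, 0)) + F.pad(dy.abs(), (0, 0, 0, 1))

def mixed_fusion_loss(fused, ir, vis, mask, w_content=1.0, w_grad=1.0, w_struct=1.0):
    """Hypothetical mask-guided mixed loss; `mask` is 1 on infrared targets, 0 elsewhere."""
    # Content term: follow IR intensities inside the target mask, VIS elsewhere.
    content = F.l1_loss(mask * fused, mask * ir) + F.l1_loss((1 - mask) * fused, (1 - mask) * vis)
    # Gradient term: keep the stronger edge from either source image.
    grad_target = torch.maximum(gradient(ir), gradient(vis))
    grad = F.l1_loss(gradient(fused), grad_target)
    # Structure term: a crude luminance-level stand-in for an SSIM-style constraint.
    struct = F.mse_loss(fused.mean(dim=(-2, -1)), vis.mean(dim=(-2, -1)))
    return w_content * content + w_grad * grad + w_struct * struct
```

In this reading, the mask pulls the fused image toward IR intensities on thermal targets and toward the visible image elsewhere, which matches the stated goal of prominent infrared targets with clear texture details.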
Learning Semantic Polymorphic Mapping for Text-Based Person Retrieval
IF 8.4, CAS Q1, Computer Science
IEEE Transactions on Multimedia, Pub Date: 2024-06-05, DOI: 10.1109/TMM.2024.3410129
Jiayi Li;Min Jiang;Jun Kong;Xuefeng Tao;Xi Luo
{"title":"Learning Semantic Polymorphic Mapping for Text-Based Person Retrieval","authors":"Jiayi Li;Min Jiang;Jun Kong;Xuefeng Tao;Xi Luo","doi":"10.1109/TMM.2024.3410129","DOIUrl":"10.1109/TMM.2024.3410129","url":null,"abstract":"Text-Based Person Retrieval (TBPR) aims to identify a particular individual within an extensive image gallery using text as the query. The principal challenge inherent in the TBPR task revolves around how to map cross-modal information to a potential common space and learn a generic representation. Previous methods have primarily focused on aligning singular text-image pairs, disregarding the inherent polymorphism within both images and natural language expressions for the same individual. Moreover, these methods have also ignored the impact of semantic polymorphism-based intra-modal data distribution on cross-modal matching. Recent methods employ cross-modal implicit information reconstruction to enhance inter-modal connections. However, the process of information reconstruction remains ambiguous. To address these issues, we propose the Learning Semantic Polymorphic Mapping (LSPM) framework, facilitated by the prowess of pre-trained cross-modal models. Firstly, to learn cross-modal information representations with better robustness, we design the Inter-modal Information Aggregation (Inter-IA) module to achieve cross-modal polymorphic mapping, fortifying the foundation of our information representations. Secondly, to attain a more concentrated intra-modal information representation based on semantic polymorphism, we design Intra-modal Information Aggregation (Intra-IA) module to further constrain the embeddings. Thirdly, to further explore the potential of cross-modal interactions within the model, we design the implicit reasoning module, Masked Information Guided Reconstruction (MIGR), with constraint guidance to elevate overall performance. Extensive experiments on both CUHK-PEDES and ICFG-PEDES datasets show that we achieve state-of-the-art results on Rank-1, mAP and mINP compared to existing methods.","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"26 ","pages":"10678-10691"},"PeriodicalIF":8.4,"publicationDate":"2024-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141968875","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
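The metrics reported above (Rank-1, mAP, mINP) are standard retrieval measures. For orientation, the snippet below sketches Rank-1 and mAP for text-to-image person retrieval over cosine similarities; the variable names and single-gallery protocol are assumptions, not the authors' evaluation code.

```python
import numpy as np

def rank1_and_map(text_emb, image_emb, text_ids, image_ids):
    """text_emb: (Q, D), image_emb: (G, D), both L2-normalized; ids are integer identity labels."""
    sims = text_emb @ image_emb.T                        # (Q, G) cosine similarities
    order = np.argsort(-sims, axis=1)                    # gallery indices sorted by similarity
    rank1_hits, aps = [], []
    for q in range(len(text_ids)):
        relevant = (image_ids[order[q]] == text_ids[q])  # boolean relevance in ranked order
        rank1_hits.append(relevant[0])
        hits = np.cumsum(relevant)
        precision_at_k = hits / (np.arange(len(relevant)) + 1)
        aps.append((precision_at_k * relevant).sum() / max(relevant.sum(), 1))
    return float(np.mean(rank1_hits)), float(np.mean(aps))
```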
Improving Pre-Trained Model-Based Speech Emotion Recognition From a Low-Level Speech Feature Perspective
IF 8.4, CAS Q1, Computer Science
IEEE Transactions on Multimedia, Pub Date: 2024-06-05, DOI: 10.1109/TMM.2024.3410133
Ke Liu;Jiwei Wei;Jie Zou;Peng Wang;Yang Yang;Heng Tao Shen
{"title":"Improving Pre-Trained Model-Based Speech Emotion Recognition From a Low-Level Speech Feature Perspective","authors":"Ke Liu;Jiwei Wei;Jie Zou;Peng Wang;Yang Yang;Heng Tao Shen","doi":"10.1109/TMM.2024.3410133","DOIUrl":"10.1109/TMM.2024.3410133","url":null,"abstract":"Multi-view speech emotion recognition (SER) based on the pre-trained model has gained attention in the last two years, which shows great potential in improving the model performance in speaker-independent scenarios. However, the existing work either relies on various fine-tuning methods or uses excessive feature views with complex fusion strategies, causing the increase of complexity with limited performance benefit. In this paper, we improve multi-view SER based on the pre-trained model from the perspective of a low-level speech feature. Specifically, we forgo fine-tuning the pre-trained model and instead focus on learning effective features hidden in the low-level speech feature mel-scale frequency cepstral coefficient (MFCC). We propose a \u0000<bold>t</b>\u0000wo-\u0000<bold>s</b>\u0000tream \u0000<bold>p</b>\u0000ooling \u0000<bold>c</b>\u0000hannel \u0000<bold>a</b>\u0000ttention (\u0000<bold>TsPCA</b>\u0000) module to discriminatively weight the channel dimensions of the features derived from MFCC. This module enables inter-channel interaction and learning of emotion sequence information across channels. Furthermore, we design a simple but effective feature view fusion strategy to learn robust representations. In the comparison experiments, our method achieves the WA and UA of 73.97%/74.69% and 74.61%/75.66% on the IEMOCAP dataset, 97.21% and 97.11% on the Emo-DB dataset, 77.08% and 77.34% on the RAVDESS dataset, and 74.38% and 71.43% on the SAVEE dataset. Extensive experiments on the four datasets demonstrate that our method consistently surpasses existing methods and achieves a new State-of-the-Art result.","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"26 ","pages":"10623-10636"},"PeriodicalIF":8.4,"publicationDate":"2024-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141968856","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
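The abstract only states that TsPCA weights the channel dimension of MFCC-derived features using two pooling streams. A generic two-stream (average- and max-pooled) channel-attention sketch is shown below as one plausible shape of such a module; it is an assumption, not the published design.

```python
import torch
import torch.nn as nn

class TwoStreamChannelAttention(nn.Module):
    """Hypothetical two-stream pooling channel attention over (B, C, T) MFCC features."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):                                  # x: (B, C, T)
        avg_stream = self.mlp(x.mean(dim=-1))              # average pooling over time
        max_stream = self.mlp(x.max(dim=-1).values)        # max pooling over time
        weights = torch.sigmoid(avg_stream + max_stream).unsqueeze(-1)  # (B, C, 1)
        return x * weights                                 # re-weight channels

# Usage: attn = TwoStreamChannelAttention(64); out = attn(torch.randn(8, 64, 300))
```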
Triple Consistency for the Transparent Cheating Problem in Light Field Depth Estimation
IF 8.4, CAS Q1, Computer Science
IEEE Transactions on Multimedia, Pub Date: 2024-06-05, DOI: 10.1109/TMM.2024.3410139
Zhenglong Cui;Da Yang;Hao Sheng;Sizhe Wang;Rongshan Chen;Ruixuan Cong;Wei Ke
{"title":"Triple Consistency for Transparent Cheating Problem in Light Field Depth Estimation","authors":"Zhenglong Cui;Da Yang;Hao Sheng;Sizhe Wang;Rongshan Chen;Ruixuan Cong;Wei Ke","doi":"10.1109/TMM.2024.3410139","DOIUrl":"10.1109/TMM.2024.3410139","url":null,"abstract":"Depth estimation extracting scenes' structural information is a key step in various light field(LF) applications. However, most existing depth estimation methods are based on the Lambertian assumption, which limits the application in non-Lambertian scenes. In this paper, we discover a unique transparent cheating problem for non-Lambertian scenes which can effectively spoof depth estimation algorithms based on photo consistency. It arises because the spatial consistency and the linear structure superimposed on the epipolar plane image form new spurious lines. Therefore, we propose centrifugal consistency and centripetal consistency for separating the depth information of multi-layer scenes and correcting the error due to the transparent cheating problem, respectively. By comparing the distributional characteristics and the number of minimal values of photo consistency and centrifugal consistency, non-Lambertian regions can be efficiently identified and initial depth estimates obtained. Then centripetal consistency is exploited to reject the projection from different layers and to address transparent cheating. By assigning decreasing weights radiating outward from the central view, pixels with a concentration of colors close to the central viewpoint are considered more significant. The problem of underestimating the depth of background caused by transparent cheating is effectively solved and corrected. Experiments on synthetic and real-world data show that our method can produce high-quality depth estimation under the transparency and the reflectivity of 90% to 20%. The proposed triple-consistency-based algorithm outperforms state-of-the-art LF depth estimation methods in terms of accuracy and robustness.","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"26 ","pages":"10651-10664"},"PeriodicalIF":8.4,"publicationDate":"2024-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141968852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
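Centripetal consistency is described as assigning decreasing weights radiating outward from the central view, so that colors observed near the central viewpoint dominate. The sketch below illustrates one possible center-weighted photo-consistency cost over a view stack already sheared to a depth hypothesis; the Gaussian radial weighting and the variance criterion are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def center_weighted_consistency(view_stack, view_coords, sigma=1.0):
    """view_stack: (V, H, W) pixel samples sheared to one depth hypothesis.
    view_coords: (V, 2) angular positions, with (0, 0) being the central view."""
    radii = np.linalg.norm(view_coords, axis=1)
    weights = np.exp(-(radii ** 2) / (2 * sigma ** 2))           # decay away from the center view
    weights /= weights.sum()
    mean = np.tensordot(weights, view_stack, axes=(0, 0))        # (H, W) weighted mean color
    var = np.tensordot(weights, (view_stack - mean) ** 2, axes=(0, 0))
    return var  # low variance = consistent (likely correct) depth hypothesis
```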
Width-Adaptive CNN: Fast CU Partition Prediction for VVC Screen Content Coding
IF 8.4, CAS Q1, Computer Science
IEEE Transactions on Multimedia, Pub Date: 2024-06-05, DOI: 10.1109/TMM.2024.3410116
Chao Jiao;Huanqiang Zeng;Jing Chen;Chih-Hsien Hsia;Tianlei Wang;Kai-Kuang Ma
{"title":"Width-Adaptive CNN: Fast CU Partition Prediction for VVC Screen Content Coding","authors":"Chao Jiao;Huanqiang Zeng;Jing Chen;Chih-Hsien Hsia;Tianlei Wang;Kai-Kuang Ma","doi":"10.1109/TMM.2024.3410116","DOIUrl":"10.1109/TMM.2024.3410116","url":null,"abstract":"Screen content coding (SCC) in Versatile Video Coding (VVC) improves the coding efficiency of screen content videos (SCVs) significantly but results in high computational complexity due to the quad-tree plus multi-type tree (QTMT) structure of the coding unit (CU) partitioning. Therefore, we make the first attempt to reduce the encoding complexity from the perspective of CU partitioning for SCC in VVC. To this end, a fast CU partition prediction method is technically developed for VVC-SCC. First, to solve the problem of lacking sufficient SCC training data, SCVs are collected to establish a database containing CUs of various sizes and corresponding partition labels. Second, to determine the partition decision in advance, a novel WA-CNN model is proposed, which is capable of predicting two large CUs for VVC-SCC by adjusting the feature channels based on the size of input CU blocks. Finally, considering the imbalanced proportion of diverse partition decisions, a loss function with the weight that equalizes the contribution of imbalanced data is formulated to train the proposed WA-CNN model. Experimental results show that the proposed model reduces the SCC intra-encoding time by 35.65%\u0000<inline-formula><tex-math>${sim }$</tex-math></inline-formula>\u000038.31% with an average of 1.84%\u0000<inline-formula><tex-math>${sim }$</tex-math></inline-formula>\u00002.42% BDBR increase.","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"26 ","pages":"9372-9382"},"PeriodicalIF":8.4,"publicationDate":"2024-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141968853","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
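The paper trains WA-CNN with a loss whose weights equalize the contribution of imbalanced partition decisions. A common way to realize this idea, used here purely as an illustrative stand-in for the published loss, is inverse-frequency class weighting in a cross-entropy loss:

```python
import torch
import torch.nn as nn

def balanced_partition_loss(class_counts):
    """class_counts: training-set counts per CU partition decision.
    Returns a cross-entropy loss whose weights equalize class contributions."""
    counts = torch.tensor(class_counts, dtype=torch.float32)
    weights = counts.sum() / (len(counts) * counts)   # inverse-frequency weights
    return nn.CrossEntropyLoss(weight=weights)

# Usage with hypothetical counts for six QTMT partition modes:
criterion = balanced_partition_loss([52000, 9000, 8000, 7500, 4000, 3500])
```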
CDKM: Common and Distinct Knowledge Mining Network With Content Interaction for Dense Captioning
IF 8.4, CAS Q1, Computer Science
IEEE Transactions on Multimedia, Pub Date: 2024-06-04, DOI: 10.1109/TMM.2024.3407695
Hongyu Deng;Yushan Xie;Qi Wang;Jianjun Wang;Weijian Ruan;Wu Liu;Yong-Jin Liu
{"title":"CDKM: Common and Distinct Knowledge Mining Network With Content Interaction for Dense Captioning","authors":"Hongyu Deng;Yushan Xie;Qi Wang;Jianjun Wang;Weijian Ruan;Wu Liu;Yong-Jin Liu","doi":"10.1109/TMM.2024.3407695","DOIUrl":"10.1109/TMM.2024.3407695","url":null,"abstract":"The dense captioning task aims at detecting multiple salient regions of an image and describing them separately in natural language. Although significant advancements in the field of dense captioning have been made, there are still some limitations to existing methods in recent years. On the one hand, most dense captioning methods lack strong target detection capabilities and struggle to cover all relevant content when dealing with target-intensive images. On the other hand, current transformer-based methods are powerful but neglect the acquisition and utilization of contextual information, hindering the visual understanding of local areas. To address these issues, we propose a common and distinct knowledge-mining network with content interaction for the task of dense captioning. Our network has a knowledge mining mechanism that improves the detection of salient targets by capturing common and distinct knowledge from multi-scale features. We further propose a content interaction module that combines region features into a unique context based on their correlation. Our experiments on various benchmarks have shown that the proposed method outperforms the current state-of-the-art methods.","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"26 ","pages":"10462-10473"},"PeriodicalIF":8.4,"publicationDate":"2024-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141968858","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
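The content interaction module is said to combine region features into a unique context based on their correlation. One minimal reading of that idea is correlation-weighted aggregation over region features, sketched below in PyTorch; this is a guess at the general mechanism, not the published module.

```python
import torch

def correlation_context(region_feats):
    """region_feats: (N, D) features of N detected regions.
    Returns per-region context vectors built from correlated regions."""
    normed = torch.nn.functional.normalize(region_feats, dim=-1)
    corr = normed @ normed.T                   # (N, N) pairwise cosine correlation
    corr.fill_diagonal_(float('-inf'))         # exclude self-correlation before softmax
    attn = torch.softmax(corr, dim=-1)         # weight other regions by their correlation
    return attn @ region_feats                 # (N, D) context vector for each region
```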
Estimating the Semantics via Sector Embedding for Image-Text Retrieval
IF 8.4, CAS Q1, Computer Science
IEEE Transactions on Multimedia, Pub Date: 2024-06-04, DOI: 10.1109/TMM.2024.3407664
Zheng Wang;Zhenwei Gao;Mengqun Han;Yang Yang;Heng Tao Shen
{"title":"Estimating the Semantics via Sector Embedding for Image-Text Retrieval","authors":"Zheng Wang;Zhenwei Gao;Mengqun Han;Yang Yang;Heng Tao Shen","doi":"10.1109/TMM.2024.3407664","DOIUrl":"10.1109/TMM.2024.3407664","url":null,"abstract":"Based on deterministic single-point embedding, most extant image-text retrieval methods only focus on the match of ground truth while suffering from one-to-many correspondence, where besides annotated positives, many similar instances of another modality should be retrieved by a given query. Recent solutions of probabilistic embedding and rectangle mapping still encounter some drawbacks, albeit their promising effectiveness at multiple matches. Meanwhile, the exploration of one-to-many correspondence is still insufficient. Therefore, this paper proposes a novel geometric representation to \u0000<underline>E</u>\u0000stimate the \u0000<underline>S</u>\u0000emantics of heterogeneous data via \u0000<underline>S</u>\u0000ector \u0000<underline>E</u>\u0000mbedding (dubbed \u0000<bold>ESSE</b>\u0000). Specifically, a given image/text can be projected as a sector, where its symmetric axis represents mean semantics and the aperture estimates uncertainty. Further, a sector matching loss is introduced to better handle the multiplicity by considering the sine of included angles as distance calculation, which encourages candidates to be contained by the apertures of a query sector. The experimental results on three widely used benchmarks CUB, Flickr30 K and MS-COCO reveal that sector embedding can achieve competitive performance on multiple matches and also improve the traditional ground-truth matching of the baselines. Additionally, we also verify the generalization to video-text retrieval on two extensively used datasets of MSRVTT and MSVD, and to text-based person retrieval on CUHK-PEDES. This superiority and effectiveness can also demonstrate that the bounded property of the aperture can better estimate semantic uncertainty when compared to prior remedies.","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"26 ","pages":"10342-10353"},"PeriodicalIF":8.4,"publicationDate":"2024-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141968859","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
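Taking the sector description literally, the symmetric axis carries the mean semantics, the aperture carries uncertainty, and the distance uses the sine of included angles so that candidates inside the aperture incur no cost. The snippet below sketches that distance; the clamping and the zero-inside-aperture behaviour are assumptions about the general idea, not the paper's exact loss.

```python
import torch

def sector_distance(axis, aperture, candidate, eps=1e-8):
    """axis: (B, D) unit vectors (sector symmetric axes = mean semantics),
    aperture: (B,) half-angles in radians (estimated uncertainty),
    candidate: (B, D) unit embeddings of the other modality."""
    cos_theta = (axis * candidate).sum(dim=-1).clamp(-1 + eps, 1 - eps)
    theta = torch.acos(cos_theta)                       # included angle with the axis
    outside = torch.clamp(theta - aperture, min=0.0)    # zero if contained in the sector
    return torch.sin(outside)                           # sine-based distance, 0 when matched
```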
Logit Variated Product Quantization Based on Parts Interaction and Metric Learning With Knowledge Distillation for Fine-Grained Image Retrieval
IF 8.4, CAS Q1, Computer Science
IEEE Transactions on Multimedia, Pub Date: 2024-06-03, DOI: 10.1109/TMM.2024.3407661
Lei Ma;Xin Luo;Hanyu Hong;Fanman Meng;Qingbo Wu
{"title":"Logit Variated Product Quantization Based on Parts Interaction and Metric Learning With Knowledge Distillation for Fine-Grained Image Retrieval","authors":"Lei Ma;Xin Luo;Hanyu Hong;Fanman Meng;Qingbo Wu","doi":"10.1109/TMM.2024.3407661","DOIUrl":"10.1109/TMM.2024.3407661","url":null,"abstract":"Image retrieval with fine-grained categories is an extremely challenging task due to the high intraclass variance and low interclass variance. Most previous works have focused on localizing discriminative image regions in isolation, but have rarely exploited correlations across the different discriminative regions to alleviate intraclass differences. In addition, the intraclass compactness of embedding features is ensured by extra regularization terms that only exist during the training phase, which appear to generalize less well in the inference phase. Finally, the information granularity of the distance measure should distinguish subtle visual differences and the correlation between the embedding features and the quantized features should be maximized sufficiently. To address the above issues, we propose a logit variated product quantization method based on part interaction and metric learning with knowledge distillation for fine-grained image retrieval. Specifically, we introduce a causal context module into the deep navigator to generate discriminative regions and utilize a channelwise cross-part fusion transformer to model the part correlations while alleviating intraclass differences. Subsequently, we design a logit variation module based on a weighted sum scheme to further reduce the intraclass variance of the embedding features directly and enhance the learning power of the quantization model. Finally, we propose a novel product quantization loss based on metric learning and knowledge distillation to enhance the correlation between the embedding features and the quantized features and allow the quantization features to learn more knowledge from the embedding features. The experimental results on several fine-grained datasets demonstrate that the proposed method is superior to state-of-the-art fine-grained image retrieval methods.","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"26 ","pages":"10406-10419"},"PeriodicalIF":8.4,"publicationDate":"2024-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141968857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
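Product quantization itself, splitting an embedding into subvectors and snapping each to its own codebook, is standard; a minimal soft-assignment sketch is given below for orientation. The paper's logit variation module and its metric-learning/knowledge-distillation loss are not reproduced here.

```python
import torch

def product_quantize(x, codebooks, temperature=1.0):
    """x: (B, M*Ds) embeddings; codebooks: (M, K, Ds), one codebook per subspace.
    Returns soft-quantized features with the same shape as x."""
    M, K, Ds = codebooks.shape
    subvecs = x.view(x.size(0), M, Ds)                           # (B, M, Ds) subvectors
    dists = torch.cdist(subvecs.transpose(0, 1), codebooks)      # (M, B, K) distances to codewords
    assign = torch.softmax(-dists / temperature, dim=-1)         # soft codeword assignment
    quantized = torch.einsum('mbk,mkd->bmd', assign, codebooks)  # (B, M, Ds) reconstructed subvectors
    return quantized.reshape(x.size(0), -1)
```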
PhotoStyle60: A Photographic Style Dataset for Photo Authorship Attribution and Photographic Style Transfer
IF 8.4, CAS Q1, Computer Science
IEEE Transactions on Multimedia, Pub Date: 2024-06-03, DOI: 10.1109/TMM.2024.3408683
Marco Cotogni;Marco Arazzi;Claudio Cusano
{"title":"PhotoStyle60: A Photographic Style Dataset for Photo Authorship Attribution and Photographic Style Transfer","authors":"Marco Cotogni;Marco Arazzi;Claudio Cusano","doi":"10.1109/TMM.2024.3408683","DOIUrl":"10.1109/TMM.2024.3408683","url":null,"abstract":"Photography, like painting, allows artists to express themselves through their unique style. In digital photography, this is achieved not only with the choice of the subject and the composition but also by means of post-processing operations. The automatic identification of a photographer from the style of a photo is a challenging task, for many reasons, including the lack of suitable datasets including photos taken by a diverse panel of photographers with a clear photographic style. In this paper we present PhotoStyle60, a new dataset including 5708 photographs from 60 professional and semi-professional photographers. Additionally, we selected a reduced version of the dataset, called PhotoStyle10 containing images from 10 clearly distinguishable experts. We designed the dataset to address two tasks in particular: photo authorship attribution and photographic style transfer. In the former, we conducted an extensive analysis of the dataset through several classification experiments. In the latter, we explored the potential of our dataset to transfer a photographer's style to images from the Five-K dataset. Additionally, we propose also a simple but effective multi-image style transfer method that uses multiple samples of the target style. A user study demonstrated that such a method was able to reach accurate results, preserving the semantic content of the source photograph with very few artifacts.","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"26 ","pages":"10573-10584"},"PeriodicalIF":8.4,"publicationDate":"2024-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141968865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
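The multi-image style transfer method is only summarized in the abstract. As a generic illustration of pooling a style over several reference photos, the sketch below averages channel-wise AdaIN statistics across multiple target-style feature maps; it is a stand-in for the general idea, not the paper's method.

```python
import torch

def multi_reference_adain(content_feat, style_feats, eps=1e-5):
    """content_feat: (C, H, W) features of the content photo;
    style_feats: list of (C, Hi, Wi) features from several photos in the target style."""
    mus = torch.stack([s.mean(dim=(1, 2)) for s in style_feats]).mean(dim=0)    # (C,) pooled style means
    sigmas = torch.stack([s.std(dim=(1, 2)) for s in style_feats]).mean(dim=0)  # (C,) pooled style stds
    c_mu = content_feat.mean(dim=(1, 2), keepdim=True)
    c_sigma = content_feat.std(dim=(1, 2), keepdim=True) + eps
    normalized = (content_feat - c_mu) / c_sigma
    return normalized * sigmas.view(-1, 1, 1) + mus.view(-1, 1, 1)
```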
DEER: Distribution Divergence-based Graph Contrast for Partial Label Learning on Graphs
IF 7.3, CAS Q1, Computer Science
IEEE Transactions on Multimedia, Pub Date: 2024-05-31, DOI: 10.1109/tmm.2024.3408038
Yiyang Gu, Zihao Chen, Yifang Qin, Zhengyang Mao, Zhiping Xiao, Wei Ju, Chong Chen, Xian-Sheng Hua, Yifan Wang, Xiao Luo, Ming Zhang
{"title":"DEER: Distribution Divergence-based Graph Contrast for Partial Label Learning on Graphs","authors":"Yiyang Gu, Zihao Chen, Yifang Qin, Zhengyang Mao, Zhiping Xiao, Wei Ju, Chong Chen, Xian-Sheng Hua, Yifan Wang, Xiao Luo, Ming Zhang","doi":"10.1109/tmm.2024.3408038","DOIUrl":"https://doi.org/10.1109/tmm.2024.3408038","url":null,"abstract":"","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"126 1","pages":""},"PeriodicalIF":7.3,"publicationDate":"2024-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141193254","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0