2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW): Latest Publications

Transformer Based Multimodal Scene Recognition in Soccer Videos
2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859304
Yaozong Gan, Ren Togo, Takahiro Ogawa, M. Haseyama
Abstract: This paper presents a transformer-based multimodal soccer scene recognition method that uses both visual and audio modalities. Our approach directly feeds the original video frames and the audio spectrogram from the soccer video into the transformer model, which captures the spatial information of an action at a given moment as well as the contextual temporal information between different actions in the soccer video. We fuse the video-frame and audio-spectrogram information output by the transformer model to better identify scenes that occur in real soccer matches. The late fusion performs a weighted average of the visual and audio estimation results to obtain complete information about a soccer scene. We evaluate the proposed method on the SoccerNet-V2 dataset and confirm that it achieves the best performance compared with recent state-of-the-art methods.
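The late-fusion step described in the abstract amounts to a weighted average of the per-class scores from the visual and audio branches. A minimal sketch of such a fusion, assuming softmax score vectors and a hypothetical visual weight alpha (the paper's actual weighting is not specified in the abstract):

```python
import numpy as np

def late_fusion(visual_scores: np.ndarray, audio_scores: np.ndarray, alpha: float = 0.6) -> int:
    """Weighted average of per-class scores from the visual and audio branches.

    visual_scores / audio_scores: softmax probabilities over the scene classes.
    alpha: weight given to the visual branch (hypothetical value; the paper's
    actual weights are not given in the abstract).
    """
    fused = alpha * visual_scores + (1.0 - alpha) * audio_scores
    return int(np.argmax(fused))  # index of the predicted scene class

# Example: three scene classes (e.g., goal, card, substitution)
visual = np.array([0.7, 0.2, 0.1])
audio = np.array([0.4, 0.5, 0.1])
print(late_fusion(visual, audio))  # -> 0
```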
Citations: 0
SAL-360IQA: A Saliency Weighted Patch-Based CNN Model for 360-Degree Images Quality Assessment
2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859468
Abderrezzaq Sendjasni, M. Larabi
Abstract: Since the introduction of 360-degree images, a significant number of deep-learning-based image quality assessment (IQA) models have been proposed. Most of them are based on multichannel architectures in which several convolutional neural networks (CNNs) are used together. Despite the competitive results, these models come at a higher cost in terms of complexity. To significantly reduce the complexity and ease the training of the CNN model, this paper proposes a patch-based scheme dedicated to 360-degree IQA. Our framework comprises latitude-based patch selection and extraction to account for the importance of the equatorial region, data normalization, a CNN-based architecture, and a weighted average pooling of the predicted local qualities. We evaluate the proposed model on two widely used databases and show its superiority to state-of-the-art models, even multichannel ones. Furthermore, a cross-database assessment reveals good generalization ability, demonstrating the robustness of the proposed model.
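The final pooling stage can be read as a saliency-weighted average of the per-patch quality predictions. A minimal sketch under that reading, with hypothetical patch scores and saliency weights; the exact weighting used by SAL-360IQA is not given in the abstract:

```python
import numpy as np

def weighted_quality_pooling(patch_scores: np.ndarray, saliency_weights: np.ndarray) -> float:
    """Pool per-patch quality scores into one image-level score.

    patch_scores: CNN-predicted quality for each extracted patch.
    saliency_weights: non-negative weight per patch (e.g., mean saliency of the
    patch, typically higher for equatorial patches); normalised to sum to 1.
    """
    weights = saliency_weights / saliency_weights.sum()
    return float(np.dot(weights, patch_scores))

# Example: five patches, with larger weights near the equator
scores = np.array([3.8, 4.1, 4.5, 4.2, 3.9])
weights = np.array([0.5, 0.8, 1.0, 0.8, 0.5])
print(weighted_quality_pooling(scores, weights))
```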
Citations: 6
3DSTNet: Neural 3D Shape Style Transfer
2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859470
Abhinav Upadhyay, Alpana Dubey, Suma Mani Kuriakose, Devasish Mahato
Abstract: In this work, we propose a 3D style transfer framework, 3DSTNet, to transfer shape or geometric properties from a style 3D object to a content 3D object. We analyze the effects of multiple model hyperparameters on 3D style transfer. To evaluate the proposed framework, we conduct a user study with 3D designers. Our evaluation results demonstrate that our approach effectively generates new designs and that these designs support designers' creativity.
Citations: 1
Smileverse: A VR Experience
2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859417
Yi-Ping Hung, Jerry Chin-Han Goh, Yuan-An Chan, Hsiao-Yuan Chin, You-Shin Tsai, Chien-Hsin Ju
Abstract: SmileVerse is a continuation of our previous work, Smiling Buddha [1], which aims to achieve emotional contagion. In the interactive installation of Smiling Buddha, we designed a natural interactive process to make smiles contagious. Building on recent VR/AI technologies, we upgraded our previous artwork into a virtual universe that uses facial trackers to detect the user's expressions and lets virtual characters respond interactively, so that smiles become contagious.
Citations: 0
Tachiegan: Generative Adversarial Networks for Tachie Style Transfer
2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859510
Zihan Chen, X. Chen
Abstract: Tachie painting is an emerging digital portrait art form that shows a character in a standing pose. Automatic generation of a Tachie picture from a real photo would facilitate many creation tasks. However, it is non-trivial to represent Tachie's artistic styles and establish a delicate mapping from the real-world image domain to the Tachie domain. Existing approaches generally suffer from inaccurate style transformation and severe structure distortion when applied to Tachie style transfer. In this paper, we propose the first approach for Tachie stylization of portrait photographs. Based on the unsupervised CycleGAN framework, we design two novel loss functions to emphasize lines and tones in the Tachie style. Furthermore, we design a character-enhanced stylization framework by introducing an auxiliary body mask to better preserve the global body structure. Experimental results demonstrate the robustness and strong generation capability of our method in Tachie stylization of photos across a wide range of poses, even when trained on a small dataset.
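The auxiliary body mask mentioned above suggests a loss term that weights errors more heavily inside the character region. A hedged PyTorch sketch of such a masked L1 term; the paper's actual loss formulations for lines, tones, and structure are not reproduced here, and the weighting factor is hypothetical:

```python
import torch

def masked_cycle_loss(reconstructed: torch.Tensor,
                      original: torch.Tensor,
                      body_mask: torch.Tensor,
                      body_weight: float = 2.0) -> torch.Tensor:
    """L1 cycle-consistency-style loss with extra weight on the body region.

    reconstructed / original: image batches of shape (N, C, H, W).
    body_mask: (N, 1, H, W) mask, 1 inside the character, 0 for background.
    body_weight: hypothetical factor emphasising the body region.
    """
    per_pixel = (reconstructed - original).abs()
    weights = 1.0 + (body_weight - 1.0) * body_mask  # 1 outside, body_weight inside
    return (weights * per_pixel).mean()

# Example with random tensors
x = torch.rand(2, 3, 64, 64)
y = torch.rand(2, 3, 64, 64)
mask = (torch.rand(2, 1, 64, 64) > 0.5).float()
print(masked_cycle_loss(y, x, mask))
```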
Citations: 0
ICMEW 2022 Cover Page
2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2022-07-18 DOI: 10.1109/icmew56448.2022.9859515
Citations: 0
Efficient Topology Coding and Payload Partitioning Techniques for Neural Network Compression (NNC) Standard
2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859467
Jaakko Laitinen, Alexandre Mercat, Jarno Vanne, H. R. Tavakoli, Francesco Cricri, Emre B. Aksu, M. Hannuksela
Abstract: The Neural Network Compression (NNC) standard aims to define a set of coding tools for efficient compression and transmission of neural networks. This paper addresses the high-level syntax (HLS) of NNC and proposes three HLS techniques for network topology coding and payload partitioning. Our first technique provides an efficient way to code pruning topology information. It removes redundancy in the bitmask and thereby improves coding efficiency by 4–99% over existing approaches. The second technique processes bitmasks in larger chunks instead of one bit at a time. It is shown to reduce the computational complexity of NNC encoding by 63% and NNC decoding by 82%. Our third technique makes use of partial data counters to partition an NNC bitstream into uniformly sized units for more efficient data transmission. Even though the smaller partition sizes introduce some overhead, our network simulations show better throughput due to lower packet retransmission rates. To our knowledge, this is the first work to address these practical implementation aspects of the HLS. The proposed techniques can be seen as key enablers for efficient adaptation and economical deployment of the NNC standard in a wide range of next-generation industrial and academic applications.
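The second technique, processing the pruning bitmask in larger chunks rather than one bit at a time, can be illustrated with a simple byte-wise scan of a packed mask. This is an illustrative sketch, not the NNC reference-software implementation; the packed-byte layout and the popcount table are assumptions:

```python
import numpy as np

def count_pruned_bytewise(packed_mask: bytes) -> int:
    """Count pruned (zero) weights by scanning the bitmask one byte at a time.

    A naive coder would test each of the 8 bits per byte individually; here a
    precomputed 256-entry popcount table handles a whole byte per step.
    """
    popcount = np.array([bin(b).count("1") for b in range(256)], dtype=np.uint8)
    data = np.frombuffer(packed_mask, dtype=np.uint8)
    kept = int(popcount[data].sum())   # bits set to 1 -> weights kept
    return data.size * 8 - kept        # remaining bits -> pruned weights

# Example: pack a small pruning mask (1 = keep, 0 = pruned) and scan it
mask_bits = np.array([1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0], dtype=np.uint8)
packed = np.packbits(mask_bits).tobytes()
print(count_pruned_bytewise(packed))  # -> 8
```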
Citations: 0
OPSE: Online Per-Scene Encoding for Adaptive HTTP Live Streaming
2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859502
V. V. Menon, Hadi Amirpour, Christian Feldmann, M. Ghanbari, C. Timmerer
Abstract: In live streaming applications, a fixed set of bitrate-resolution pairs (known as a bitrate ladder) is typically used for the entire streaming session in order to avoid the additional latency of finding scene transitions and optimized bitrate-resolution pairs for every video content. However, a bitrate ladder optimized per scene may result in (i) decreased storage or delivery costs and/or (ii) increased Quality of Experience (QoE). This paper introduces an Online Per-Scene Encoding (OPSE) scheme for adaptive HTTP live streaming applications. In this scheme, scene transitions and optimized bitrate-resolution pairs for every scene are predicted using Discrete Cosine Transform (DCT)-energy-based low-complexity spatial and temporal features. Experimental results show that, on average, OPSE yields bitrate savings of up to 48.88% in certain scenes while maintaining the same VMAF, compared with the reference HTTP Live Streaming (HLS) bitrate ladder, without any noticeable additional latency in streaming.
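The scene-transition and bitrate-ladder prediction relies on low-complexity DCT-energy features. A rough sketch of a per-frame DCT energy (a spatial-complexity proxy) and a frame-difference variant (a temporal proxy), assuming 8-bit luma frames; the exact feature definitions used by OPSE are not reproduced here:

```python
import numpy as np
from scipy.fft import dctn

def dct_energy(luma_frame: np.ndarray) -> float:
    """Spatial complexity proxy: energy of the AC coefficients of the 2-D DCT."""
    coeffs = dctn(luma_frame.astype(np.float64), norm="ortho")
    coeffs[0, 0] = 0.0  # drop the DC term
    return float(np.sum(np.abs(coeffs))) / luma_frame.size

def temporal_energy(prev_frame: np.ndarray, cur_frame: np.ndarray) -> float:
    """Temporal complexity proxy: DCT energy of the frame difference."""
    return dct_energy(cur_frame.astype(np.int16) - prev_frame.astype(np.int16))

# Example with two random 8-bit luma frames
rng = np.random.default_rng(0)
f0 = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
f1 = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
print(dct_energy(f0), temporal_energy(f0, f1))
```

A sharp jump in either quantity between consecutive frames is one plausible cue for a scene transition under this reading.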
Citations: 1
No-Reference Light Field Image Quality Assessment Method Based on a Long-Short Term Memory Neural Network
2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859419
Sana Alamgeer, Mylène C. Q. Farias
Abstract: Light Field (LF) cameras capture angular and spatial information and, consequently, require a large amount of memory and bandwidth. To reduce these requirements, LF contents generally undergo compression and transmission protocols. Since these techniques may introduce distortions, the design of Light Field Image Quality Assessment (LFI-IQA) methods is important for monitoring the quality of LFI content at the user side. In this work, we present a No-Reference (NR) LFI-IQA method based on a Long Short-Term Memory based Deep Neural Network (LSTM-DNN). The method is composed of two streams. The first stream extracts long-term-dependent, distortion-related features from horizontal epipolar plane images, while the second stream processes bottleneck features of micro-lens images. The outputs of both streams are fused and supplied to a regression operation that generates a scalar value as the predicted quality score. Results show that the proposed method is robust and accurate, outperforming several state-of-the-art LF-IQA methods.
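The two-stream design (an LSTM over epipolar-plane-image features, fused with micro-lens bottleneck features and passed to a regressor) can be sketched in PyTorch as follows. Layer sizes and feature dimensions are illustrative assumptions, not the paper's actual architecture:

```python
import torch
import torch.nn as nn

class TwoStreamLFIQA(nn.Module):
    """Illustrative two-stream fusion: LSTM over EPI features + micro-lens features."""

    def __init__(self, epi_feat_dim=256, ml_feat_dim=512, hidden=128):
        super().__init__()
        self.lstm = nn.LSTM(epi_feat_dim, hidden, batch_first=True)  # stream 1: EPI sequence
        self.ml_fc = nn.Linear(ml_feat_dim, hidden)                  # stream 2: micro-lens features
        self.regressor = nn.Sequential(
            nn.Linear(2 * hidden, 64), nn.ReLU(), nn.Linear(64, 1)   # fused -> quality score
        )

    def forward(self, epi_seq, ml_feat):
        _, (h_n, _) = self.lstm(epi_seq)          # last hidden state summarises the EPI stream
        fused = torch.cat([h_n[-1], torch.relu(self.ml_fc(ml_feat))], dim=1)
        return self.regressor(fused).squeeze(1)   # scalar predicted quality per sample

# Example: batch of 4 samples, 20-step EPI feature sequences
model = TwoStreamLFIQA()
epi = torch.randn(4, 20, 256)
ml = torch.randn(4, 512)
print(model(epi, ml).shape)  # torch.Size([4])
```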
Citations: 0
PerSong: Multi-Modality Driven Music Recommendation System
2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859488
Haonan Cheng, Xiaoying Huang, Ruyu Zhang, Long Ye
Abstract: In this work, we develop PerSong, a music recommendation system that recommends personalised songs based on the user's current status. First, multi-modal physiological signals, namely visual and heart-rate signals, are collected and combined to construct multi-level temporal sequences. Then, we propose a Global-Local Similarity Function (GLSF)-based music recommendation algorithm to establish a mapping between the user's current state and the music. Our demonstration has been presented at a number of exhibitions and has shown remarkable performance under diverse circumstances. The core of our work is publicly available at https://github.com/yrz7991/GLSF/tree/master.
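The Global-Local Similarity Function (GLSF) is described only at a high level here; one plausible reading is a weighted combination of a sequence-level (global) similarity and a windowed (local) similarity between the user-state feature sequence and a candidate song's feature sequence. The sketch below follows that assumption, with hypothetical window size and trade-off weight; the released code at the GitHub link above is the authoritative reference:

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def global_local_similarity(user_seq: np.ndarray, song_seq: np.ndarray,
                            window: int = 4, beta: float = 0.5) -> float:
    """Combine a global and a local similarity between two embedding sequences.

    user_seq / song_seq: (T, D) sequences of feature vectors.
    Global term: cosine similarity of the sequence means.
    Local term: mean cosine similarity over aligned windows.
    beta: hypothetical trade-off between the two terms.
    """
    global_sim = cosine(user_seq.mean(axis=0), song_seq.mean(axis=0))
    local_sims = [
        cosine(user_seq[t:t + window].mean(axis=0), song_seq[t:t + window].mean(axis=0))
        for t in range(0, min(len(user_seq), len(song_seq)) - window + 1, window)
    ]
    local_sim = float(np.mean(local_sims)) if local_sims else global_sim
    return beta * global_sim + (1.0 - beta) * local_sim

# Example: 12 time steps of 16-dimensional multimodal features
rng = np.random.default_rng(1)
user = rng.normal(size=(12, 16))
song = rng.normal(size=(12, 16))
print(global_local_similarity(user, song))
```

Candidate songs could then be ranked by this score, with the highest-scoring song recommended for the user's current state.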
Citations: 0