2023 IEEE International Conference on Multimedia and Expo Workshops (ICMEW): Latest Publications

Individual HRTF Prediction Based on Anthropometric Data and Multi-Stage Model
2023 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2023-07-01 DOI: 10.1109/ICMEW59549.2023.00060
Yinliang Qiu, Zhiyu Li, Jing Wang
Abstract: Obtaining an individual head-related transfer function (HRTF) is an important step in rendering binaural immersive audio, as an individual HRTF provides a more realistic experience than a generic one. For more accurate predictions, we propose a multi-stage model that performs individual HRTF prediction based on anthropometric data, combining global and local features across its stages. In the first stage, a light gradient boosting machine (LightGBM) is chosen as the decision-tree model to predict the HRTF from anthropometric data and different angles. In the second stage, a Transformer encoder learns the global relationships between frequency points. Experimental results show that the multi-stage model outperforms a single model: the spectral distortion of its predictions is smaller, illustrating the model's effectiveness.
Citations: 0
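The abstract reports results in terms of spectral distortion, the RMS difference in dB between predicted and measured magnitude responses across frequency points. A minimal sketch of that metric (the function name and sample values are illustrative, not taken from the paper):

```python
import math

def spectral_distortion(h_pred, h_true):
    """RMS difference, in dB, between two magnitude responses
    sampled at the same frequency points."""
    assert len(h_pred) == len(h_true)
    total = 0.0
    for p, t in zip(h_pred, h_true):
        diff_db = 20.0 * math.log10(p / t)  # per-frequency error in dB
        total += diff_db ** 2
    return math.sqrt(total / len(h_pred))

# Identical responses give zero distortion.
print(spectral_distortion([1.0, 0.5, 0.25], [1.0, 0.5, 0.25]))  # 0.0
```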
Blind Quality Assessment of Point Clouds Based on 3D Co-Occurrence Statistics
2023 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2023-07-01 DOI: 10.1109/ICMEW59549.2023.00084
Souheib Riache, M. Larabi, Mohamed Deriche
Abstract: While there has been considerable progress in quality assessment for various types of media, evaluating the quality of point clouds remains a major challenge due to the complexity of the associated applications and the nature of the content. To address this issue, this paper proposes a novel point cloud quality assessment metric based on 3D co-occurrence statistics. The proposed approach involves a voxelization strategy, where the concept of a co-occurrence matrix is extended to 3D to compute the occurrence of a pair of voxels in the 26 possible directions. Selected Haralick features are then computed and concatenated based on the selected color space. A regression step maps the features to the ground truth, represented by the subjective scores associated with the point cloud models. Experimental results show the effectiveness of using 3D co-occurrence statistics for point cloud quality assessment (CO-PCQA): the proposed metric outperforms most recent full-reference and no-reference quality metrics reported in the literature.
Citations: 0
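The core idea, extending a gray-level co-occurrence matrix to 3D over the 26 voxel neighbour directions, can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code; the sparse-dictionary voxel representation is an assumption:

```python
from itertools import product
from collections import Counter

# The 26 neighbour directions of a voxel (every offset except the zero vector).
DIRECTIONS = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]

def cooccurrence(voxels):
    """Count, per direction, how often each pair of voxel values occurs.

    `voxels` maps (x, y, z) -> a quantised value (e.g. a luminance level).
    Returns {direction: Counter{(value_a, value_b): count}}.
    """
    matrices = {d: Counter() for d in DIRECTIONS}
    for (x, y, z), v in voxels.items():
        for dx, dy, dz in DIRECTIONS:
            nb = voxels.get((x + dx, y + dy, z + dz))
            if nb is not None:
                matrices[(dx, dy, dz)][(v, nb)] += 1
    return matrices

grid = {(0, 0, 0): 1, (1, 0, 0): 2}
m = cooccurrence(grid)
print(m[(1, 0, 0)][(1, 2)])  # 1: value 1 has a value-2 neighbour along +x
```

Haralick features (contrast, homogeneity, and so on) would then be computed from each direction's normalized matrix.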
Efficient Low Light Video Enhancement Based on Improved Retinex Algorithms
2023 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2023-07-01 DOI: 10.1109/ICMEW59549.2023.00094
Sung-Ling Lee, Shih-Hsuan Yang
Abstract: Videos shot in low-light environments suffer from low contrast and high noise. In this paper, an improved zero-reference low-light video enhancement technique based on the Retinex model is presented. The proposed method improves existing Retinex approaches in several respects. First, image features extracted by a VGG network are employed as part of the input to the generator of the Retinex parameters, increasing temporal stability. Second, a deformable convolution kernel is used to enhance spatial correlation. Third, the optical flow between frames is approximated as a combination of affine linear transformations to reduce complexity. Compared with state-of-the-art low-light enhancement algorithms, the proposed method achieves more favorable and stable image quality in terms of PSNR and SSIM, with short processing time.
Citations: 0
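The Retinex model underlying this line of work factors an image into reflectance times illumination, so enhancement amounts to estimating the illumination with a smoothing filter and removing it in the log domain. A toy single-scale sketch on a 1-D intensity signal (the box-filter illumination estimate is a simplification, not the paper's learned parameter generator):

```python
import math

def box_blur(signal, radius=1):
    """Crude illumination estimate: a box filter over the signal."""
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

def retinex_enhance(signal, radius=1, eps=1e-6):
    """Single-scale Retinex on a 1-D intensity signal:
    log(reflectance) = log(image) - log(illumination)."""
    illum = box_blur(signal, radius)
    return [math.log(s + eps) - math.log(l + eps)
            for s, l in zip(signal, illum)]

dark = [0.05, 0.06, 0.05, 0.30, 0.05]  # low-light row with one bright pixel
print(retinex_enhance(dark))
```

On a uniformly dark signal the output is flat near zero; the bright pixel stands out as the largest reflectance, which is the contrast-restoring behaviour Retinex methods rely on.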
Leveraging Knowledge Graphs for CheapFakes Detection: Beyond Dataset Evaluation
2023 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2023-07-01 DOI: 10.1109/ICMEW59549.2023.00024
Minh-Son Dao, K. Zettsu
Abstract: The proliferation of the internet and the availability of vast amounts of information have given rise to the critical and pressing issue of fake news. Among its various forms, cheapfakes are particularly effective at deceiving people. Existing research on cheapfakes detection has primarily focused on analyzing the context and correlation between textual and visual information but has largely overlooked the significance of external knowledge. As a result, most previous approaches, apart from the baseline of the ICME'23 Grand Challenge on Detecting Cheapfakes, have relied heavily on evaluating the dataset itself to improve performance. Despite achieving impressive results on public test datasets, these approaches often perform poorly in real-world scenarios because of their overreliance on the given dataset. In this study, we propose a novel approach that utilizes knowledge graphs to address the lack of external knowledge. Unlike previous approaches, our proposal does not directly alter or depend on the public test dataset to enhance performance, which can otherwise result in significant overfitting. Our proposed approach achieved an accuracy of 83.52% on Task 1, surpassing the baseline by 1.7%, and an accuracy of 84% on Task 2, outperforming the best result from the previous challenge by 8%.
Citations: 0
A Multimodal Approach for Evaluating Algal Bloom Severity Using Deep Learning
2023 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2023-07-01 DOI: 10.1109/ICMEW59549.2023.00097
Fei Zhao, Chengcui Zhang, Sheikh Abujar
Abstract: Harmful algal blooms (HABs) can have detrimental impacts on aquatic ecosystems, human health, and the economy. This paper presents a novel multimodal deep learning approach for assessing the severity of HABs, which will help in taking the measures necessary to mitigate their negative impacts. Unlike other state-of-the-art (SOTA) methods, the proposed method leverages three modalities (satellite imagery, elevation, and temperature data) to capture algal information. In particular, it uses an Attention-UNet-based encoder for the satellite and elevation data and a BiLSTM encoder for the temperature data to extract effective feature embeddings from the respective modalities. In addition, we propose a geometric mean-based multimodal focal loss that modulates the loss contribution of each modality as a function of its confidence. Our approach outperforms SOTA unimodal and ensemble methods on the Tick Tick Bloom (TTB) dataset, achieving a region-averaged root mean squared error (RA-RMSE) of 0.8165.
Citations: 0
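The abstract does not spell out the geometric mean-based multimodal focal loss, but one plausible reading is a focal-style modulating factor driven by the geometric mean of the per-modality confidences, so samples that every modality already finds easy are down-weighted. A hedged sketch under that assumption (the function name, the summed cross-entropy form, and the gamma value are all illustrative, not from the paper):

```python
import math

def geo_mean_focal_loss(probs, gamma=2.0):
    """probs: each modality's predicted probability of the true class.
    The geometric mean of the confidences acts as the joint confidence
    that modulates a summed cross-entropy, focal-loss style."""
    g = math.prod(probs) ** (1.0 / len(probs))   # joint confidence in [0, 1]
    ce = -sum(math.log(p) for p in probs)        # summed cross-entropy
    return (1.0 - g) ** gamma * ce

easy = geo_mean_focal_loss([0.95, 0.9, 0.92])
hard = geo_mean_focal_loss([0.3, 0.4, 0.2])
print(easy < hard)  # True: confident samples contribute far less
```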
Multi-Models from Computer Vision to Natural Language Processing for Cheapfakes Detection
2023 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2023-07-01 DOI: 10.1109/ICMEW59549.2023.00023
Thanh-Son Nguyen, Minh-Triet Tran
Abstract: Cheapfakes can compromise the integrity of information and erode trust in multimedia content, making their detection critical. Identifying out-of-context misuse of media is essential to prevent the spread of misinformation and to ensure that news and information are presented accurately and ethically. In this paper, we focus on Task 1 of the Grand Challenge on Detecting Cheapfakes at ICME 2023, which involves detecting whether a triplet consisting of an image and two captions is out of context. We propose a new, robust approach for detecting cheapfakes, which are instances of image reuse with different captions. Our approach leverages multiple models from computer vision and natural language processing, such as named entity recognition, image captioning, and natural language inference. In our experiments, the proposed multi-model method achieves an accuracy of 78.6%, the highest among the candidates on the hidden test set. Overall, our approach demonstrates a promising solution for detecting cheapfakes and safeguarding the integrity of multimedia content. Our source code is publicly available at https://github.com/thanhson28/icme2023.git.
Citations: 0
Analysis of Physical Phenomena in Golf Swing
2023 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2023-07-01 DOI: 10.1109/ICMEW59549.2023.00052
Sheng-Kai Chen, Tzu-Yu Liu, Yan-Di Liu, H. Shih
Abstract: The rapid development of artificial intelligence in recent years has led to its increasing application in sports, while people's interest in watching golf tournaments has also grown. This study aims to detect a golfer's posture, club, and clubhead, and to analyze each stage of the swing. We detect the club position during the swing and draw its trajectory; we then use a double pendulum system and the Lagrangian equation to explain the physical phenomena of the swing phase. This not only improves the training quality of players but also enables golfers to use data-analytic methods to diagnose their swing problems in real time. Swing speed can also be improved when players and coaches take these physical characteristics into account.
Citations: 0
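The double pendulum model mentioned here typically treats the arms as the upper link and the club as the lower link; its Lagrangian equations of motion reduce to the standard double-pendulum angular accelerations. A small sketch integrating them with explicit Euler (the masses, lengths, and release angles are placeholder values, not measured golfer data):

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def accelerations(th1, th2, w1, w2, m1=1.0, m2=0.3, l1=0.6, l2=1.0):
    """Angular accelerations of a double pendulum (upper link = arms,
    lower link = club), from the standard Lagrangian equations of motion."""
    d = th1 - th2
    den = 2 * m1 + m2 - m2 * math.cos(2 * th1 - 2 * th2)
    a1 = (-G * (2 * m1 + m2) * math.sin(th1)
          - m2 * G * math.sin(th1 - 2 * th2)
          - 2 * math.sin(d) * m2 * (w2 ** 2 * l2 + w1 ** 2 * l1 * math.cos(d))
          ) / (l1 * den)
    a2 = (2 * math.sin(d) * (w1 ** 2 * l1 * (m1 + m2)
                             + G * (m1 + m2) * math.cos(th1)
                             + w2 ** 2 * l2 * m2 * math.cos(d))
          ) / (l2 * den)
    return a1, a2

def step(state, dt=1e-3):
    """One explicit-Euler step of the state (th1, th2, w1, w2)."""
    th1, th2, w1, w2 = state
    a1, a2 = accelerations(th1, th2, w1, w2)
    return (th1 + w1 * dt, th2 + w2 * dt, w1 + a1 * dt, w2 + a2 * dt)

# Release from the top of the backswing and let gravity accelerate the club.
s = (2.0, 2.5, 0.0, 0.0)
for _ in range(200):
    s = step(s)
print(s)
```

Fitting the detected club trajectory to such a model is what lets the swing phases be explained in terms of angular velocity transfer from arms to club.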
Differential Melody Generation Based on Time Series Prediction
2023 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2023-07-01 DOI: 10.1109/ICMEW59549.2023.00067
Xiang Xu, Wei Zhong, Yi Zou, Long Ye, Qin Zhang
Abstract: Long-term melody generation may encounter challenges such as inadequate melodic variation, resulting in monotony, or unreasonable melodic variation. In this work, we introduce time series prediction and propose Music-FED, a method for generating more creative and harmonic melodies. The proposed approach adopts the first-order difference to describe relative melodic motion and designs a temporal music representation that makes the model more easily aware of the temporal hierarchy of notes. It then learns the distribution of melodic motion variation with a time series prediction-based model in a non-autoregressive manner. Objective and subjective evaluations demonstrate that Music-FED can, to a certain extent, generate pop-music melodies with high harmony and rich content.
Citations: 0
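The first-order-difference representation in the abstract encodes a melody as intervals between successive notes rather than absolute pitches, which makes the learned motion patterns transposition-invariant. A minimal sketch (the MIDI note numbers are illustrative):

```python
def to_intervals(pitches):
    """First-order difference: relative motion instead of absolute pitch."""
    return [b - a for a, b in zip(pitches, pitches[1:])]

def from_intervals(start, intervals):
    """Reconstruct absolute pitches from a start note and its intervals."""
    melody = [start]
    for step in intervals:
        melody.append(melody[-1] + step)
    return melody

motif = [60, 62, 64, 62, 60]        # MIDI: C-D-E-D-C
moves = to_intervals(motif)          # [2, 2, -2, -2]
print(from_intervals(67, moves))     # same contour transposed to start on G
```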
Spatial-Temporal Consistency Refinement Network for Dynamic Point Cloud Frame Interpolation
2023 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2023-07-01 DOI: 10.1109/ICMEW59549.2023.00080
Lancao Ren, Lili Zhao, Zhuoqun Sun, Zhipeng Zhang, Jianwen Chen
Abstract: Point cloud frame interpolation aims to improve the frame rate of a point cloud sequence by synthesizing intermediate frames between consecutive frames. Most existing works use only the scene flow or features, without fully exploring their local geometric context or temporal correlation, which results in inaccurate local structural details or motion estimation. In this paper, we organically combine scene flows and features to propose a two-stage network based on residual learning that can generate spatially and temporally consistent interpolated frames. In Stage 1, we propose a spatial-temporal warping module that effectively integrates multi-scale local and global spatial features and temporal correlation into a fused feature, which is then transformed into a coarse interpolated frame. In Stage 2, we introduce a residual-learning structure to perform spatial-temporal consistency refinement. A temporal-aware feature aggregation module is proposed, which helps the network adaptively adjust the contributions of spatial features from the input frames and predicts point-wise offsets to compensate for coarse estimation errors. Experimental results demonstrate that our method achieves state-of-the-art performance on most benchmarks under various interpolation modes. Code is available at https://github.com/renlancao/SR-Net.
Citations: 0
Video-Based Point Cloud Compression Using Density-Based Variable Size Hexahedrons
2023 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2023-07-01 DOI: 10.1109/ICMEW59549.2023.00032
Faranak Tohidi, M. Paul
Abstract: Video-based point cloud compression (V-PCC) is the most advanced standard for compressing dynamic point clouds, a recently created media format. V-PCC relies on creating patches and converting the 3D input into 2D frames so that existing video coding can be applied. However, creating patches according to the normals and packing the irregular projected patches produce many unoccupied pixels in the 2D frames, degrading temporal prediction and the efficiency of 2D video coding. In addition, unoccupied pixels increase the size of the 2D frames and, accordingly, the required bitrate. This paper introduces variable-size hexahedron segmentation as an alternative to patch creation, reducing the number of unoccupied pixels in the 2D frames. Furthermore, different areas of a point cloud are treated differently according to their density, so that points are captured more accurately. This paper investigates combinations of different hexahedron sizes, and the experimental results demonstrate that the proposed method, with appropriate hexahedron sizes, outperforms V-PCC.
Citations: 0
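The variable-size hexahedron idea can be illustrated with an octree-style recursion: a cell is subdivided until it holds few enough points, so dense regions end up covered by smaller hexahedrons than sparse ones. This sketch is a simplified stand-in for the paper's segmentation (the thresholds and the cubic-box assumption are illustrative):

```python
def split_by_density(points, box, max_points=2, min_size=0.25):
    """Recursively split an axis-aligned box into 8 children until each
    hexahedron holds at most `max_points` points (or reaches `min_size`),
    so dense regions get smaller cells than sparse ones."""
    (x0, y0, z0), (x1, y1, z1) = box
    inside = [p for p in points
              if x0 <= p[0] < x1 and y0 <= p[1] < y1 and z0 <= p[2] < z1]
    size = x1 - x0
    if len(inside) <= max_points or size <= min_size:
        return [(box, inside)] if inside else []
    mx, my, mz = (x0 + x1) / 2, (y0 + y1) / 2, (z0 + z1) / 2
    cells = []
    for lo_x, hi_x in ((x0, mx), (mx, x1)):
        for lo_y, hi_y in ((y0, my), (my, y1)):
            for lo_z, hi_z in ((z0, mz), (mz, z1)):
                cells += split_by_density(
                    inside, ((lo_x, lo_y, lo_z), (hi_x, hi_y, hi_z)),
                    max_points, min_size)
    return cells

# Three clustered points plus one isolated point: the dense cluster ends up
# in a small cell, the isolated point in a larger one.
pts = [(0.1, 0.1, 0.1), (0.15, 0.1, 0.1), (0.12, 0.14, 0.1), (0.8, 0.8, 0.8)]
cells = split_by_density(pts, ((0.0, 0.0, 0.0), (1.0, 1.0, 1.0)))
print(len(cells))
```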