2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW): Latest Papers

FLNet: Graph Constrained Floor Layout Generation
2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859350
Abhinav Upadhyay, Alpana Dubey, Veenu Arora, Mani Suma Kuriakose, Shaurya Agarawal
Abstract: In this work, we propose a generative approach, FLNet, to synthesize floor layout plans guided by user constraints. Our approach takes user inputs in the form of a boundary, room types, and spatial relationships, and generates a layout design that satisfies these requirements. We evaluate our approach on the RPLAN floor plan dataset, which consists of 80,000 vector-graphics floor plans of residential buildings designed by professional architects. We perform both qualitative and quantitative analysis along three metrics (layout generation accuracy, realism, and quality) to evaluate the generated layout designs. We compare our approach with existing baselines and outperform them on all three metrics. The layout designs generated by our approach are more realistic and of better quality.
Citations: 1
3D-DSPnet: Product Disassembly Sequence Planning
2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859434
Abhinav Upadhyay, Bharat Ladrecha, Alpana Dubey, Suma Mani Kuriakose, P. Goenka
Abstract: Product disassembly has become an area of active research because it supports sustainable development by enabling effective end-of-life (EOL) strategies such as reuse, re-manufacturing, and recycling. In this work, we propose a new approach, 3D-DSPNet, that uses 3D data from CAD assembly models to generate a feasible disassembly sequence. Our approach uses graph-based learning to process the graph representation of CAD models. Currently available 3D CAD model datasets lack ground-truth disassembly sequences. We propose and curate a new dataset, the 3D-DSP dataset, which includes ground-truth disassembly sequences for 3D product models. We evaluate and analyze the results to explain the efficacy of the proposed method. Our approach significantly outperforms the existing baseline. We also develop an Autodesk Fusion 360 plug-in that generates a disassembly sequence animation, allowing intuitive analysis of the disassembly plan.
Citations: 0
CDTNET: Cross-Domain Transformer Based on Attributes for Person Re-Identification
2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859330
Mengyuan Guan, Suncheng Xiang, Ting Liu, Yuzhuo Fu
Abstract: Unsupervised domain adaptation (UDA) for person re-identification (ReID) aims to fine-tune a model trained on a labelled source-domain dataset to a target-domain dataset, and has advanced rapidly thanks to deep convolutional neural networks (CNNs). However, traditional CNN-based methods mainly focus on learning small discriminative features in local pedestrian regions, which fails to exploit the potential of rich structural patterns and suffers from the loss of detail caused by convolution operators. To tackle this challenge, this work attempts to exploit valuable fine-grained attributes using Transformers. We propose a Cross-Domain Transformer network, CDTnet, to enhance robust feature learning in connection with pedestrian attributes. As far as we are aware, this is among the first attempts to adopt a pure transformer for cross-domain ReID research. Comprehensive experiments on several ReID benchmarks demonstrate that our method achieves performance comparable to the state of the art.
Citations: 1
CPS: Full-Song and Style-Conditioned Music Generation with Linear Transformer
2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859286
Weipeng Wang, Xiaobing Li, Cong Jin, Di Lu, Qingwen Zhou, Tie Yun
Abstract: Many deep music generation algorithms have recently been able to produce good-sounding music, but there have been few studies on controlled generation. In the generation process, the human sense of participation is usually very weak, and it is difficult to integrate one's own musical motivation into the creation. In this study, we introduce CPS (Compound word with style), a model that can take a specified target style and generate a complete musical composition from scratch. We first add genre meta-information to the music representation and distinguish it from other low-level music representations, thus strengthening the influence of the control signal. We model with the linear transformer and use an adaptive strategy with different settings for different types of music tokens to reduce the probability of disharmonic music. Experiments show that, compared to the baseline model, our model performs better on basic music metrics as well as on metrics for evaluating controllability.
Citations: 2
Fire and Gun Detection Based on Sematic Embeddings
2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859303
Yunbin Deng, Ryan Campbell, Piyush Kumar
Abstract: Real-time gun and fire detection from video must be accurate to protect life, property, and the environment. Recent advances in deep machine learning have greatly improved detection accuracy in this domain. In this paper, a semantic embedding-based method is developed for zero-shot gun and fire detection. Using a pre-trained Contrastive Language-Image Pre-Training (CLIP) model, input images and arbitrary texts can be mapped to semantic vectors and their similarity can be computed. By defining object classes using the semantic vector of each class's description, highly accurate object detection can be achieved without training any new model. Evaluation of this method on the public-domain FireNet and IMFDB datasets demonstrates fire and gun detection accuracy of 99.8% and 97.3%, respectively, which significantly outperforms the state-of-the-art FireNet and You Only Look Once (YOLO) algorithms. Semantic embedding enables open-set semantic search in video and simplifies deploying and maintaining object detection applications.
Citations: 1
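The zero-shot step described in the abstract above (map an image and each class description to a semantic vector, then compare them) can be illustrated with a minimal sketch. The three-dimensional vectors below are made-up stand-ins for embeddings, not outputs of a real CLIP model, and the class labels and the helper names `cosine_similarity` and `zero_shot_classify` are hypothetical. The use of cosine similarity is also an assumption; the abstract only says that similarity is computed.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_classify(image_vec, class_vecs):
    """Score the image against every class description and pick the best match."""
    scores = {
        label: cosine_similarity(image_vec, vec)
        for label, vec in class_vecs.items()
    }
    best_label = max(scores, key=scores.get)
    return best_label, scores


# Toy vectors (assumption): in practice these would come from an image encoder
# and a text encoder that share one embedding space.
class_vecs = {
    "fire": [0.9, 0.1, 0.0],
    "gun": [0.1, 0.9, 0.0],
    "neither": [0.0, 0.1, 0.9],
}
image_vec = [0.8, 0.2, 0.1]

label, scores = zero_shot_classify(image_vec, class_vecs)
print(label, scores)
```

Because classes are defined only by their description vectors, adding a new class in this sketch means adding one more entry to `class_vecs`, with no retraining.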
Bottleneck Detection in Crowded Video Scenes Utilizing Lagrangian Motion Analysis Via Density and Arc Length Measures
2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859348
Maik Simon, Erik Bochinski, Markus Küchhold, T. Sikora
Abstract: Bottleneck situations can occur in overcrowded areas such as entrances or narrowed passages and pose a great danger to the life and health of the people involved. The automated detection of such bottlenecks is the first crucial step in mitigating these dangers. In this work, we use the Lagrangian approach from the analysis of dynamic systems to capture the dynamics of motion and analyze profiles of groups of people. The derived features, which reflect long-term motion dynamics, are described by two-dimensional Lagrangian fields. We extend the underlying Lagrangian framework with a novel measure that captures the density of motion, and hence of people, in the context of crowd analysis. Further, we show how this novel density measure can be combined with the established arc length measure to detect bottlenecks in videos. Experimental evaluations show a 5% improvement over the state of the art for spatiotemporal bottleneck detection.
Citations: 1
Decentralized Federated Learning with Enhanced Privacy Preservation
2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859507
Sheng-Po Tseng, Jan-Yue Lin, Wei-Chien Cheng, L. Yeh, Chih-Ya Shen
Abstract: We present a decentralized federated learning (FL) framework based on blockchain. In traditional federated learning, a centralized third-party server must aggregate all the gradients that participants upload, but such a trusted third party may not always exist. We address this issue with a decentralized blockchain and encrypt the neural network model parameters and gradients.
Citations: 1
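For context, the centralized aggregation step that the abstract above contrasts against can be sketched as follows. This is a minimal illustration that assumes plain, unweighted averaging of uploaded gradients; the abstract does not specify the aggregation rule, and the sketch includes neither the blockchain nor the encryption that the paper proposes. The function names `average_gradients` and `apply_update` are hypothetical.

```python
def average_gradients(client_gradients):
    """Element-wise mean of the gradient vectors uploaded by all participants."""
    if not client_gradients:
        raise ValueError("at least one participant gradient is required")
    length = len(client_gradients[0])
    if any(len(g) != length for g in client_gradients):
        raise ValueError("all gradients must have the same length")
    n = len(client_gradients)
    return [sum(g[i] for g in client_gradients) / n for i in range(length)]


def apply_update(weights, avg_grad, lr):
    """One gradient-descent step on the shared model using the averaged gradient."""
    return [w - lr * g for w, g in zip(weights, avg_grad)]


# Three participants, each uploading a two-parameter gradient (toy values).
grads = [[1.0, -2.0], [3.0, 0.0], [2.0, 2.0]]
avg = average_gradients(grads)
new_weights = apply_update([0.5, 0.5], avg, lr=0.25)
print(avg, new_weights)
```

In this centralized form, whoever runs `average_gradients` sees every participant's gradient in the clear, which is the trust assumption the abstract says may not hold.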
Surveillance Video Anomaly Detection with Feature Enhancement and Consistency Frame Prediction
2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859414
Beiji Zou, Min Wang, Lingzi Jiang, Yue Zhang, Shu Liu
Abstract: Surveillance video anomaly detection is a challenging problem because of the diversity of abnormal events. Current prediction-based methods outperform reconstruction-based methods, but they have the following issues: 1) using optical flow to represent motion hinders real-time detection; 2) distinguishing abnormal events only by local relationships leads to ambiguity; 3) semantic information and spatiotemporal constraints are not fully utilized. To address these problems, we propose FECP-Net, a network with feature enhancement and consistency frame prediction for surveillance video anomaly detection. We use the RGB difference between consecutive frames rather than optical flow to achieve real-time detection. Meanwhile, we design a feature enhancement module to enrich the semantic and global context information in the features. In addition, we add a spatiotemporal consistency constraint and a consistency loss to strengthen consistent predictions. Extensive experiments on standard benchmarks demonstrate the effectiveness of our method.
Citations: 0
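The RGB-difference motion cue mentioned in the abstract above can be illustrated with a small sketch. It assumes frames are nested lists of `(R, G, B)` integer tuples and uses the absolute per-channel difference; whether the paper uses signed or absolute differences is not stated, so that choice, along with the helper names `rgb_difference` and `motion_energy`, is an assumption made for illustration.

```python
def rgb_difference(prev_frame, next_frame):
    """Per-pixel, per-channel absolute difference between two same-size frames."""
    return [
        [
            tuple(abs(c1 - c0) for c0, c1 in zip(p0, p1))
            for p0, p1 in zip(row0, row1)
        ]
        for row0, row1 in zip(prev_frame, next_frame)
    ]


def motion_energy(diff):
    """Mean difference over all pixels and channels: a crude 'how much moved' score."""
    total = sum(sum(pixel) for row in diff for pixel in row)
    count = sum(len(row) for row in diff) * 3
    return total / count


# Two 2x2 toy frames that differ in the red channel of a single pixel.
frame_a = [[(10, 10, 10), (10, 10, 10)], [(10, 10, 10), (10, 10, 10)]]
frame_b = [[(10, 10, 10), (40, 10, 10)], [(10, 10, 10), (10, 10, 10)]]

diff = rgb_difference(frame_a, frame_b)
print(diff[0][1], motion_energy(diff))
```

The difference costs one subtraction per channel per pixel, which is why it is cheaper than estimating optical flow, in line with the real-time motivation given in the abstract.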
Multi-Augmentation for Efficient Self-Supervised Visual Representation Learning
2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859465
Van-Nhiem Tran, Chi-En Huang, Shenyao Liu, Kai-Lin Yang, Timothy Ko, Yung-Hui Li
Abstract: In recent years, self-supervised learning has been studied as a way to deal with the limited availability of labeled datasets. Among the major components of self-supervised learning, the data augmentation pipeline is a key factor in the resulting performance. However, most researchers design the augmentation pipeline manually, and a limited collection of transformations may reduce the robustness of the learned feature representation. In this work, we propose Multi-Augmentations for Self-Supervised Representation Learning (MA-SSRL), which searches over various augmentation policies to build the entire pipeline and improve the robustness of the learned feature representation. MA-SSRL successfully learns an invariant feature representation and provides an efficient, effective, and adaptable data augmentation pipeline for self-supervised pre-training on datasets from different distributions and domains. MA-SSRL outperforms previous state-of-the-art methods on transfer and semi-supervised benchmarks while requiring fewer training epochs. Code is available on GitHub.
Citations: 0
GolfPose: Golf Swing Analyses with a Monocular Camera Based Human Pose Estimation
2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2022-07-18 DOI: 10.1109/ICMEW56448.2022.9859415
Zhongyu Jiang, Haorui Ji, Samuel Menaker, Jenq-Neng Hwang
Abstract: With the rapid development of computer vision and deep learning technologies, artificial intelligence plays an increasingly important role in sports analysis. In this paper, to automate golf swing analysis, we propose a lightweight temporal-based 2D human pose estimation (HPE) method, called GolfPose, which achieves better performance than state-of-the-art image-based HPE methods. Unlike traditional image-based methods, our temporal-based method, designed for efficient and effective golf swing analysis, takes advantage of temporal information to improve the estimation accuracy of fast-moving and partially self-occluded keypoints. Furthermore, to ensure that the golf swing analysis can run on mobile devices, we optimize the model architecture to achieve real-time inference. With around 10% of the parameters and half of the GFLOPs of the state-of-the-art HRNet, our proposed GolfPose model achieves 9.16 mean pixel error (MPE) on our golf swing dataset, compared with 9.20 MPE for HRNet. In addition, the proposed temporal-based method, aided by golf club detection (GCD), significantly improves the accuracy of keypoints on the golf club from 13.98 to 9.21 MPE.
Citations: 4
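The mean pixel error (MPE) figures quoted in the abstract above can be made concrete with a short sketch. The abstract does not define MPE, so this assumes the common reading: the Euclidean distance in pixels between each predicted keypoint and its ground-truth location, averaged over all keypoints. The keypoint coordinates and the function name `mean_pixel_error` are illustrative only.

```python
import math


def mean_pixel_error(predicted, ground_truth):
    """Average Euclidean distance (in pixels) between matched 2D keypoints."""
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("need two non-empty keypoint lists of equal length")
    distances = [
        math.hypot(px - gx, py - gy)
        for (px, py), (gx, gy) in zip(predicted, ground_truth)
    ]
    return sum(distances) / len(distances)


# Two toy keypoints (assumption): the first is off by (3, 4) px, the second by (0, -6) px.
pred = [(103.0, 204.0), (50.0, 50.0)]
truth = [(100.0, 200.0), (50.0, 56.0)]

mpe = mean_pixel_error(pred, truth)
print(mpe)
```

Under this definition a lower value is better, which matches how the abstract compares 9.16 MPE against 9.20 MPE.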