{"title":"Detail-Preserving Video-based Virtual Try-On (DPV-VTON)","authors":"Raghav S K, Jahnavi A B, Vivek S D, Kirtan T S, P. Agarwal","doi":"10.1145/3599589.3599599","DOIUrl":"https://doi.org/10.1145/3599589.3599599","url":null,"abstract":"Virtual try-on systems enable trying on a desired clothing item on a target person image. These systems have attracted extensive research and commercial interest. However, existing techniques are image-based systems limited to in-shop target clothing from a pre-defined dataset. To address this, we propose a video-based virtual try-on network, DPV-VTON, that simulates the try-on of target clothing extracted from fashion videos on a target person image while preserving its details and characteristics. The core of the DPV-VTON pipeline consists of (i) a Best Frame Selection (BFS) module that extracts the best frame from the video; (ii) a Clothing Extraction Module (CEM) that extracts the target clothing from the selected frame and generates a binary mask; and (iii) a virtual try-on module that synthesizes the final try-on result. Experiments on existing benchmark datasets and a curated video dataset demonstrate that DPV-VTON generates photo-realistic and visually promising results. The proposed model obtains the lowest FID and LPIPS scores and the highest SSIM score compared to existing systems.","PeriodicalId":123753,"journal":{"name":"Proceedings of the 2023 8th International Conference on Multimedia and Image Processing","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122090255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Study of intracranial haematoma localisation based on improved RetinaNet","authors":"Junyuan Cheng, Kai Gao, Lixiang Zhou","doi":"10.1145/3599589.3599601","DOIUrl":"https://doi.org/10.1145/3599589.3599601","url":null,"abstract":"Intracranial haemorrhage is bleeding within the skull. It is a serious cranio-cerebral disorder recognized for its high mortality, and it usually requires urgent follow-up diagnosis and determination of the location and subtype of haemorrhagic lesions. In this study, we experimented with multiple available deep learning architectures to localise haemorrhagic lesions after intracranial haemorrhage (ICH), in order to improve the probability of successful patient resuscitation. We propose an improved model based on RetinaNet. The accuracy of lesion localisation is not effectively addressed by existing models, owing to the complex structure of lesion locations in intracranial haemorrhage and the large variation in lesion morphology across subtypes. To address these problems, we optimise the original RetinaNet model in terms of its feature extraction network structure, training techniques and anchor settings. Comparison experiments show that the improved model outperforms three object detection models: Faster R-CNN, RetinaNet and YOLOv4.","PeriodicalId":123753,"journal":{"name":"Proceedings of the 2023 8th International Conference on Multimedia and Image Processing","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127121145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design and Implementation of Medical Ultrasound Image Processing System based on MATLAB GUI","authors":"Shiyan Zheng, Haohan Zhang, H. Zhao, Bowei Zhang, Yu Zhou, Chuang Han","doi":"10.1145/3599589.3599600","DOIUrl":"https://doi.org/10.1145/3599589.3599600","url":null,"abstract":"Due to the physical characteristics of ultrasonic imaging, many factors in the imaging process lead to low image quality, including artifacts, noise interference, and unclear edge contours of diseased tissue. This paper designs and implements a medical ultrasound image processing system based on the MATLAB GUI. The system provides image enhancement, image segmentation, image filtering, edge detection, and morphological processing of medical ultrasound images. In tests on breast duct ultrasound images, noise interference is greatly reduced in the processed images compared with the originals. In addition, some typical lesions are clearly highlighted, making the detailed information of the images more visible and the lesion boundaries clearer. The processed images were compared with the originals by subjective evaluation. The evaluations of professional doctors consistently show that the processing method in this paper can greatly improve the readability of medical ultrasound images.","PeriodicalId":123753,"journal":{"name":"Proceedings of the 2023 8th International Conference on Multimedia and Image Processing","volume":"159 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114911100","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Arbitrary Style Transfer with Multiple Self-Attention","authors":"Yuzhu Song, Li Liu, Huaxiang Zhang, Dongmei Liu, Hongzhen Li","doi":"10.1145/3599589.3599605","DOIUrl":"https://doi.org/10.1145/3599589.3599605","url":null,"abstract":"Style transfer aims to transfer the style of a given style image to other images, but most existing methods cannot transfer the texture details of style images well while maintaining the content structure. This paper proposes a novel arbitrary style transfer network that achieves arbitrary style transfer with richer local style details through the cross-attention mechanism of vision transformers. The network uses a pre-trained VGG network to extract content and style features. A self-attention-based content and style enhancement module enhances the content and style feature representations. A transformer-based style cross-attention module learns the relationship between content features and style features to transfer appropriate styles at each position of the content feature map and achieve style transfer with local details. Extensive experiments show that the proposed network generates high-quality stylized images with better visual quality.","PeriodicalId":123753,"journal":{"name":"Proceedings of the 2023 8th International Conference on Multimedia and Image Processing","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127985325","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multimodal Emotion Detection based on Visual and Thermal Source Fusion","authors":"Peixin Tian, Dehu Li, Dong Zhang","doi":"10.1145/3599589.3599590","DOIUrl":"https://doi.org/10.1145/3599589.3599590","url":null,"abstract":"Contactless emotion detection is an interesting research topic today. In this paper, we first study the physiological basis of human emotions to better understand what happens in the body when emotions arise and change. We then introduce the interconnection between the brain trunk vessels and the facial vessels. The investigation reveals that variations in human emotions can be reflected by horizontal facial blood flow, and that this flow can be detected mainly by measuring remote photoplethysmography (rPPG) and grayscale variation on human cheeks. To validate these findings, we set up an emotion-evoking experiment to capture RGB and thermal videos of human subjects, extract horizontal facial blood flows, and finally classify these features into three emotions (i.e., fear, happiness and sadness) by learning. The reported classification accuracy reaches 0.841 on a total of 45 subjects.","PeriodicalId":123753,"journal":{"name":"Proceedings of the 2023 8th International Conference on Multimedia and Image Processing","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126948251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Single-station multi-view global calibration based on the concentric circle 3D target","authors":"Pengfei Sun, Fuqiang Zhou, Haishu Tan","doi":"10.1145/3599589.3599604","DOIUrl":"https://doi.org/10.1145/3599589.3599604","url":null,"abstract":"To address the limitation that traditional tracking 3D scanners can only use a global binocular camera to complete single-station multi-view global calibration, a concentric circle 3D target (CC3DT) is designed in this paper, together with a single-station multi-view global calibration algorithm for the designed CC3DT. The method requires only one camera to perform the global calibration function of a global binocular camera. The designed CC3DT has a simple structure and low cost. The validity and feasibility of the proposed global calibration algorithm are verified by real experiments. The method has wide application prospects and practical research value.","PeriodicalId":123753,"journal":{"name":"Proceedings of the 2023 8th International Conference on Multimedia and Image Processing","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114857733","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recognition and Detection of UAV Based on Transfer Learning","authors":"J. Liu, Feng Zhang, Hao Zhao, Qi De Lu, Bing Feng, Lichang Feng","doi":"10.1145/3599589.3599591","DOIUrl":"https://doi.org/10.1145/3599589.3599591","url":null,"abstract":"As UAVs find increasing application in industry, agriculture, the military and other fields, their potential threats to national security and public security cannot be ignored, and effective UAV detection and/or tracking is becoming an increasingly important security service. This paper integrates deep learning and image processing technology to conduct research in this context, proposing a transfer-learning-based UAV detection model (YOLOv5-UAV). To reduce the influence of the amount of supervised data and the imbalance of target distribution on model performance, the dataset is constructed from self-shot videos and Internet-downloaded videos in different natural scenes, combined with Mosaic data augmentation and adaptive scaling techniques; this also effectively addresses the problem of data security. Furthermore, to verify the validity of the model, real-time tests were carried out during both day and night, across multiple scales, perspectives and natural scenes. The applicability of different detection models is compared and analyzed for small targets, moving backgrounds and weak contrast between the UAV and the background. The results show that the YOLOv5-UAV model performs well in both detection accuracy and detection speed.","PeriodicalId":123753,"journal":{"name":"Proceedings of the 2023 8th International Conference on Multimedia and Image Processing","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117338121","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Simulating Quantum Turing Machine in Augmented Reality","authors":"Wanwan Li","doi":"10.1145/3599589.3599606","DOIUrl":"https://doi.org/10.1145/3599589.3599606","url":null,"abstract":"As quantum computing theory attracts increasing attention from researchers, visualizing the quantum computing process is necessary for fundamental quantum computing education and research. In particular, connecting traditional computational theory with advanced quantum computing concepts is an extremely important step in learning and understanding quantum computing. In this paper, we propose a practical interactive interface for simulating a Quantum Turing Machine (QTM) in Augmented Reality (AR) that combines the traditional Turing machine computational model with quantum computing simulation. Through this interface, users can write a C-like script to represent a QTM and simulate it on an immersive augmented reality platform through the Vuforia AR engine. After validating our proposed QTM AR simulator through a series of experiments, we demonstrate its great potential for quantum computing education through an interactive visualization interface in augmented reality.","PeriodicalId":123753,"journal":{"name":"Proceedings of the 2023 8th International Conference on Multimedia and Image Processing","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126936732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reconstruction of hyperspectral images with compressed sensing based on linear mixing model and affinity propagation clustering algorithm","authors":"Youli zou, Zhi-yun Xiao, Kuntao Ye","doi":"10.1145/3599589.3599602","DOIUrl":"https://doi.org/10.1145/3599589.3599602","url":null,"abstract":"The increasing spatial and spectral resolution of hyperspectral images results in a significant rise in data volume, which poses a challenge for data storage and transmission. Improving storage and transmission efficiency by enhancing the reconstruction performance of hyperspectral images at low or equal sampling rates is therefore a crucial topic in compressed sensing. Previous research has shown that methods combining a linear mixing model with distributed compressed sensing outperform traditional compressed sensing reconstruction algorithms in recovering the original data. However, the low estimation accuracy of the endmember and abundance matrices, caused by the random selection of reference bands, limits reconstruction performance. To address this problem, we propose a compressed sensing reconstruction algorithm based on a linear mixing model and the affinity propagation clustering algorithm. Our method improves reconstruction performance by enhancing the estimation accuracy of the endmember and abundance matrices. During the sampling stage, the affinity propagation clustering algorithm groups the spectral bands according to the spectral correlation of hyperspectral images, with each clustering center serving as a reference band and the other bands as non-reference bands. During the reconstruction stage, the number of endmembers is first estimated from the reference band, and the endmember and abundance matrices are then estimated and used for reconstruction. Experimental results show that the proposed algorithm achieves higher performance in reconstructing hyperspectral images than the linear mixing model-based distributed compressed sensing method.","PeriodicalId":123753,"journal":{"name":"Proceedings of the 2023 8th International Conference on Multimedia and Image Processing","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125431466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast Recognition of Distributed Fiber Optic Vibration Sensing Signal based on Machine Vision in High-speed Railway Security","authors":"Nachuan Yang, Yongjun Zhao, Fuqiang Wang","doi":"10.1145/3599589.3599603","DOIUrl":"https://doi.org/10.1145/3599589.3599603","url":null,"abstract":"Accurate and effective identification of multi-vibration events detected by a phase-sensitive optical time-domain reflectometer (Φ-OTDR) is an effective way to achieve precise alarms. This study proposes a real-time classification method for Φ-OTDR multi-vibration events based on the combination of a convolutional neural network (CNN), a bi-directional long short-term memory network (Bi-LSTM) and connectionist temporal classification (CTC), which can quickly and effectively identify the type and number of vibrations contained in a data image when multiple vibration signals are present, without requiring manual alignment for model training. Noncoherent integration and pulse cancellers are used to process the raw signal and generate spatio-temporal images. The CNN extracts spatial features from the spatio-temporal images, the Bi-LSTM extracts temporal correlation features, and CTC automatically aligns the hybrid features with the labels. A dataset of 8,000 vibration images containing 17,589 abnormal vibration events is collected for model training, validation and testing. Experiments show that the recognition model C3B3 trained with this method achieves 210 FPS and a 99.62% F1 score on the test set. The system achieves real-time classification of multiple vibration targets at the perimeter of high-speed railways and effectively reduces the false alarm rate of the system.","PeriodicalId":123753,"journal":{"name":"Proceedings of the 2023 8th International Conference on Multimedia and Image Processing","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126732946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}