{"title":"A domain adaptive deep learning solution for scanpath prediction of paintings","authors":"M. A. Kerkouri, M. Tliba, A. Chetouani, A. Bruno","doi":"10.1145/3549555.3549597","DOIUrl":"https://doi.org/10.1145/3549555.3549597","url":null,"abstract":"Cultural heritage understanding and preservation is an important issue for society as it represents a fundamental aspect of its identity. Paintings represent a significant part of cultural heritage, and are the subject of study continuously. However, the way viewers perceive paintings is strictly related to the so-called HVS (Human Vision System) behaviour. This paper focuses on the eye-movement analysis of viewers during the visual experience of a certain number of paintings. In further details, we introduce a new approach to predicting human visual attention, which impacts several cognitive functions for humans, including the fundamental understanding of a scene, and then extend it to painting images. The proposed new architecture ingests images and returns scanpaths, a sequence of points featuring a high likelihood of catching viewers’ attention. We use an FCNN (Fully Convolutional Neural Network), in which we exploit a differentiable channel-wise selection and Soft-Argmax modules. We also incorporate learnable Gaussian distributions onto the network bottleneck to simulate visual attention process bias in natural scene images. Furthermore, to reduce the effect of shifts between different domains (i.e. natural images, painting), we urge the model to learn unsupervised general features from other domains using a gradient reversal classifier. The results obtained by our model outperform existing state-of-the-art ones in terms of accuracy and efficiency.","PeriodicalId":191591,"journal":{"name":"Proceedings of the 19th International Conference on Content-based Multimedia Indexing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129397431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Streaming learning with Move-to-Data approach for image classification","authors":"Abel Kahsay Gebreslassie, J. Benois-Pineau, A. Zemmari","doi":"10.1145/3549555.3549590","DOIUrl":"https://doi.org/10.1145/3549555.3549590","url":null,"abstract":"In Deep Neural Network training, the availability of a large amount of representative training data is the sine qua non-condition for a good generalization capacity of the model. In many real-world applications, data is not available at a glance, but coming on the fly. If a pre-trained model is fine-tuned on the new data, then catastrophic forgetting happens mostly. Incremental learning mechanisms propose ways to overcome catastrophic forgetting. Streaming learning is a type of incremental learning where models learn from new data instances as soon as they become available in a single training pass. In this work, we conduct an experimental study, on a large dataset, of an incremental/streaming learning method Move-to-Data we previously proposed, and propose an updated approach by ”re-targeting” with gradient descent which is faster than the popular streaming learning method ExStream. The method achieves better performances and computational efficiency compared to ExStream. Move-to-Data with gradient is on average 3.5 times faster than ExStream and has a similar accuracy, with 0.5% improvement compared to ExStream.","PeriodicalId":191591,"journal":{"name":"Proceedings of the 19th International Conference on Content-based Multimedia Indexing","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124401797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BiasUNet: Learning Change Detection over Sentinel-2 Image Pairs","authors":"Maria Pegia, A. Moumtzidou, Ilias Gialampoukidis, Björn þór Jónsson, S. Vrochidis, Y. Kompatsiaris","doi":"10.1145/3549555.3549574","DOIUrl":"https://doi.org/10.1145/3549555.3549574","url":null,"abstract":"The availability of satellite images has increased due to the fast development of remote sensing technology. As a result several deep learning change detection methods have been developed to capture spatial changes from multi temporal satellite images that are of great importance in remote sensing, monitoring environmental changes and land use. Recently, a supervised deep learning network called FresUNet has been proposed, which performs a pixel-level change detection from image pairs. In this paper, we extend this method by inserting a Bayesian framework that uses Monte Carlo Dropout, motivated by a recent work in image segmentation. The proposed Bayesian FresUNet (BiasUNet) approach is shown to outperform four state-of-the-art deep learning networks on Sentinel-2 ONERA Satellite Change Detection (OSCD) benchmark dataset, both in terms of precision and quality.","PeriodicalId":191591,"journal":{"name":"Proceedings of the 19th International Conference on Content-based Multimedia Indexing","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133211608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A survey for image based methods in construction: from images to digital twins","authors":"I. Koulalis, Nikolaos I. Dourvas, Theocharis Triantafyllidis, K. Ioannidis, S. Vrochidis, Y. Kompatsiaris","doi":"10.1145/3549555.3549594","DOIUrl":"https://doi.org/10.1145/3549555.3549594","url":null,"abstract":"In the construction domain, Digital twins are mostly used for facilities management of buildings, but their applications are still very limited. The virtualization of buildings and bridges in the last 15 years in the form of Building or Bridge Information Models is clearly identified as the starting point for the DTs. The industry has erected a frame with semantically rich 3D reference models that are now heavily enriched with visual sensor data captured on construction sites. This article provides an overview of the research and current practices of computer vision methods in the construction industry and presents typical examples of their applications for 3D reconstruction, safety management and structural monitoring for quality assurance. It then highlights the dominant achievements presented in the literature and concludes with the challenges and research directions applicable to digital twins that need to be addressed and exploited in the future.","PeriodicalId":191591,"journal":{"name":"Proceedings of the 19th International Conference on Content-based Multimedia Indexing","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114220765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Urban Image Geo-Localization Using Open Data on Public Spaces","authors":"Mathias Glistrup, S. Rudinac, Björn þór Jónsson","doi":"10.1145/3549555.3549589","DOIUrl":"https://doi.org/10.1145/3549555.3549589","url":null,"abstract":"In this paper, we study the problem of urban image geo-localization, where the aim is to estimate the real-world location in which an image was taken. Among the previous approaches to this task, we note three distinct categories: one only analyzes metadata; the other only analyzes the image content; and the third combines the two. However, most previous approaches require large annotated collections of images or their metadata. Instead of relying on large collections of images, we propose to use publicly available geographical (GIS) data, which contains information about urban objects in public spaces, as a backbone database to query images against. We argue that images can be effectively represented by the objects they contain, and that the spatial geometry of a scene—i.e., the positioning of these objects relative to each other—can function as a unique identifier for a particular physical location. Our experiments demonstrate the potential of using open GIS data for precise image geolocation estimation and serve as a baseline for future research in multimedia geo-localization.","PeriodicalId":191591,"journal":{"name":"Proceedings of the 19th International Conference on Content-based Multimedia Indexing","volume":"112 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128090222","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving Nearest Neighbor Indexing by Multitask Learning","authors":"Amorntip Prayoonwong, Ke Zeng, Chih-Yi Chiu","doi":"10.1145/3549555.3549579","DOIUrl":"https://doi.org/10.1145/3549555.3549579","url":null,"abstract":"In the task of approximate nearest neighbor search, the conventional lookup-table indexing calculates the distances (or similarities) between the query and codewords, and then re-ranks the data points associated with the nearest (or the most similar) codewords. To address the codeword quantization loss problem exhibited in the conventional method, the probability-based indexing leverages the data distribution among codewords learned by neural networks to locate the nearest neighbor [8]. In this paper, we present a multitasking model to improve the probability-based indexing method. The model is formulated by two objectives of NN distribution probabilities and data retrieval quantity. The NN distribution probabilities are an estimation to determine the possible codewords where the nearest neighbor may be associated. The candidate retrieval quantity specifies the prediction for the least number of codewords to be re-ranked for capturing the nearest neighbor. The proposed model is then trained by minimizing triplet loss, probability loss, and quantity loss. By learning these tasks in parallel, we find the predictions for both data distribution probability and data retrieval quantity are more accurate, so that search accuracy and computation efficiency can be improved together. We experiment on two billion-scale benchmark datasets to evaluate the proposed method and compare with several approximate nearest neighbor search methods, and the results demonstrate the outperformance of the proposed method.","PeriodicalId":191591,"journal":{"name":"Proceedings of the 19th International Conference on Content-based Multimedia Indexing","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121933515","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Relational Database Performance for Multimedia: A Case Study","authors":"Björn þór Jónsson, Aaron Duane, Nikolaj Mertz","doi":"10.1145/3549555.3549558","DOIUrl":"https://doi.org/10.1145/3549555.3549558","url":null,"abstract":"This paper describes the performance optimisation of a state-of-the-art relational database to more efficiently serve data for multimedia visualisations in the ViRMA prototype. We describe the baseline database and queries, along with two major optimisation steps that improve query efficiency, at the cost of slowing down dynamic updates. We evaluate the optimisations with a case study of a lifelog collection of 182K images, showing that the time to produce complex visualisations is reduced by orders of magnitude.","PeriodicalId":191591,"journal":{"name":"Proceedings of the 19th International Conference on Content-based Multimedia Indexing","volume":"85 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120924469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real-time deblurring network for face AR applications","authors":"Juhwan Lee, Jonghan Lee, S. Yoo","doi":"10.1145/3549555.3549577","DOIUrl":"https://doi.org/10.1145/3549555.3549577","url":null,"abstract":"Deblurring is a problem that has been studied for a long time. Extant works have primarily focused on deblurring real-world images. However, face images are different from real-world images. Because face images have fewer textures and weaker edges than real-world images, the deblurring of real-world images focuses on restoring the overall texture of the image; however, restoring the particular face structure (e.g., eyes, nose, and ears) is essential for face images. Recently, a convolutional neural network(CNN)-based deblurring network has been proposed. There are various types of CNN-based deblurring networks. Recently, multiscale architecture has been widely used; however, these types of networks need large amounts of resources. Further, because of the multitude of parameters, it requires a significant amount of time for inference. In this study, we developed a end-to-end network for face image deblurring, wherein novel CNN-based feature attention (FA) blocks are adopted, and a low inference time is achieved. Moreover, discrete Fourier transform (DFT) is employed for high-quality deblurring. FA blocks combine channel attention layer and pixel attention layer for feature extraction. The spectrum obtained using DFT is used as a loss function by comparing the ground truth image with the deblurring image. Experimental results show that the ours network is comparable to other deblurring networks in terms of performance as indicated by the PSNR, SSIM. Moreover we also demonstrated performance improvement by measuring the mean Intersection over Union (mIoU) of the deblurred image using a face-segmentation network.","PeriodicalId":191591,"journal":{"name":"Proceedings of the 19th International Conference on Content-based Multimedia Indexing","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132810197","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Fine Grained Quality Assessment of Video Anomaly Detection","authors":"Jiang Zhou, Kevin McGuinness, Joseph Antony, Noel E O 'connor","doi":"10.1145/3549555.3549569","DOIUrl":"https://doi.org/10.1145/3549555.3549569","url":null,"abstract":"In this paper we propose a new approach to assess the performance of video anomaly detection algorithms. Inspired by the COCO metrics we propose a quartile based quality assessment of video anomaly detection to have a detailed breakdown of algorithm performance. The proposed assessment divides the detection into five categories based on the measurement quartiles of the position, scale and motion magnitude of anomalies. A weighted precision is introduced in the average precision calculation such that the frame-level average precision reported in categories can be compared to each other regardless of the baseline of the precision-recall curve in every category. We evaluated three video anomaly detection approaches, including supervised and unsupervised approaches, on five public datasets using the proposed approach. Our evaluation shows that the anomaly scale introduces performance difference in detection. For both supervised and unsupervised methods evaluated, the detection achieve higher average precision for the large anomalies in scale. Our assessment also shows that the supervised multiple instance learning method is robust to the motion magnitude differences in anomalies, while the unsupervised one-class neural network method performs better than the unsupervised autoencoder reconstruction method when the motion magnitudes are small. Our experiments, however, also show that the positions of the anomalies have impact on the performance of the multiple instance learning method and the one-class neural network method but the impact on the autoencoder-based approach is negligible.","PeriodicalId":191591,"journal":{"name":"Proceedings of the 19th International Conference on Content-based Multimedia Indexing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130688181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Segmenting partially annotated medical images","authors":"Nicolas Martin, J. Chevallet, G. Quénot","doi":"10.1145/3549555.3549570","DOIUrl":"https://doi.org/10.1145/3549555.3549570","url":null,"abstract":"Segmentation of medical images using learning based systems remains a challenge in medical computer vision: training a segmentation model requires medical images exhaustively annotated by experts that are difficult and expensive to obtain. We propose to explore the usage of partially annotated images, i.e., all images are annotated but not all regions of a given class are annotated. In this paper, we propose several approaches and we experiment them on the segmentation of intra-oral images. First, we propose to modify the loss function to consider only the annotated areas, and second to integrate annotation from non-expert, as well as the combination of these methods. The experiments we conducted showed an improvement up to 33% on the segmentation performance. This approach allows to obtain better quality annotation masks than the initial human annotation using only partially annotated areas or non-expert annotations. In the future, these approaches can be extended by combination with active learning methods.","PeriodicalId":191591,"journal":{"name":"Proceedings of the 19th International Conference on Content-based Multimedia Indexing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123820468","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}