Proceedings of the 19th International Conference on Content-based Multimedia Indexing: Latest Publications

Self-Supervised Spiking Neural Networks applied to Digit Classification
Benjamin Chamand, P. Joly
DOI: 10.1145/3549555.3549559
Abstract: The self-supervised learning (SSL) paradigm is a rapidly growing research area with promising results, especially in the field of image processing. For these models to converge towards discriminative representations, data augmentation is applied to the input data that feeds a two-branch network. Spiking Neural Networks (SNNs), in turn, are attracting a growing community thanks to their ability to process temporal information, their low energy consumption, and their high biological plausibility. Exploiting the stochasticity of a Poisson process to encode the same data into different temporal representations, and the success of surrogate gradients for training, we propose a self-supervised learning method applied to an SNN and present a preliminary study of the generated representations. We demonstrate its feasibility by training our architecture on a dataset of digit images (MNIST) and evaluating the representations with two classification methods.
Published: 2022-09-14
Citations: 0
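As a rough illustration of the Poisson rate coding mentioned in the abstract, the sketch below encodes a normalised image into a binary spike train. The number of time steps, the maximum firing probability, and the function name are illustrative assumptions, not values or code from the paper.

```python
# Minimal sketch of Poisson rate coding, as commonly used to feed images to SNNs.
import numpy as np

def poisson_encode(image: np.ndarray, num_steps: int = 100, max_rate: float = 0.8) -> np.ndarray:
    """Encode a [0, 1]-normalised image into a spike train of shape (num_steps, *image.shape).

    Each pixel fires independently at every time step with probability proportional
    to its intensity, so two encodings of the same image differ slightly, which can
    serve as a natural augmentation for the two branches of a self-supervised pair.
    """
    rates = np.clip(image, 0.0, 1.0) * max_rate        # per-step firing probability
    rand = np.random.rand(num_steps, *image.shape)     # independent draws per step
    return (rand < rates).astype(np.float32)           # 1 = spike, 0 = silence

# Two stochastic "views" of the same digit, usable as an SSL input pair.
digit = np.random.rand(28, 28)     # stand-in for a normalised MNIST image
view_a = poisson_encode(digit)
view_b = poisson_encode(digit)
print(view_a.shape, view_a.mean(), view_b.mean())
```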
Ecological Impact Assessment Framework for areas affected by Natural Disasters
A. Setyanto, Kusrini Kusrini, G. B. Adninda, Renindya Kartikakirana, Rhisa Aidilla Suprapto, A. Laksito, I. M. D. Agastya, K. Chandramouli, A. Majlingová, Yvonne Brodrechtová, K. Demestichas, E. Izquierdo
DOI: 10.1145/3549555.3549596
Abstract: Forest biodiversity consists of the relations between trees, animals, the environment, and surrounding communities. Their existence requires a certain balance, both in number and in composition. The diversity of these elements creates a chain that connects all living things. These mutual relationships are sometimes disturbed by pressures, whether man-made or natural. As a consequence, biodiversity loses its balance and becomes vulnerable to disaster. The damage that forest fires cause to every living thing in the forest has become a major issue in forest management. In some instances, the balance of forest biodiversity builds an ecological resilience that is essential for the forest when combating disturbance. This paper reviews biodiversity elements and the extent to which their relationships support ecological resilience, drawing on 58 studies related to biodiversity balance and ecological resilience. The review finds evidence that biodiversity components are connected and support each other; however, not every relation contributes to ecological resilience. As a result, we identify several biodiversity elements that may be useful in supporting ecological resilience: tree, environment, animal, and community. We also provide two example cases that derive values for some biodiversity elements using a deep learning method.
Published: 2022-09-14
Citations: 2
An Exploration into the Benefits of the CLIP model for Lifelog Retrieval
Ly-Duyen Tran, Naushad Alam, Yvette Graham, L. K. Vo, N. T. Diep, Binh T. Nguyen, Liting Zhou, C. Gurrin
DOI: 10.1145/3549555.3549593
Abstract: In this paper, we fine-tune the CLIP (Contrastive Language-Image Pre-Training) model on the Lifelog Question Answering dataset (LLQA) to investigate the retrieval performance of the fine-tuned model over the zero-shot baseline. We train the model using a weight-space ensembling approach with a modified loss function to account for the differences between our dataset (LLQA) and the dataset CLIP was originally pretrained on. We further evaluate our fine-tuned model with visual as well as multimodal queries on multiple retrieval tasks, demonstrating improved performance over the zero-shot baseline model.
Published: 2022-09-14
Citations: 2
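The weight-space ensembling the authors adopt can be illustrated with a minimal sketch that linearly interpolates the parameters of a zero-shot and a fine-tuned model. The mixing coefficient and the helper name are assumptions for illustration, not the paper's exact procedure.

```python
# Minimal sketch of weight-space ensembling between a zero-shot and a fine-tuned model:
# the final weights are a linear interpolation of the two state dicts.
import torch

def weight_space_ensemble(zero_shot_state: dict, fine_tuned_state: dict, alpha: float = 0.5) -> dict:
    """Return a state dict where every tensor is (1 - alpha) * zero-shot + alpha * fine-tuned."""
    ensembled = {}
    for name, zs_param in zero_shot_state.items():
        ft_param = fine_tuned_state[name]
        ensembled[name] = (1.0 - alpha) * zs_param + alpha * ft_param
    return ensembled

# Usage with any pair of identically shaped models (e.g. CLIP before and after fine-tuning):
# mixed_state = weight_space_ensemble(zero_shot_model.state_dict(), fine_tuned_model.state_dict())
# model.load_state_dict(mixed_state)
```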
The Potential of Webcam Based Real Time Eye-Tracking to Reduce Rendering Cost
Isabel Kütemeyer, M. Lux
DOI: 10.1145/3549555.3549595
Abstract: Performance optimisation continues to be a relevant topic in both hardware and software development, with video games producing fully rendered images every 16 or 34 ms, depending on the desired framerate. Human observers close their eyes for about 300 ms an average of twelve times per minute, which means many frames are never observed. This paper examines whether rendering time can be reduced by detecting and skipping these unobserved frames. Blinks were identified at runtime by computing the eye aspect ratio of the observer from low-quality web camera footage. A prototype using this method was tested on a small group of subjects to determine whether footage watched this way was perceived as distracting or of lesser quality than unaltered images. Results from a questionnaire suggest that the altered footage did not affect the subjects' opinions, with no participant reporting any visual disturbance. Because this test used video footage, skipping frames was substituted by rendering at a lower resolution. Altered frames were rendered an average of five percent faster than their unaltered counterparts.
Published: 2022-09-14
Citations: 0
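The blink detection described here relies on the eye aspect ratio (EAR) computed from six eye landmarks. Below is a minimal sketch of that computation and of a frame-skipping decision; the threshold and function names are chosen purely for illustration rather than taken from the paper.

```python
# Minimal sketch of the eye aspect ratio (EAR) used to detect blinks from facial landmarks.
import numpy as np

def eye_aspect_ratio(eye: np.ndarray) -> float:
    """eye: array of shape (6, 2) with the standard six eye landmarks p1..p6,
    ordered around the eye. EAR drops towards zero when the eye closes."""
    vertical_1 = np.linalg.norm(eye[1] - eye[5])
    vertical_2 = np.linalg.norm(eye[2] - eye[4])
    horizontal = np.linalg.norm(eye[0] - eye[3])
    return (vertical_1 + vertical_2) / (2.0 * horizontal)

def should_skip_frame(left_eye: np.ndarray, right_eye: np.ndarray, threshold: float = 0.2) -> bool:
    """Skip (or render at lower quality) frames produced while both eyes appear closed."""
    ear = (eye_aspect_ratio(left_eye) + eye_aspect_ratio(right_eye)) / 2.0
    return ear < threshold
```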
Few-shot Object Detection as a Semi-supervised Learning Problem
W. Bailer, Hannes Fassold
DOI: 10.1145/3549555.3549599
Abstract: This paper addresses few-shot learning settings in which different classes are annotated on different datasets. Each part of the data has exhaustive annotations for only one class, or a small set of classes, but not for the other classes used in training. It is likely that unannotated samples of a class exist, potentially impacting the gradient as negative samples. For this reason, we argue that few-shot learning is essentially a semi-supervised learning problem and analyse how approaches from semi-supervised learning can be applied. In particular, we study the use of soft-sampling, which weights the gradient based on the overlap between detections and ground truth, and the creation of missing annotations with a preliminary detector. Soft-sampling provides small but consistent improvements at much lower computational effort than predicting additional annotations.
Published: 2022-09-14
Citations: 0
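A minimal sketch of the kind of soft-sampling the paper studies is shown below: background proposals that overlap annotated boxes keep their full loss weight, while isolated negatives, which may be unannotated objects of a class labelled elsewhere, are down-weighted. The exact weighting function, the floor value, and the helper names are illustrative assumptions.

```python
# Minimal sketch of soft-sampling loss weights for background (negative) proposals.
import torch

def pairwise_iou(boxes_a: torch.Tensor, boxes_b: torch.Tensor) -> torch.Tensor:
    """Pairwise IoU between (N, 4) and (M, 4) boxes in (x1, y1, x2, y2) format."""
    area_a = (boxes_a[:, 2] - boxes_a[:, 0]) * (boxes_a[:, 3] - boxes_a[:, 1])
    area_b = (boxes_b[:, 2] - boxes_b[:, 0]) * (boxes_b[:, 3] - boxes_b[:, 1])
    lt = torch.max(boxes_a[:, None, :2], boxes_b[None, :, :2])   # top-left of intersection
    rb = torch.min(boxes_a[:, None, 2:], boxes_b[None, :, 2:])   # bottom-right of intersection
    wh = (rb - lt).clamp(min=0)
    inter = wh[..., 0] * wh[..., 1]
    return inter / (area_a[:, None] + area_b[None, :] - inter + 1e-6)

def soft_sampling_weights(negatives: torch.Tensor, annotated: torch.Tensor, floor: float = 0.25) -> torch.Tensor:
    """Per-negative loss weights: negatives overlapping annotated boxes keep weight near 1,
    isolated negatives (possible unannotated objects) get only the floor weight."""
    if annotated.numel() == 0:
        return torch.full((negatives.shape[0],), floor)
    max_iou = pairwise_iou(negatives, annotated).max(dim=1).values
    return floor + (1.0 - floor) * max_iou

# The per-proposal classification loss would then be multiplied by these weights
# before averaging, reducing the gradient contribution of uncertain negatives.
```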
Sentiment analysis on 2D images of urban and indoor spaces using deep learning architectures
Konstantinos Chatzistavros, Theodora Pistola, S. Diplaris, K. Ioannidis, S. Vrochidis, Y. Kompatsiaris
DOI: 10.1145/3549555.3549575
Abstract: This paper focuses on determining the sentiments evoked in people when observing outdoor and indoor spaces, with the aim of creating a tool that designers and architects can use for sophisticated designs. Since sentiment is subjective, the design process can be facilitated by an ancillary automated tool for sentiment extraction. We also introduce a dataset containing both real and virtual images of vacant architectural spaces, from which SUN attributes are extracted and included throughout training. The dataset is annotated for both valence and arousal, and five established and two custom architectures, one of which has not previously been used to classify abstract concepts, are evaluated on the collected data.
Published: 2022-09-14
Citations: 1
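One plausible way to include SUN attributes during training, as the abstract describes, is to concatenate them with pooled CNN features before separate valence and arousal heads. The sketch below is an assumed architecture with illustrative backbone, layer sizes, and class counts; it is not the authors' network.

```python
# Assumed sketch: fusing image features with a SUN scene-attribute vector for
# valence / arousal prediction. Backbone and dimensions are illustrative only.
import torch
import torch.nn as nn
from torchvision.models import resnet18

class SentimentNet(nn.Module):
    def __init__(self, num_sun_attributes: int = 102, num_classes: int = 3):
        super().__init__()
        backbone = resnet18(weights=None)
        backbone.fc = nn.Identity()                  # reuse the 512-d pooled features
        self.backbone = backbone
        self.fuse = nn.Sequential(
            nn.Linear(512 + num_sun_attributes, 256),
            nn.ReLU(),
        )
        self.valence = nn.Linear(256, num_classes)   # one head per affective dimension
        self.arousal = nn.Linear(256, num_classes)

    def forward(self, image: torch.Tensor, sun_attributes: torch.Tensor):
        feats = self.backbone(image)                               # (B, 512)
        fused = self.fuse(torch.cat([feats, sun_attributes], dim=1))
        return self.valence(fused), self.arousal(fused)

# model = SentimentNet()
# valence_logits, arousal_logits = model(torch.randn(2, 3, 224, 224), torch.rand(2, 102))
```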
Towards Human Performance on Sketch-Based Image Retrieval
Omar Seddati, S. Dupont, S. Mahmoudi, T. Dutoit
DOI: 10.1145/3549555.3549582
Abstract: Sketch-based image retrieval (SBIR) solutions are attracting increased interest in the field of computer vision. They provide an intuitive and powerful tool for retrieving images from large-scale image databases. In this paper, we conduct a comprehensive study of classic triplet CNN training pipelines within the SBIR context. We study the impact of embedding normalisation, model sharing, margin selection, batch size, hard-mining selection, and the evolution of the number of hard triplets during training, and we propose several avenues for improvement. We also propose the dropout column, an adaptation of dropout for triplet networks and similar pipelines. In addition, we introduce a novel approach to building state-of-the-art SBIR solutions that can run on low-power systems. The whole study is conducted on the Sketchy Database, a large-scale SBIR database. A series of experiments shows that a few simple modifications significantly enhance existing SBIR pipelines (faster training and higher accuracy). Our study enables us to propose an enhanced pipeline that outperforms the previous state of the art on the Sketchy Database by a significant margin (a recall of 53.92% compared to 46.2% at k = 1) and comes close to human performance (54.27%) on a large-scale benchmark.
Published: 2022-09-14
Citations: 1
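The pipeline components studied here (embedding normalisation, margin selection, hard mining) can be illustrated with a minimal triplet objective using in-batch hard-negative mining. The margin and mining scheme below are illustrative choices, not the paper's settings.

```python
# Minimal sketch of a triplet objective with L2-normalised embeddings and
# in-batch hard-negative mining for sketch-photo retrieval.
import torch
import torch.nn.functional as F

def hard_triplet_loss(sketch_emb: torch.Tensor, photo_emb: torch.Tensor, margin: float = 0.2) -> torch.Tensor:
    """sketch_emb, photo_emb: (B, D) embeddings where row i of each is a matching pair."""
    s = F.normalize(sketch_emb, dim=1)      # embedding normalisation
    p = F.normalize(photo_emb, dim=1)
    dist = torch.cdist(s, p)                # (B, B) pairwise Euclidean distances
    pos = dist.diag()                       # distance to the matching photo
    # Mask the positives, then take the hardest (closest) non-matching photo per sketch.
    neg = (dist + torch.eye(dist.size(0), device=dist.device) * 1e6).min(dim=1).values
    return F.relu(pos - neg + margin).mean()

# loss = hard_triplet_loss(sketch_encoder(sketches), photo_encoder(photos))
```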
Chest Diseases Classification Using CXR and Deep Ensemble Learning
Adnane Ait Nasser, M. Akhloufi
DOI: 10.1145/3549555.3549581
Abstract: Chest diseases are among the most common health problems worldwide; they are potentially life-threatening disorders that can affect organs such as the lungs and heart. Radiologists typically use visual inspection to diagnose chest X-ray (CXR) diseases, a difficult task prone to errors. The signs of chest abnormalities appear as opacities around the affected organ, making it difficult to distinguish between diseases of superimposed organs. To this end, we propose a first method for CXR organ disease detection using deep learning. We use an ensemble learning (EL) approach to increase the efficiency of classifying CXR diseases by organ (lung and heart) on a consolidated dataset of 26,316 CXR images drawn from the VinDr-CXR and CheXpert datasets. The proposed ensemble of deep convolutional neural networks (DCNN) achieves excellent performance with an AUC of 0.9489 for multi-class classification, outperforming many state-of-the-art models.
Published: 2022-09-14
Citations: 4
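A straightforward way to realise a deep ensemble like the one described is to average the softmax probabilities of several independently trained CNNs. The sketch below assumes this combination rule for illustration, without implying it is the paper's exact scheme.

```python
# Minimal sketch of prediction-level ensembling over several CNN classifiers.
import torch
import torch.nn as nn

@torch.no_grad()
def ensemble_predict(models: list[nn.Module], images: torch.Tensor) -> torch.Tensor:
    """Average the softmax probabilities of all members; returns (B, num_classes)."""
    probs = [model(images).softmax(dim=1) for model in models]
    return torch.stack(probs).mean(dim=0)

# class_probs = ensemble_predict([densenet, resnet, efficientnet], cxr_batch)
# predicted_class = class_probs.argmax(dim=1)
```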
A large-scale TV video and metadata database for French political content analysis and fact-checking
Frédéric Rayar, Mathieu Delalandre, Van-Hao Le
DOI: 10.1145/3549555.3549557
Abstract: In this paper, we introduce a large-scale, publicly available multimodal dataset for French political content analysis and fact-checking. The dataset consists of more than 1,200 fact-checked claims scraped from a fact-checking service, together with associated metadata. For the video counterpart, the dataset contains nearly 6,730 TV programs, with a total duration of 6,540 hours, and their metadata. These programs were collected during the 2022 French presidential election with a dedicated workstation.
Published: 2022-09-14
Citations: 1
StyleGAN-based CLIP-guided Image Shape Manipulation
Yuchen Qian, Kohei Yamamoto, Keiji Yanai
DOI: 10.1145/3549555.3549556
Abstract: In this paper, we propose a text-guided image manipulation method that focuses on editing a shape attribute using a text description. We combine an image generation model, StyleGAN2, with an image-text matching model, CLIP, and achieve image shape attribute manipulation by modifying the parameters of the pretrained StyleGAN2 generator. Qualitative and quantitative evaluations demonstrate the effectiveness of the proposed method.
Published: 2022-09-14
Citations: 0
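A rough sketch of CLIP-guided generator tuning is given below: the generator's parameters are optimised so that its outputs move closer to a text prompt in CLIP space. The `generator` object is a hypothetical pretrained StyleGAN2 generator, and the prompt, learning rate, and step count are assumptions; CLIP's usual input normalisation is omitted for brevity, and this is not the authors' exact method.

```python
# Assumed sketch: fine-tune a pretrained generator's parameters with a CLIP similarity loss.
# `generator` is a hypothetical StyleGAN2 model mapping a latent code to an RGB image in [-1, 1].
import torch
import torch.nn.functional as F
import clip   # OpenAI CLIP: pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model, _ = clip.load("ViT-B/32", device=device)
clip_model = clip_model.float()   # keep everything in fp32 for simplicity

with torch.no_grad():
    text_features = clip_model.encode_text(clip.tokenize(["a car with a long hood"]).to(device))
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)

optimizer = torch.optim.Adam(generator.parameters(), lr=1e-4)   # generator: assumed pretrained model
for step in range(200):
    z = torch.randn(4, 512, device=device)                      # latent batch
    images = generator(z)                                        # assumed output: (B, 3, H, W) in [-1, 1]
    images = F.interpolate((images + 1) / 2, size=224)           # resize to CLIP input resolution
    image_features = clip_model.encode_image(images)
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    loss = 1.0 - (image_features * text_features).sum(dim=-1).mean()   # cosine distance to the prompt
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```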