Proceedings of the 19th International Conference on Content-based Multimedia Indexing: Latest Publications

Hyperspectral Image Reconstruction of Heritage Artwork Using RGB Images and Deep Neural Networks
Ailin Chen, R. Jesus, M. Vilarigues
{"title":"Hyperspectral Image Reconstruction of Heritage Artwork Using RGB Images and Deep Neural Networks","authors":"Ailin Chen, R. Jesus, M. Vilarigues","doi":"10.1145/3549555.3549583","DOIUrl":"https://doi.org/10.1145/3549555.3549583","url":null,"abstract":"The application of our research is in the art world where the scarcity of available analytical data from a particular artist or physical access for its acquisition is restricted. This poses a fundamental problem for the purpose of conservation, restoration or authentication of historical artworks. We address part of this problem by providing a practical method to generate hyperspectral data from readily available RGB imagery of artwork by means of a two-step process using deep neural networks. The particularities of our approach include the generation of learnable colour mixtures and reflectances from a reduced collection of prior data for the mapping and reconstruction of hyperspectral features on new images. Further analysis and correction of the prediction are achieved by a second network that reduces the error by producing results akin to those obtained by a hyperspectral camera. Our method has been used to study a collection of paintings by Amadeo de Souza-Cardoso where successful results were obtained.","PeriodicalId":191591,"journal":{"name":"Proceedings of the 19th International Conference on Content-based Multimedia Indexing","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115522669","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
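A minimal sketch of the two-step idea the abstract describes: a first network maps RGB pixels to an initial spectrum, and a second network corrects the prediction. The 31-band output, layer sizes, and residual correction are illustrative assumptions, not the authors' architecture.

```python
# Hedged sketch of a two-step RGB-to-hyperspectral mapper; the 31-band output,
# layer sizes, and residual refinement are illustrative assumptions.
import torch
import torch.nn as nn

class RGB2Hyperspectral(nn.Module):
    def __init__(self, n_bands: int = 31):
        super().__init__()
        # Step 1: predict an initial spectrum for every pixel from RGB.
        self.step1 = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, n_bands, 1),
        )
        # Step 2: correct the prediction toward camera-like measurements.
        self.step2 = nn.Sequential(
            nn.Conv2d(n_bands, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, n_bands, 1),
        )

    def forward(self, rgb: torch.Tensor) -> torch.Tensor:
        coarse = self.step1(rgb)
        return coarse + self.step2(coarse)   # residual error correction

rgb_patch = torch.rand(1, 3, 128, 128)       # RGB values in [0, 1]
cube = RGB2Hyperspectral()(rgb_patch)        # (1, 31, 128, 128) spectral cube
```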
Skin Cancer Detection using Ensemble Learning and Grouping of Deep Models
Takfarines Guergueb, M. Akhloufi
{"title":"Skin Cancer Detection using Ensemble Learning and Grouping of Deep Models","authors":"Takfarines Guergueb, M. Akhloufi","doi":"10.1145/3549555.3549584","DOIUrl":"https://doi.org/10.1145/3549555.3549584","url":null,"abstract":"Melanoma remains the most dangerous form of skin cancer which has a high mortality rate. When detect early, melanoma can be easily cured and millions of lives might be saved. The use of automatic detection models in clinical decision support can increase the ability to address this issue and improve survival rates. In this work, we proposed an automated pipeline for melanoma detection, which combines the predictions of deep convolutional neural network models through ensemble learning techniques. Furthermore, our automated pipeline includes various strategies such as image augmentation, upsampling, image cropping, digital hair removal and class weighting. Our pipeline was trained and tested using the image data acquired from the Society for Imaging Informatics in Medicine and the International Skin Imaging Collaboration SIIM-ISIC 2020. Our proposed pipeline has demonstrated a high performance compared to the other state-of-the-art pipelines for melanoma disease prediction with an accuracy of 97.77% and an AUC of 98.47%.","PeriodicalId":191591,"journal":{"name":"Proceedings of the 19th International Conference on Content-based Multimedia Indexing","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116775724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 3
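One plausible reading of the ensemble step, sketched below as soft voting over several CNN classifiers. The torchvision backbones and plain probability averaging are assumptions; the paper's grouping of deep models may differ.

```python
# Hedged sketch: soft-voting ensemble of CNN classifiers; the backbones and
# plain probability averaging are illustrative, not the paper's exact grouping.
import torch
import torch.nn as nn
import torchvision.models as tv

def melanoma_classifier(backbone_fn):
    model = backbone_fn(weights=None)               # no pretrained weights here
    model.fc = nn.Linear(model.fc.in_features, 2)   # benign vs. melanoma
    return model.eval()

ensemble = [melanoma_classifier(f) for f in (tv.resnet18, tv.resnet34, tv.resnet50)]

@torch.no_grad()
def predict(x: torch.Tensor) -> torch.Tensor:
    # Average softmax probabilities across ensemble members (soft voting).
    probs = torch.stack([m(x).softmax(dim=1) for m in ensemble])
    return probs.mean(dim=0)

batch = torch.rand(4, 3, 224, 224)   # a batch of dermoscopic images
print(predict(batch).shape)          # (4, 2) ensembled class probabilities
```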
Wildfire Segmentation using Deep-RegSeg Semantic Segmentation Architecture
Rafik Ghali, M. Akhloufi, Wided Souidène Mseddi, Marwa Jmal
{"title":"Wildfire Segmentation using Deep-RegSeg Semantic Segmentation Architecture","authors":"Rafik Ghali, M. Akhloufi, Wided Souidène Mseddi, Marwa Jmal","doi":"10.1145/3549555.3549586","DOIUrl":"https://doi.org/10.1145/3549555.3549586","url":null,"abstract":"Wildfires are a worldwide natural risk, which causes harmful effects to human safety and leads to ecological and economical damage. Various fire detection systems have been proposed in order to detect fire and reduce its effects. However, they are still limited in detecting small fire areas and determining the precise fire’s shape. In order to overcome these limitations, we present, in this paper, a novel method based on deep learning, called ‘Deep-RegSeg’, to segment fire pixels and detect fire areas in complex non-structured environments. Deep-RegSeg is evaluated with varying backbone and loss function. The obtained results showed a high performance and outperformed some recent state-of-the-art techniques. The results also proved that Deep-RegSeg is efficient in segmenting wildfire pixels and detecting the precise fire’s shape, especially small fire areas under various conditions of weather, presence of smoke, and environment brightness.","PeriodicalId":191591,"journal":{"name":"Proceedings of the 19th International Conference on Content-based Multimedia Indexing","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114871263","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 3
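A hedged sketch of the evaluation setup the abstract mentions, varying backbone and loss: here a tiny encoder-decoder with a Dice loss stands in for Deep-RegSeg itself.

```python
# Hedged sketch of binary fire segmentation with a swappable loss; this tiny
# encoder-decoder and Dice loss stand in for Deep-RegSeg, whose design differs.
import torch
import torch.nn as nn

class TinySegNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 2, stride=2),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))   # per-pixel fire logits

def dice_loss(logits, target, eps=1e-6):
    # Dice handles the strong class imbalance of small fire regions.
    p = torch.sigmoid(logits)
    inter = (p * target).sum()
    return 1 - (2 * inter + eps) / (p.sum() + target.sum() + eps)

images = torch.rand(2, 3, 256, 256)
masks = (torch.rand(2, 1, 256, 256) > 0.5).float()   # dummy fire masks
print(dice_loss(TinySegNet()(images), masks))
```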
Analysis of the Complementarity of Latent and Concept Spaces for Cross-Modal Video Search
Varsha Devi, P. Mulhem, G. Quénot
{"title":"Analysis of the Complementarity of Latent and Concept Spaces for Cross-Modal Video Search","authors":"Varsha Devi, P. Mulhem, G. Quénot","doi":"10.1145/3549555.3549600","DOIUrl":"https://doi.org/10.1145/3549555.3549600","url":null,"abstract":"This paper focuses on studying the complementarity between the spaces from hybrid cross-modal state-of-the-art systems for video retrieval like [5]. We aim at investigating if these spaces really convey different features, or if they are representing the same things. We use PCA (Principal Component Analysis) to study the optimal dimensions, CCA (Canonical Correlation Analysis) to assess the similarity of the spaces, and check if such approach is in fact similar to ensemble learning. We achieve experiments on the MST-VTT corpus, and show that in fact these two spaces are indeed very similar, paving the way for new models that could enforce more dissimilar spaces.","PeriodicalId":191591,"journal":{"name":"Proceedings of the 19th International Conference on Content-based Multimedia Indexing","volume":"250 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114252392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
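The PCA/CCA analysis can be reproduced in miniature as below; the random embeddings are placeholders for real latent- and concept-space vectors, and scikit-learn's CCA is an assumed tool choice.

```python
# Hedged sketch of the PCA/CCA space comparison; random matrices stand in for
# real latent- and concept-space embeddings of the same videos.
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
latent = rng.normal(size=(1000, 512))             # latent-space embeddings
concept = latent @ rng.normal(size=(512, 300))    # a correlated concept space

# PCA: how many components explain 95% of the variance in the latent space?
ratios = PCA().fit(latent).explained_variance_ratio_
print(int(np.searchsorted(np.cumsum(ratios), 0.95)) + 1)

# CCA: correlation between the two spaces along their shared directions.
u, v = CCA(n_components=10).fit_transform(latent, concept)
print([round(float(np.corrcoef(u[:, i], v[:, i])[0, 1]), 2) for i in range(10)])
# Correlations near 1.0 suggest the spaces convey largely the same information.
```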
Learning to Detect Fallen People in Virtual Worlds
F. Carrara, Lorenzo Pasco, C. Gennaro, F. Falchi
{"title":"Learning to Detect Fallen People in Virtual Worlds","authors":"F. Carrara, Lorenzo Pasco, C. Gennaro, F. Falchi","doi":"10.1145/3549555.3549573","DOIUrl":"https://doi.org/10.1145/3549555.3549573","url":null,"abstract":"Falling is one of the most common causes of injury in all ages, especially in the elderly, where it is more frequent and severe. For this reason, a tool that can detect a fall in real time can be helpful in ensuring appropriate intervention and avoiding more serious damage. Some approaches available in the literature use sensors, wearable devices, or cameras with special features such as thermal or depth sensors. In this paper, we propose a Computer Vision deep-learning based approach for human fall detection based on largely available standard RGB cameras. A typical limitation of this kind of approaches is the lack of generalization to unseen environments. This is due to the error generated during human detection and, more generally, due to the unavailability of large-scale datasets that specialize in fall detection problems with different environments and fall types. In this work, we mitigate these limitations with a general-purpose object detector trained using a virtual world dataset in addition to real-world images. Through extensive experimental evaluation, we verified that by training our models on synthetic images as well, we were able to improve their ability to generalize. Code to reproduce results is available at https://github.com/lorepas/fallen-people-detection.","PeriodicalId":191591,"journal":{"name":"Proceedings of the 19th International Conference on Content-based Multimedia Indexing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131080096","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
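The core idea, mixing synthetic and real training images for a general-purpose detector, might look like the sketch below. The dummy datasets and the torchvision Faster R-CNN baseline are illustrative assumptions, not the paper's exact detector.

```python
# Hedged sketch: training a detector on mixed real + virtual-world data.
# The dummy datasets and torchvision's Faster R-CNN are illustrative stand-ins.
import torch
from torch.utils.data import ConcatDataset, DataLoader, Dataset
from torchvision.models.detection import fasterrcnn_resnet50_fpn

class DummyFallDataset(Dataset):
    """Stands in for a set of images annotated with fallen-person boxes."""
    def __init__(self, n):
        self.n = n
    def __len__(self):
        return self.n
    def __getitem__(self, i):
        image = torch.rand(3, 300, 300)
        target = {"boxes": torch.tensor([[30.0, 40.0, 120.0, 90.0]]),
                  "labels": torch.tensor([1])}        # class 1 = fallen person
        return image, target

mixed = ConcatDataset([DummyFallDataset(100),         # "real" images
                       DummyFallDataset(400)])        # "synthetic" images
loader = DataLoader(mixed, batch_size=2, shuffle=True,
                    collate_fn=lambda batch: tuple(zip(*batch)))

model = fasterrcnn_resnet50_fpn(weights=None, num_classes=2).train()
images, targets = next(iter(loader))
losses = model(list(images), list(targets))           # dict of detection losses
print(sum(losses.values()))
```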
Learning Co-occurrence Features Across Spatial and Temporal Domains for Hand Gesture Recognition
Mohammad Rehan, H. Wannous, Jafar Alkheir, Kinda Aboukassem
{"title":"Learning Co-occurrence Features Across Spatial and Temporal Domains for Hand Gesture Recognition","authors":"Mohammad Rehan, H. Wannous, Jafar Alkheir, Kinda Aboukassem","doi":"10.1145/3549555.3549591","DOIUrl":"https://doi.org/10.1145/3549555.3549591","url":null,"abstract":"Hand gesture is the most natural modality for human-machine interaction and its recognition can be considered one of the most complicated and interesting challenges for computer vision community. In recent years, there has been a noticeable advancement in the field of machine learning and computer vision. However, providing a hand gesture recognition system robust enough to work in real-time applications remains challenging. Dynamic hand gestures can be seen as variations in shape or movement during hand motion and often both together. To tackle these challenges, we propose a dynamic hand gesture recognition approach based on hand skeletal sequences. In particular, we introduce a simple but effective deep network architecture to deal with Spatio-temporal co-occurrence features computed on 3D coordinates of hand joints along the gesture sequence. Experimental results show that our approach outperforms state-of-the-art methods on two public datasets, First Person Hand Action and SHREC’2017, with an efficient time computational model compared to most existing approaches.","PeriodicalId":191591,"journal":{"name":"Proceedings of the 19th International Conference on Content-based Multimedia Indexing","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125718282","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
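A minimal sketch of co-occurrence feature learning on skeleton tensors: convolutions first aggregate across joints (spatial), then across frames (temporal). Treating a gesture as a (coords, frames, joints) tensor with 22 SHREC-style joints and 14 classes is an assumption; the paper's network is likely deeper.

```python
# Hedged sketch of spatio-temporal co-occurrence learning on hand skeletons;
# the (coords, frames, joints) layout, 22 joints, and 14 classes mirror
# SHREC'2017, but the network itself is an illustrative toy.
import torch
import torch.nn as nn

class CoOccurrenceNet(nn.Module):
    def __init__(self, n_classes: int = 14):
        super().__init__()
        self.features = nn.Sequential(
            # 1x3 kernels aggregate across joints: spatial co-occurrence.
            nn.Conv2d(3, 32, kernel_size=(1, 3), padding=(0, 1)), nn.ReLU(),
            # 3x1 kernels aggregate across frames: temporal co-occurrence.
            nn.Conv2d(32, 64, kernel_size=(3, 1), padding=(1, 0)), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 3 coords, frames, joints)
        return self.classifier(self.features(x).flatten(1))

gestures = torch.rand(8, 3, 32, 22)        # 8 sequences, 32 frames, 22 joints
print(CoOccurrenceNet()(gestures).shape)   # (8, 14) gesture logits
```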
A Toolchain for Extracting and Visualising Road Traffic Data
H. Neuschmied, Florian Krebs, Stefan Ladstätter, Elisabeth Eder, Mohamed Redouane Berrazouane, G. Thallinger
{"title":"A Toolchain for Extracting and Visualising Road Traffic Data","authors":"H. Neuschmied, Florian Krebs, Stefan Ladstätter, Elisabeth Eder, Mohamed Redouane Berrazouane, G. Thallinger","doi":"10.1145/3549555.3549580","DOIUrl":"https://doi.org/10.1145/3549555.3549580","url":null,"abstract":"We demonstrate a toolchain for visualising detailed road traffic data from multimodal sensors consisting of (i) the automatic, real-time extraction of movement paths of road users and noteworthy events from traffic monitoring cameras or LIDAR sensors, (ii) extraction of audio events from individual microphones and microphone arrays, (iii) a spatial data managment system storing the extracted information together with a geographic information system (GIS), and (iv) a web based viewer allowing to interactively visualise all these data in the context of a high-definition digital twin of the traffic environment. This system enables the collection of a considerable amount of objective data on road use and can be used for planning changes to traffic facilities as well as for assessing changes in the traffic environment.","PeriodicalId":191591,"journal":{"name":"Proceedings of the 19th International Conference on Content-based Multimedia Indexing","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130032597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
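One plausible glue format between the extraction components (i)-(ii) and the GIS store (iii) and viewer (iv) is a GeoJSON feature per trajectory; the field names and values below are hypothetical, not the toolchain's actual schema.

```python
# Hedged sketch: one extracted trajectory serialised as GeoJSON for a
# GIS-backed store and web viewer. All field names here are hypothetical.
import json

trajectory = {
    "type": "Feature",
    "geometry": {
        "type": "LineString",
        # (longitude, latitude) samples along the road user's path
        "coordinates": [[15.43, 47.07], [15.44, 47.075], [15.45, 47.08]],
    },
    "properties": {
        "object_class": "cyclist",
        "source": "camera_03",
        "t_start": "2022-09-14T08:00:00Z",
        "t_end": "2022-09-14T08:00:12Z",
    },
}
print(json.dumps(trajectory, indent=2))
```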
A Virtual Reality Talking Avatar for Investigative Interviews of Maltreated Children
Syed Zohaib Hassan, Pegah Salehi, M. Riegler, M. Johnson, G. Baugerud, Pål Halvorsen, S. Sabet
{"title":"A Virtual Reality Talking Avatar for Investigative Interviews of Maltreat Children","authors":"Syed Zohaib Hassan, Pegah Salehi, M. Riegler, M. Johnson, G. Baugerud, Pål Halvorsen, S. Sabet","doi":"10.1145/3549555.3549572","DOIUrl":"https://doi.org/10.1145/3549555.3549572","url":null,"abstract":"Interviews conducted with the maltreated children are often the primary source of evidence in prosecution. Many alleged incidents of abuse are not prosecuted because the children’s testimony is collected in an unreliable way. Research shows the consistent poor quality of these interviews and highlights the need for better training of Child Protection Services (CPS) and police personnel who interview abused child witnesses. The currently available systems for training of CPS and police personnel are developed in a rigid way that lag behind in generating dynamic responses. Moreover, these systems require human input such as employing an actor mimicking a child or an operator controlling prerecorded child responses during the interactions. This paper demonstrates the prototype of an interview training program with an artificial intelligent Child Avatar in Virtual Reality (VR), enabling CPS and police personnel to practice interviewing with abused children. The program is developed using Unity game engine and artificial intelligence-based technologies such as dialogue models, talking visual avatars, text-to-speech, and speech-to-text components.","PeriodicalId":191591,"journal":{"name":"Proceedings of the 19th International Conference on Content-based Multimedia Indexing","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115301188","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 3
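The interaction loop the abstract outlines, speech-to-text into a dialogue model into text-to-speech, is sketched below with local stubs; the actual program wires real ASR, dialogue, and TTS services inside Unity.

```python
# Hedged sketch of the avatar's turn loop (speech-to-text -> dialogue model ->
# text-to-speech). All three components are local stubs, not the real services.
from dataclasses import dataclass

@dataclass
class ChildAvatar:
    persona: str = "child witness persona"

    def speech_to_text(self, audio: bytes) -> str:
        return "Can you tell me what happened?"       # stub ASR transcript

    def dialogue_reply(self, text: str) -> str:
        # A real system conditions a dialogue model on persona and history.
        return "I was at home with my brother."        # stub model output

    def text_to_speech(self, text: str) -> bytes:
        return text.encode()                           # stub audio waveform

    def step(self, interviewer_audio: bytes) -> bytes:
        return self.text_to_speech(self.dialogue_reply(
            self.speech_to_text(interviewer_audio)))

print(ChildAvatar().step(b"").decode())
```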
Hybrid Transformer Network for Deepfake Detection
Sohail Ahmed Khan, Duc-Tien Dang-Nguyen
{"title":"Hybrid Transformer Network for Deepfake Detection","authors":"Sohail Ahmed Khan, Duc-Tien Dang-Nguyen","doi":"10.1145/3549555.3549588","DOIUrl":"https://doi.org/10.1145/3549555.3549588","url":null,"abstract":"Deepfake media is becoming widespread nowadays because of the easily available tools and mobile apps which can generate realistic looking deepfake videos/images without requiring any technical knowledge. With further advances in this field of technology in the near future, the quantity and quality of deepfake media is also expected to flourish, while making deepfake media a likely new practical tool to spread mis/disinformation. Because of these concerns, the deepfake media detection tools are becoming a necessity. In this study, we propose a novel hybrid transformer network utilizing early feature fusion strategy for deepfake video detection. Our model employs two different CNN networks, i.e., (1) XceptionNet and (2) EfficientNet-B4 as feature extractors. We train both feature extractors along with the transformer in an end-to-end manner on FaceForensics++, DFDC benchmarks. Our model, while having relatively straightforward architecture, achieves comparable results to other more advanced state-of-the-art approaches when evaluated on FaceForensics++ and DFDC benchmarks. Besides this, we also propose novel face cut-out augmentations, as well as random cut-out augmentations. We show that the proposed augmentations improve the detection performance of our model and reduce overfitting. In addition to that, we show that our model is capable of learning from considerably small amount of data.","PeriodicalId":191591,"journal":{"name":"Proceedings of the 19th International Conference on Content-based Multimedia Indexing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116174730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 9
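A hedged sketch of early feature fusion feeding a transformer: tokens from two CNN extractors are concatenated into one sequence before the encoder. ResNet-18 stands in for XceptionNet (which torchvision does not ship), and all dimensions are illustrative.

```python
# Hedged sketch of early feature fusion into a transformer. ResNet-18 stands in
# for XceptionNet (absent from torchvision); all dimensions are illustrative.
import torch
import torch.nn as nn
import torchvision.models as tv

class HybridDeepfakeDetector(nn.Module):
    def __init__(self, d_model: int = 256):
        super().__init__()
        cnn_a = tv.resnet18(weights=None)              # XceptionNet stand-in
        self.extract_a = nn.Sequential(*list(cnn_a.children())[:-2])  # (B,512,7,7)
        self.extract_b = tv.efficientnet_b4(weights=None).features    # (B,1792,7,7)
        self.proj_a = nn.Conv2d(512, d_model, 1)
        self.proj_b = nn.Conv2d(1792, d_model, 1)
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, 1)              # real-vs-fake logit

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a = self.proj_a(self.extract_a(x)).flatten(2).transpose(1, 2)
        b = self.proj_b(self.extract_b(x)).flatten(2).transpose(1, 2)
        tokens = torch.cat([a, b], dim=1)   # early fusion: one joint sequence
        return self.head(self.encoder(tokens).mean(dim=1))

print(HybridDeepfakeDetector()(torch.rand(2, 3, 224, 224)).shape)   # (2, 1)
```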
Analysing the Memorability of a Procedural Crime-Drama TV Series, CSI
Sean Cummins, Lorin Sweeney, A. Smeaton
{"title":"Analysing the Memorability of a Procedural Crime-Drama TV Series, CSI","authors":"Sean Cummins, Lorin Sweeney, A. Smeaton","doi":"10.1145/3549555.3549592","DOIUrl":"https://doi.org/10.1145/3549555.3549592","url":null,"abstract":"We investigate the memorability of a 5-season span of a popular crime-drama TV series, CSI, through the application of a vision transformer fine-tuned on the task of predicting video memorability. By investigating the popular genre of crime-drama TV through the use of a detailed annotated corpus combined with video memorability scores, we show how to extrapolate meaning from the memorability scores generated on video shots. We perform a quantitative analysis to relate video shot memorability to a variety of aspects of the show. The insights we present in this paper illustrate the importance of video memorability in applications which use multimedia in areas like education, marketing, indexing, as well as in the case here namely TV and film production.","PeriodicalId":191591,"journal":{"name":"Proceedings of the 19th International Conference on Content-based Multimedia Indexing","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122110815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
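A minimal sketch of the underlying setup: a vision transformer fine-tuned to regress a per-shot memorability score. Using torchvision's ViT-B/16 on single keyframes with a sigmoid head is an assumption about the details, not the authors' exact model.

```python
# Hedged sketch: a ViT fine-tuned to regress per-shot memorability. Scoring
# single keyframes with torchvision's ViT-B/16 is an assumed simplification.
import torch
import torch.nn as nn
import torchvision.models as tv

vit = tv.vit_b_16(weights=None)
vit.heads = nn.Sequential(nn.Linear(768, 1), nn.Sigmoid())  # score in [0, 1]

keyframes = torch.rand(4, 3, 224, 224)   # one keyframe per video shot
scores = vit(keyframes).squeeze(1)       # predicted memorability per shot
print(scores)
```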