Proceedings of the ... International Conference on Image Analysis and Processing: Latest Publications

Fuzzy Logic Visual Network (FLVN): A neuro-symbolic approach for visual features matching
Francesco Manigrasso, L. Morra, F. Lamberti
DOI: 10.48550/arXiv.2307.16019 · Published: 2023-07-29 · Pages: 456-467
Abstract: Neuro-symbolic integration aims to harness the power of symbolic knowledge representation combined with the learning capabilities of deep neural networks. In particular, Logic Tensor Networks (LTNs) incorporate background knowledge in the form of logical axioms by grounding a first-order logic language as differentiable operations between real tensors. Yet few studies have investigated the potential benefits of this approach for zero-shot learning (ZSL) classification. In this study, we present the Fuzzy Logic Visual Network (FLVN), which formulates the task of learning a visual-semantic embedding space within a neuro-symbolic LTN framework. FLVN incorporates prior knowledge in the form of class hierarchies (classes and macro-classes) along with robust high-level inductive biases. The latter allow, for instance, handling exceptions in class-level attributes and enforcing similarity between images of the same class, preventing premature overfitting to seen classes and improving overall performance. FLVN reaches state-of-the-art performance on the Generalized ZSL (GZSL) benchmarks AWA2 and CUB, improving by 1.3% and 3%, respectively. Overall, it achieves performance competitive with recent ZSL methods at lower computational overhead. FLVN is available at https://gitlab.com/grains2/flvn.
Citations: 0
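The LTN grounding described in the abstract can be illustrated with a minimal sketch: fuzzy connectives and a smooth universal quantifier implemented as differentiable operations on truth values in [0, 1]. The operator choices below (product t-norm, Reichenbach implication, pMeanError aggregation) are common LTN defaults used here purely for illustration, not necessarily FLVN's actual configuration.

```python
import numpy as np

def fuzzy_and(a, b):
    # Product t-norm: AND(a, b) = a * b
    return a * b

def fuzzy_or(a, b):
    # Probabilistic sum (product t-co-norm): OR(a, b) = a + b - a*b
    return a + b - a * b

def fuzzy_implies(a, b):
    # Reichenbach implication: a -> b = 1 - a + a*b
    return 1.0 - a + a * b

def forall(truths, p=2):
    # Smooth universal quantifier: generalized mean of the complemented
    # truth values (the "pMeanError" aggregator used in LTN literature).
    truths = np.asarray(truths, dtype=float)
    return 1.0 - (np.mean((1.0 - truths) ** p)) ** (1.0 / p)

# An axiom such as "every image of a class satisfies its class attribute"
# aggregates per-sample truth degrees into one satisfaction level, which
# can then be maximized by gradient descent.
sample_truths = [0.9, 0.8, 0.95]
axiom_satisfaction = forall(sample_truths)
```

Because every operator is smooth, axiom satisfaction can be back-propagated through, which is what lets logical background knowledge act as a training signal.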
Sparse Double Descent in Vision Transformers: real or phantom threat?
Victor Quétu, Marta Milovanović, Enzo Tartaglione
DOI: 10.48550/arXiv.2307.14253 · Published: 2023-07-26 · Pages: 490-502
Abstract: Vision transformers (ViTs) have drawn broad interest in recent theoretical and empirical work. They are state-of-the-art thanks to their attention-based approach, which boosts the identification of key features and patterns within images by avoiding inductive bias, resulting in highly accurate image analysis. Meanwhile, recent studies have reported a "sparse double descent" phenomenon that can occur in modern deep-learning models, where extremely over-parametrized models can generalize well. This raises practical questions about the optimal size of the model and launches the quest for the best trade-off between sparsity and performance: are vision transformers also prone to sparse double descent? Can we find a way to avoid such a phenomenon? Our work tackles the occurrence of sparse double descent in ViTs. Although prior work has shown that traditional architectures, such as ResNet, are condemned to the sparse double descent phenomenon, for ViTs we observe that an optimally tuned $\ell_2$ regularization relieves it. However, everything comes at a cost: the optimal lambda sacrifices the potential compression of the ViT.
Citations: 1
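The experimental protocol behind sparse double descent studies can be sketched as iterative magnitude pruning: at each sparsity level, the smallest-magnitude weights are zeroed and performance is re-measured; the double-descent "bump" appears in the test-error curve across sparsity levels. The toy matrix and schedule below are illustrative assumptions, not the paper's setup, and the interaction with $\ell_2$ weight decay is only noted in a comment.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with smallest |w|."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))  # stand-in for one trained layer

# Sweep sparsity as in a pruning schedule. In the paper's setting, the
# model would be retrained (with tuned l2 weight decay) after each
# pruning step and test error recorded at every sparsity level.
for sparsity in (0.5, 0.9, 0.99):
    pruned = magnitude_prune(W, sparsity)
    kept_mass = np.abs(pruned).sum() / np.abs(W).sum()
```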
Not with my name! Inferring artists' names of input strings employed by Diffusion Models
R. Leotta, O. Giudice, Luca Guarnera, S. Battiato
DOI: 10.48550/arXiv.2307.13527 · Published: 2023-07-25 · Pages: 364-375
Abstract: Diffusion Models (DMs) are highly effective at generating realistic, high-quality images. However, these models lack creativity and merely compose outputs based on their training data, guided by a textual input provided at creation time. Is it acceptable to generate images reminiscent of an artist by employing their name as input? This implies that if the DM can replicate an artist's work, then it was trained on some or all of that artist's artworks, thus violating copyright. In this paper, a preliminary study on inferring the probability that an artist's name was used in the input string of a generated image is presented. To this end, we focused only on images generated by the well-known DALL-E 2 and collected images (both original and generated) of five renowned artists. Finally, a dedicated Siamese neural network was employed to obtain a first estimate of this probability. Experimental results demonstrate that our approach is a promising starting point and can be employed as a prior for predicting the complete input string of an investigated image. Dataset and code are available at: https://github.com/ictlab-unict/not-with-my-name
Citations: 0
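The Siamese comparison the abstract relies on can be sketched as follows: a shared embedding function maps the query (generated) image and a reference artwork to vectors, and their distance is mapped to a similarity score usable as a prior. The random-projection "embedding" below is a placeholder for the trained network, and all names and weights are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)
PROJECTION = rng.normal(size=(512, 64))  # placeholder "trained weights"

def embed(features):
    # Shared branch: both inputs go through the same mapping.
    z = features @ PROJECTION
    return z / np.linalg.norm(z)

def similarity_score(query, reference):
    """Map the Euclidean distance between embeddings to a (0, 1] score."""
    d = np.linalg.norm(embed(query) - embed(reference))
    return float(np.exp(-d))

query = rng.normal(size=512)
ref_same = query + 0.05 * rng.normal(size=512)  # near-duplicate style
ref_other = rng.normal(size=512)                # unrelated reference

# A stylistically close reference should score higher than an unrelated one.
s_same = similarity_score(query, ref_same)
s_other = similarity_score(query, ref_other)
```

In a trained Siamese network, the two branches share weights exactly as `embed` is shared here; only the distance-to-score mapping and the embedding itself would be learned.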
CarPatch: A Synthetic Benchmark for Radiance Field Evaluation on Vehicle Components
Davide Di Nucci, A. Simoni, Matteo Tomei, L. Ciuffreda, R. Vezzani, R. Cucchiara
DOI: 10.48550/arXiv.2307.12718 · Published: 2023-07-24 · Pages: 99-110
Abstract: Neural Radiance Fields (NeRFs) have gained widespread recognition as a highly effective technique for representing 3D reconstructions of objects and scenes derived from sets of images. Despite their efficiency, NeRF models can pose challenges in certain scenarios, such as vehicle inspection, where the lack of sufficient data or the presence of challenging elements (e.g. reflections) strongly impacts the accuracy of the reconstruction. To this end, we introduce CarPatch, a novel synthetic benchmark of vehicles. In addition to a set of images annotated with their intrinsic and extrinsic camera parameters, the corresponding depth maps and semantic segmentation masks have been generated for each view. Global and part-based metrics have been defined and used to evaluate, compare, and better characterize some state-of-the-art techniques. The dataset is publicly released at https://aimagelab.ing.unimore.it/go/carpatch and can be used as an evaluation guide and as a baseline for future work on this challenging topic.
Citations: 0
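The global vs. part-based evaluation idea can be sketched with a standard image metric: compute PSNR over a whole rendered view, then restrict it to a single vehicle component using its semantic-segmentation mask. The images and mask below are synthetic placeholders, and the masked-PSNR formulation is an assumption about the benchmark's general approach, not its exact metric definitions.

```python
import numpy as np

def psnr(reference, rendered, mask=None, peak=1.0):
    """PSNR in dB; if `mask` is given, evaluate only the masked pixels."""
    ref = np.asarray(reference, dtype=float)
    ren = np.asarray(rendered, dtype=float)
    if mask is not None:
        ref, ren = ref[mask], ren[mask]
    mse = np.mean((ref - ren) ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(7)
gt = rng.random((32, 32))                        # ground-truth view
render = gt + 0.01 * rng.normal(size=gt.shape)   # NeRF-style reconstruction

wheel_mask = np.zeros_like(gt, dtype=bool)
wheel_mask[8:16, 8:16] = True                    # one "component" region

global_psnr = psnr(gt, render)                   # whole-image metric
wheel_psnr = psnr(gt, render, mask=wheel_mask)   # part-based metric
```

Part-based scores like `wheel_psnr` expose failures (e.g. on reflective components) that a single global number averages away.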
Unsupervised Video Anomaly Detection with Diffusion Models Conditioned on Compact Motion Representations
Anil Osman Tur, Nicola Dall'Asen, C. Beyan, E. Ricci
DOI: 10.48550/arXiv.2307.01533 · Published: 2023-07-04 · Pages: 49-62
Abstract: This paper addresses the unsupervised video anomaly detection (VAD) problem, which involves classifying each frame in a video as normal or abnormal without any access to labels. To accomplish this, the proposed method employs conditional diffusion models, where the input data are the spatiotemporal features extracted from a pre-trained network, and the condition is the features extracted from compact motion representations that summarize a given video segment in terms of its motion and appearance. Our method utilizes a data-driven threshold and considers a high reconstruction error as an indicator of anomalous events. This study is the first to utilize compact motion representations for VAD, and the experiments conducted on two large-scale VAD benchmarks demonstrate that they supply relevant information to the diffusion model and consequently improve VAD performance with respect to the prior art. Importantly, our method exhibits better generalization performance across different datasets, notably outperforming both the state-of-the-art and baseline methods. The code of our method is available at https://github.com/AnilOsmanTur/conditioned_video_anomaly_diffusion
Citations: 1
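The data-driven thresholding step described above can be sketched simply: frames whose reconstruction error (here a stand-in for the diffusion model's per-frame error) exceeds a statistic computed from the observed errors are flagged as anomalous. The percentile rule and the synthetic error values are illustrative assumptions, not the paper's exact threshold.

```python
import numpy as np

def anomaly_flags(errors, pct=95.0):
    """Flag errors above a data-driven (percentile) threshold."""
    errors = np.asarray(errors, dtype=float)
    threshold = np.percentile(errors, pct)
    return errors > threshold, threshold

rng = np.random.default_rng(3)
# Reconstruction errors: mostly well-reconstructed "normal" frames,
# plus a few poorly reconstructed "anomalous" frames.
normal_errors = rng.normal(loc=0.1, scale=0.02, size=200).clip(min=0)
abnormal_errors = rng.normal(loc=0.5, scale=0.05, size=10)
errors = np.concatenate([normal_errors, abnormal_errors])

flags, thr = anomaly_flags(errors)
```

Because the threshold is derived from the error distribution itself, no labeled anomalies are needed, which is what makes the pipeline unsupervised.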
Eye Diseases Classification Using Deep Learning
Patrycja Haraburda, Lukasz Dabala
DOI: 10.1007/978-3-031-06427-2_14 · Published: 2023-03-20 · Pages: 160-172
Citations: 1
Forecasting Future Instance Segmentation with Learned Optical Flow and Warping
Andrea Ciamarra, Federico Becattini, Lorenzo Seidenari, A. Bimbo
DOI: 10.1007/978-3-031-06433-3_30 · Published: 2022-11-15 · Pages: 349-361
Citations: 2
Deep Autoencoders for Anomaly Detection in Textured Images Using CW-SSIM
Andrea Bionda, Luca Frittoli, G. Boracchi
DOI: 10.1007/978-3-031-06430-2_56 · Published: 2022-08-30 · Pages: 669-680
Citations: 1
LDD: A Grape Diseases Dataset Detection and Instance Segmentation
L. Rossi, M. Valenti, S. Legler, A. Prati
DOI: 10.1007/978-3-031-06430-2_32 · Published: 2022-06-21 · Pages: 383-393
Citations: 2
Prediction of fish location by combining fisheries data and sea bottom temperature forecasting
Matthieu Ospici, Klaas Sys, Sophie Guegan-Marat
DOI: 10.48550/arXiv.2205.02107 · Published: 2022-05-04 · Pages: 437-448
Abstract: This paper combines fisheries-dependent data and environmental data in a machine learning pipeline to predict the spatio-temporal abundance of two species (plaice and sole) commonly caught by the Belgian fishery in the North Sea. By combining fisheries-related features with environmental data (sea bottom temperature derived from remote sensing), higher accuracy can be achieved. In a forecast setting, predictive accuracy is further improved by predicting the sea bottom temperature up to four days in advance with a recurrent deep neural network, instead of relying on the last available temperature measurement.
Citations: 1
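The multi-step forecasting idea in this abstract can be sketched as an autoregressive rollout: rather than reusing the last measured sea bottom temperature, a recurrent model is stepped forward up to four days, feeding each forecast back in as input. The single-parameter recurrence below is a toy placeholder for the paper's recurrent deep network; the weights and data are illustrative assumptions.

```python
import numpy as np

def rollout(history, steps=4, alpha=0.8):
    """Autoregressive rollout: each forecast feeds the next step."""
    state = history[-1]
    mean = np.mean(history)
    forecasts = []
    for _ in range(steps):
        # Toy recurrence: relax the last state toward the historical mean,
        # standing in for a trained recurrent cell's state update.
        state = alpha * state + (1 - alpha) * mean
        forecasts.append(state)
    return np.array(forecasts)

temps = np.array([9.8, 10.1, 10.4, 10.2, 10.6])  # last 5 days, deg C
forecast_4day = rollout(temps, steps=4)           # days t+1 .. t+4
```

The key design point survives the simplification: downstream features use `forecast_4day[k]` for a k-day-ahead prediction instead of the stale `temps[-1]`.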