Proceedings of the ... International Conference on Image Analysis and Processing: Latest Publications

Fuzzy Logic Visual Network (FLVN): A neuro-symbolic approach for visual features matching
Francesco Manigrasso, L. Morra, F. Lamberti
DOI: 10.48550/arXiv.2307.16019 · Published: 2023-07-29 · Pages: 456-467
Abstract: Neuro-symbolic integration aims to harness the power of symbolic knowledge representation combined with the learning capabilities of deep neural networks. In particular, Logic Tensor Networks (LTNs) incorporate background knowledge in the form of logical axioms by grounding a first-order logic language as differentiable operations between real tensors. Yet few studies have investigated the potential benefits of this approach for zero-shot learning (ZSL) classification. In this study, we present the Fuzzy Logic Visual Network (FLVN), which formulates the task of learning a visual-semantic embedding space within a neuro-symbolic LTN framework. FLVN incorporates prior knowledge in the form of class hierarchies (classes and macro-classes) along with robust high-level inductive biases. The latter allow, for instance, handling exceptions in class-level attributes and enforcing similarity between images of the same class, preventing premature overfitting to seen classes and improving overall performance. FLVN reaches state-of-the-art performance on the Generalized ZSL (GZSL) benchmarks AWA2 and CUB, improving by 1.3% and 3%, respectively. Overall, it achieves performance competitive with recent ZSL methods at lower computational overhead. FLVN is available at https://gitlab.com/grains2/flvn.
Citations: 0
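The LTN grounding described in the abstract can be illustrated with a minimal sketch: fuzzy connectives and a smooth universal quantifier implemented as differentiable operations on truth values in [0, 1]. The operator choices below (product t-norm, Reichenbach implication, pMeanError aggregation) are common LTN defaults used here purely for illustration, not necessarily FLVN's actual configuration.

```python
import numpy as np

def fuzzy_and(a, b):
    # Product t-norm: AND(a, b) = a * b
    return a * b

def fuzzy_or(a, b):
    # Probabilistic sum (product t-co-norm): OR(a, b) = a + b - a*b
    return a + b - a * b

def fuzzy_implies(a, b):
    # Reichenbach implication: a -> b = 1 - a + a*b
    return 1.0 - a + a * b

def forall(truths, p=2):
    # Smooth universal quantifier: generalized mean of the complemented
    # truth values (the "pMeanError" aggregator used in LTN literature).
    truths = np.asarray(truths, dtype=float)
    return 1.0 - (np.mean((1.0 - truths) ** p)) ** (1.0 / p)

# An axiom such as "every image of a class satisfies its class attribute"
# aggregates per-sample truth degrees into one satisfaction level, which
# can then be maximized by gradient descent.
sample_truths = [0.9, 0.8, 0.95]
axiom_satisfaction = forall(sample_truths)
```

Because every operator is smooth, axiom satisfaction can be back-propagated through, which is what lets logical background knowledge act as a training signal.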
Sparse Double Descent in Vision Transformers: real or phantom threat?
Victor Quétu, Marta Milovanović, Enzo Tartaglione
DOI: 10.48550/arXiv.2307.14253 · Published: 2023-07-26 · Pages: 490-502
Abstract: Vision transformers (ViTs) have drawn broad interest in recent theoretical and empirical work. They are state-of-the-art thanks to their attention-based approach, which boosts the identification of key features and patterns within images by avoiding inductive bias, resulting in highly accurate image analysis. Meanwhile, recent studies have reported a "sparse double descent" phenomenon that can occur in modern deep-learning models, where extremely over-parametrized models can generalize well. This raises practical questions about the optimal size of the model and launches the quest for the best trade-off between sparsity and performance: are vision transformers also prone to sparse double descent? Can we find a way to avoid such a phenomenon? Our work tackles the occurrence of sparse double descent in ViTs. Although prior work has shown that traditional architectures, such as ResNet, are condemned to the sparse double descent phenomenon, for ViTs we observe that an optimally tuned $\ell_2$ regularization relieves it. However, everything comes at a cost: the optimal lambda sacrifices the potential compression of the ViT.
Citations: 1
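The experimental protocol behind sparse double descent studies can be sketched as iterative magnitude pruning: at each sparsity level, the smallest-magnitude weights are zeroed and performance is re-measured; the double-descent "bump" appears in the test-error curve across sparsity levels. The toy matrix and schedule below are illustrative assumptions, not the paper's setup, and the interaction with $\ell_2$ weight decay is only noted in a comment.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with smallest |w|."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))  # stand-in for one trained layer

# Sweep sparsity as in a pruning schedule. In the paper's setting, the
# model would be retrained (with tuned l2 weight decay) after each
# pruning step and test error recorded at every sparsity level.
for sparsity in (0.5, 0.9, 0.99):
    pruned = magnitude_prune(W, sparsity)
    kept_mass = np.abs(pruned).sum() / np.abs(W).sum()
```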
Not with my name! Inferring artists' names of input strings employed by Diffusion Models
R. Leotta, O. Giudice, Luca Guarnera, S. Battiato
DOI: 10.48550/arXiv.2307.13527 · Published: 2023-07-25 · Pages: 364-375
Abstract: Diffusion Models (DMs) are highly effective at generating realistic, high-quality images. However, these models lack creativity and merely compose outputs based on their training data, guided by a textual input provided at creation time. Is it acceptable to generate images reminiscent of an artist by employing their name as input? This implies that if the DM can replicate an artist's work, then it was trained on some or all of that artist's artworks, thus violating copyright. In this paper, a preliminary study on inferring the probability that an artist's name was used in the input string of a generated image is presented. To this end, we focused only on images generated by the well-known DALL-E 2 and collected images (both original and generated) of five renowned artists. Finally, a dedicated Siamese neural network was employed to obtain a first estimate of this probability. Experimental results demonstrate that our approach is a promising starting point and can be employed as a prior for predicting the complete input string of an investigated image. Dataset and code are available at: https://github.com/ictlab-unict/not-with-my-name
Citations: 0
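The Siamese comparison the abstract relies on can be sketched as follows: a shared embedding function maps the query (generated) image and a reference artwork to vectors, and their distance is mapped to a similarity score usable as a prior. The random-projection "embedding" below is a placeholder for the trained network, and all names and weights are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)
PROJECTION = rng.normal(size=(512, 64))  # placeholder "trained weights"

def embed(features):
    # Shared branch: both inputs go through the same mapping.
    z = features @ PROJECTION
    return z / np.linalg.norm(z)

def similarity_score(query, reference):
    """Map the Euclidean distance between embeddings to a (0, 1] score."""
    d = np.linalg.norm(embed(query) - embed(reference))
    return float(np.exp(-d))

query = rng.normal(size=512)
ref_same = query + 0.05 * rng.normal(size=512)  # near-duplicate style
ref_other = rng.normal(size=512)                # unrelated reference

# A stylistically close reference should score higher than an unrelated one.
s_same = similarity_score(query, ref_same)
s_other = similarity_score(query, ref_other)
```

In a trained Siamese network, the two branches share weights exactly as `embed` is shared here; only the distance-to-score mapping and the embedding itself would be learned.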
CarPatch: A Synthetic Benchmark for Radiance Field Evaluation on Vehicle Components
Davide Di Nucci, A. Simoni, Matteo Tomei, L. Ciuffreda, R. Vezzani, R. Cucchiara
DOI: 10.48550/arXiv.2307.12718 · Published: 2023-07-24 · Pages: 99-110
Abstract: Neural Radiance Fields (NeRFs) have gained widespread recognition as a highly effective technique for representing 3D reconstructions of objects and scenes derived from sets of images. Despite their efficiency, NeRF models can pose challenges in certain scenarios, such as vehicle inspection, where the lack of sufficient data or the presence of challenging elements (e.g. reflections) strongly impacts the accuracy of the reconstruction. To this end, we introduce CarPatch, a novel synthetic benchmark of vehicles. In addition to a set of images annotated with their intrinsic and extrinsic camera parameters, the corresponding depth maps and semantic segmentation masks have been generated for each view. Global and part-based metrics have been defined and used to evaluate, compare, and better characterize some state-of-the-art techniques. The dataset is publicly released at https://aimagelab.ing.unimore.it/go/carpatch and can be used as an evaluation guide and as a baseline for future work on this challenging topic.
Citations: 0
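The global vs. part-based evaluation idea can be sketched with a standard image metric: compute PSNR over a whole rendered view, then restrict it to a single vehicle component using its semantic-segmentation mask. The images and mask below are synthetic placeholders, and the masked-PSNR formulation is an assumption about the benchmark's general approach, not its exact metric definitions.

```python
import numpy as np

def psnr(reference, rendered, mask=None, peak=1.0):
    """PSNR in dB; if `mask` is given, evaluate only the masked pixels."""
    ref = np.asarray(reference, dtype=float)
    ren = np.asarray(rendered, dtype=float)
    if mask is not None:
        ref, ren = ref[mask], ren[mask]
    mse = np.mean((ref - ren) ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(7)
gt = rng.random((32, 32))                        # ground-truth view
render = gt + 0.01 * rng.normal(size=gt.shape)   # NeRF-style reconstruction

wheel_mask = np.zeros_like(gt, dtype=bool)
wheel_mask[8:16, 8:16] = True                    # one "component" region

global_psnr = psnr(gt, render)                   # whole-image metric
wheel_psnr = psnr(gt, render, mask=wheel_mask)   # part-based metric
```

Part-based scores like `wheel_psnr` expose failures (e.g. on reflective components) that a single global number averages away.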
Unsupervised Video Anomaly Detection with Diffusion Models Conditioned on Compact Motion Representations
Anil Osman Tur, Nicola Dall'Asen, C. Beyan, E. Ricci
DOI: 10.48550/arXiv.2307.01533 · Published: 2023-07-04 · Pages: 49-62
Abstract: This paper addresses the unsupervised video anomaly detection (VAD) problem, which involves classifying each frame in a video as normal or abnormal without any access to labels. To accomplish this, the proposed method employs conditional diffusion models, where the input data are the spatiotemporal features extracted from a pre-trained network, and the condition is the features extracted from compact motion representations that summarize a given video segment in terms of its motion and appearance. Our method utilizes a data-driven threshold and considers a high reconstruction error as an indicator of anomalous events. This study is the first to utilize compact motion representations for VAD, and the experiments conducted on two large-scale VAD benchmarks demonstrate that they supply relevant information to the diffusion model and consequently improve VAD performance with respect to the prior art. Importantly, our method exhibits better generalization performance across different datasets, notably outperforming both the state-of-the-art and baseline methods. The code of our method is available at https://github.com/AnilOsmanTur/conditioned_video_anomaly_diffusion
Citations: 1
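The data-driven thresholding step described above can be sketched simply: frames whose reconstruction error (here a stand-in for the diffusion model's per-frame error) exceeds a statistic computed from the observed errors are flagged as anomalous. The percentile rule and the synthetic error values are illustrative assumptions, not the paper's exact threshold.

```python
import numpy as np

def anomaly_flags(errors, pct=95.0):
    """Flag errors above a data-driven (percentile) threshold."""
    errors = np.asarray(errors, dtype=float)
    threshold = np.percentile(errors, pct)
    return errors > threshold, threshold

rng = np.random.default_rng(3)
# Reconstruction errors: mostly well-reconstructed "normal" frames,
# plus a few poorly reconstructed "anomalous" frames.
normal_errors = rng.normal(loc=0.1, scale=0.02, size=200).clip(min=0)
abnormal_errors = rng.normal(loc=0.5, scale=0.05, size=10)
errors = np.concatenate([normal_errors, abnormal_errors])

flags, thr = anomaly_flags(errors)
```

Because the threshold is derived from the error distribution itself, no labeled anomalies are needed, which is what makes the pipeline unsupervised.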
Eye Diseases Classification Using Deep Learning
Patrycja Haraburda, Lukasz Dabala
DOI: 10.1007/978-3-031-06427-2_14 · Published: 2023-03-20 · Pages: 160-172
Citations: 1
Forecasting Future Instance Segmentation with Learned Optical Flow and Warping
Andrea Ciamarra, Federico Becattini, Lorenzo Seidenari, A. Bimbo
DOI: 10.1007/978-3-031-06433-3_30 · Published: 2022-11-15 · Pages: 349-361
Citations: 2
Deep Autoencoders for Anomaly Detection in Textured Images Using CW-SSIM
Andrea Bionda, Luca Frittoli, G. Boracchi
DOI: 10.1007/978-3-031-06430-2_56 · Published: 2022-08-30 · Pages: 669-680
Citations: 1
LDD: A Grape Diseases Dataset Detection and Instance Segmentation
L. Rossi, M. Valenti, S. Legler, A. Prati
DOI: 10.1007/978-3-031-06430-2_32 · Published: 2022-06-21 · Pages: 383-393
Citations: 2
Prediction of fish location by combining fisheries data and sea bottom temperature forecasting
Matthieu Ospici, Klaas Sys, Sophie Guegan-Marat
DOI: 10.48550/arXiv.2205.02107 · Published: 2022-05-04 · Pages: 437-448
Abstract: This paper combines fisheries-dependent data and environmental data in a machine learning pipeline to predict the spatio-temporal abundance of two species (plaice and sole) commonly caught by the Belgian fishery in the North Sea. By combining fisheries-related features with environmental data (sea bottom temperature derived from remote sensing), higher accuracy can be achieved. In a forecast setting, predictive accuracy is further improved by predicting the sea bottom temperature up to four days in advance with a recurrent deep neural network, instead of relying on the last available temperature measurement.
Citations: 1
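The multi-step forecasting idea in this abstract can be sketched as an autoregressive rollout: rather than reusing the last measured sea bottom temperature, a recurrent model is stepped forward up to four days, feeding each forecast back in as input. The single-parameter recurrence below is a toy placeholder for the paper's recurrent deep network; the weights and data are illustrative assumptions.

```python
import numpy as np

def rollout(history, steps=4, alpha=0.8):
    """Autoregressive rollout: each forecast feeds the next step."""
    state = history[-1]
    mean = np.mean(history)
    forecasts = []
    for _ in range(steps):
        # Toy recurrence: relax the last state toward the historical mean,
        # standing in for a trained recurrent cell's state update.
        state = alpha * state + (1 - alpha) * mean
        forecasts.append(state)
    return np.array(forecasts)

temps = np.array([9.8, 10.1, 10.4, 10.2, 10.6])  # last 5 days, deg C
forecast_4day = rollout(temps, steps=4)           # days t+1 .. t+4
```

The key design point survives the simplification: downstream features use `forecast_4day[k]` for a k-day-ahead prediction instead of the stale `temps[-1]`.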