Title: Reliability Scoring for the Recognition of Degraded License Plates
Authors: Anatol Maier, Denise Moussa, A. Spruck, Jürgen Seiler, C. Riess
Published in: 2022 18th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), doi: 10.1109/AVSS56176.2022.9959390

Abstract: Criminal investigations often require identifying the license plate of an escape vehicle. Such vehicles may be recorded in the wild by low-quality cameras, leaving their license plates unreadable for police officers. Recent efforts aim to use machine learning to forensically decipher license plates from such low-quality images. These methods operate near the information-theoretic limit of recognition and hence exhibit quite high error rates. Unfortunately, it is unclear when such prediction errors occur, which makes it difficult to use these methods in practice. In this work, we propose a Bayesian Neural Network that inherently incorporates a reliability measure into the classifier. We additionally propose to integrate multiple estimations with an entropy weight to further improve reliability. Our experiments show that this uncertainty metric dramatically reduces the number of false predictions while preserving most of the true predictions.
Title: Impact of Video Compression on the Performance of Object Detection Systems for Surveillance Applications
Authors: Michael O'Byrne, M. Sugrue (Kinesense Ltd), Vibhoothi, A. Kokaram
Published in: 2022 18th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), doi: 10.1109/AVSS56176.2022.9959476

Abstract: This study examines the relationship between H.264 video compression and the performance of an object detection network (YOLOv5). We curated a set of 50 surveillance videos and annotated targets of interest (people, bikes, and vehicles). The videos were encoded at five quality levels using Constant Rate Factor (CRF) values in the set {22, 32, 37, 42, 47}. YOLOv5 was applied to the compressed videos and detection performance was analyzed at each CRF level. The results indicate that detection performance is generally robust to moderate levels of compression: using a CRF value of 37 instead of 22 significantly reduces bitrates and file sizes without adversely affecting detection performance. However, detection performance degrades appreciably at higher compression levels, especially in complex scenes with poor lighting and fast-moving targets. Finally, retraining YOLOv5 on compressed imagery gives up to a 1% improvement in F1 score when applied to highly compressed footage.
Title: Spatio-temporal predictive tasks for abnormal event detection in videos
Authors: Yassine Naji, Aleksandr Setkov, Angélique Loesch, M. Gouiffès, Romaric Audigier
Published in: 2022 18th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), doi: 10.1109/AVSS56176.2022.9959669

Abstract: Abnormal event detection in videos is a challenging problem, partly due to the multiplicity of abnormal patterns and the lack of corresponding annotations. In this paper, we propose new constrained pretext tasks to learn object-level normality patterns. Our approach consists of learning a mapping between down-scaled visual queries and their corresponding normal appearance and motion characteristics at the original resolution. The proposed tasks are more challenging than the reconstruction and future-frame prediction tasks widely used in the literature, since our model learns to jointly predict spatial and temporal features rather than reconstruct them. We believe that more constrained pretext tasks induce better learning of normality patterns. Experiments on several benchmark datasets demonstrate the effectiveness of our approach for localizing and tracking anomalies, as it outperforms or matches the current state of the art on spatio-temporal evaluation metrics.
{"title":"Refining Action Boundaries for One-stage Detection","authors":"Hanyuan Wang, M. Mirmehdi, D. Damen, Toby Perrett","doi":"10.1109/AVSS56176.2022.9959554","DOIUrl":"https://doi.org/10.1109/AVSS56176.2022.9959554","url":null,"abstract":"Current one-stage action detection methods, which simultaneously predict action boundaries and the corresponding class, do not estimate or use a measure of confidence in their boundary predictions, which can lead to inaccurate boundaries. We incorporate the estimation of boundary confidence into one-stage anchor-free detection, through an additional prediction head that predicts the refined boundaries with higher confidence. We obtain state-of-the-art performance on the challenging EPICKITCHENS-100 action detection as well as the standard THUMOS14 action detection benchmarks, and achieve improvement on the ActivityNet-1.3 benchmark.","PeriodicalId":408581,"journal":{"name":"2022 18th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"124 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115613835","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real Image Super-Resolution using GAN through modeling of LR and HR process","authors":"Rao Muhammad Umer, C. Micheloni","doi":"10.1109/AVSS56176.2022.9959415","DOIUrl":"https://doi.org/10.1109/AVSS56176.2022.9959415","url":null,"abstract":"The current existing deep image super-resolution methods usually assume that a Low Resolution (LR) image is bicubicly downscaled of a High Resolution (HR) image. However, such an ideal bicubic downsampling process is different from the real LR degradations, which usually come from complicated combinations of different degradation processes, such as camera blur, sensor noise, sharpening artifacts, JPEG compression, and further image editing, and several times image transmission over the internet and unpredictable noises. It leads to the highly ill-posed nature of the inverse upscaling problem. To address these issues, we propose a GAN-based SR approach with learnable adaptive sinusoidal nonlinearities incorporated in LR and SR models by directly learn degradation distributions and then synthesize paired LR/HR training data to train the generalized SR model to real image degradations. We demonstrate the effectiveness of our proposed approach in quantitative and qualitative experiments.","PeriodicalId":408581,"journal":{"name":"2022 18th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"2014 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121473302","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: FRIDA: Fisheye Re-Identification Dataset with Annotations
Authors: Mertcan Cokbas, John Bolognino, J. Konrad, P. Ishwar
Published in: 2022 18th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), doi: 10.1109/AVSS56176.2022.9959697

Abstract: Person re-identification (PRID) from side-mounted rectilinear-lens cameras is a well-studied problem. PRID from overhead fisheye cameras, on the other hand, is new and largely unstudied, primarily due to the lack of suitable image datasets. To fill this void, we introduce the "Fisheye Re-IDentification Dataset with Annotations" (FRIDA), with 240k+ bounding-box annotations of people captured by 3 time-synchronized, ceiling-mounted fisheye cameras in a large indoor space. Due to the field-of-view overlap, PRID in this case differs from the typical PRID problem, which we discuss in depth. We also evaluate the performance of 10 state-of-the-art PRID algorithms on FRIDA. We show that for 6 CNN-based algorithms, training on FRIDA boosts performance by up to 11.64 percentage points in mAP compared to training on a common rectilinear-camera PRID dataset. The dataset is available at vip.bu.edu/frida.
{"title":"Dynamic Background Subtraction by Generative Neural Networks","authors":"Fateme Bahri, Nilanjan Ray","doi":"10.1109/AVSS56176.2022.9959543","DOIUrl":"https://doi.org/10.1109/AVSS56176.2022.9959543","url":null,"abstract":"Background subtraction is a significant task in computer vision and an essential step for many real world applications. One of the challenges for background subtraction methods is dynamic background, which constitutes stochastic movements in some parts of the background. In this paper, we have proposed a new background subtraction method, called DBSGen, which uses two generative neural networks, one for dynamic motion removal and another for background generation. At the end, the foreground moving objects are obtained by a pixel-wise distance threshold based on a dynamic entropy map. DBSGen is an end-to-end, unsupervised optimization method with a near real-time frame rate. The performance of the method is evaluated over dynamic background sequences and it outperforms most of state-of-the-art unsupervised methods. Our code is publicly available at https://github.com/FatemeBahri/DBSGen.","PeriodicalId":408581,"journal":{"name":"2022 18th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123382956","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hand-Object Interaction Reasoning","authors":"Jian Ma, D. Damen","doi":"10.1109/AVSS56176.2022.9959207","DOIUrl":"https://doi.org/10.1109/AVSS56176.2022.9959207","url":null,"abstract":"This paper proposes an interaction reasoning network for modelling spatio-temporal relationships between hands and objects in egocentric video. The proposed interaction unit utilises a Transformer-style module to reason about each acting hand, and its spatio-temporal relations to the other hand as well as objects being interacted with. We show that modelling two-handed interactions are critical for action recognition in egocentric video, and demonstrate that by using positionally-encoded trajectories, the network can better recognise observed interactions. We train and evaluate our proposed network on large-scale egocentric EPIC-KITCHENS-100 and crowd-sourced Something-Else datasets, with an ablation study to showcase our proposal.","PeriodicalId":408581,"journal":{"name":"2022 18th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133553002","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}