{"title":"License Plate Detection and Recognition System for All Types of Bangladeshi Vehicles Using Multi-step Deep Learning Model","authors":"Homaira Huda Shomee, Ataher Sams","doi":"10.1109/DICTA52665.2021.9647284","DOIUrl":"https://doi.org/10.1109/DICTA52665.2021.9647284","url":null,"abstract":"A robust license plate (LP) detection and recognition system can extract license plate information from a still image or video of a moving or stationary vehicle. Bangla license plate recognition is a complicated subject of study due to the lack of a publicly available dataset and its specific characteristics, with over 100 unique classes including words, letters, and digits. This paper proposes a robust multi-step deep learning system based on the You Only Look Once (YOLO) architecture that can extract license plate information from a real-world image. The resulting system localizes license plates using a YOLOv4 object detector, automatically crops the license plates using bounding box coordinates, enhances the extracted license plate image quality using Enhanced Super-Resolution Generative Adversarial Networks (ESRGAN), and then recognizes the classes using YOLOv4 without segmenting the characters. Synthetic images have been used to make the proposed method capable of recognizing the classes in unfavorable and complicated conditions. A complete two-part dataset named ‘Bangla LPDB-A’ is created in this study. This dataset includes Bangladeshi vehicle images with manually annotated license plates, and cropped license plates with manually annotated words, letters, and digits. The proposed system, tested on this dataset, achieves mean average precision (mAP) scores of 98.35% and 98.09% for the final detection and recognition models, with average prediction times of 23 ms and 35 ms, respectively.","PeriodicalId":424950,"journal":{"name":"2021 Digital Image Computing: Techniques and Applications (DICTA)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126051386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
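The multi-step pipeline described in the abstract above (detect plates, crop via bounding box coordinates, enhance, then recognize without character segmentation) can be sketched as simple orchestration code. This is a minimal illustration with stand-in callables, not the authors' trained YOLOv4/ESRGAN models; all function names and the toy image are assumptions.

```python
# Hypothetical sketch of a detect -> crop -> enhance -> recognize pipeline.
# The detector/enhancer/recognizer callables are placeholders standing in for
# YOLOv4, ESRGAN, and the recognition model described in the abstract.

def crop_plate(image, box):
    """Crop a (x, y, w, h) bounding box from a 2D image given as a list of rows."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]

def run_pipeline(image, detector, enhancer, recognizer):
    """Run detection, cropping, enhancement, and recognition in sequence."""
    results = []
    for box in detector(image):            # detector returns bounding boxes
        plate = crop_plate(image, box)     # automatic crop from box coordinates
        plate = enhancer(plate)            # image-quality enhancement step
        results.append(recognizer(plate))  # segmentation-free recognition
    return results

# Toy stand-ins to exercise the control flow.
image = [list(row) for row in ["....", ".AB.", ".12.", "...."]]
detector = lambda img: [(1, 1, 2, 2)]            # one plate at (1, 1), size 2x2
enhancer = lambda plate: plate                   # identity placeholder
recognizer = lambda plate: "".join("".join(r) for r in plate)

print(run_pipeline(image, detector, enhancer, recognizer))  # ['AB12']
```

The pipeline stages are deliberately decoupled behind callables, mirroring how the abstract chains independent models.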
{"title":"GuideNet: Learning Inter-Vertebral Guides in DXA Lateral Spine Images","authors":"Zaid Ilyas, Naeha Sharif, J. Schousboe, J. Lewis, D. Suter, S. Z. Gilani","doi":"10.1109/DICTA52665.2021.9647067","DOIUrl":"https://doi.org/10.1109/DICTA52665.2021.9647067","url":null,"abstract":"Cardiovascular Disease (CVD) is the leading cause of death worldwide. Calcification in the abdominal aorta is a stable marker of CVD development; hence, its early detection is considered crucial to saving lives. Imaging techniques such as Computed Tomography (CT) and Digital X-Ray Imaging can be used to accurately predict and localize Abdominal Aortic Calcification (AAC); however, these methods are not only expensive but also expose patients to high levels of ionizing radiation. In contrast, Dual Energy X-ray Absorptiometry (DXA) is an efficient, cost-effective and low-radiation imaging alternative, but with challenges such as low resolution and vague vertebral boundaries. This poses a bottleneck in identifying the vertebrae and their boundaries, which is crucial in manual as well as automatic scoring of AAC from DXA scans. In this paper, we address this research gap by proposing a framework which first localizes the vertebrae T12, L1, L2, L3, L4 and L5 and then generates Inter-Vertebral Guides (IVGs) between them. Our deep model is trained on lateral-view DXA spine images and shows promising results in generating IVGs with high accuracy, which we believe can greatly reduce inter-observer variability in AAC scoring in the DXA imaging domain.","PeriodicalId":424950,"journal":{"name":"2021 Digital Image Computing: Techniques and Applications (DICTA)","volume":"3 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123665711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Building Boundary Extraction from LiDAR Point Cloud Data","authors":"E. Dey, M. Awrangjeb, F. T. Kurdi, Bela Stantic","doi":"10.1109/DICTA52665.2021.9647371","DOIUrl":"https://doi.org/10.1109/DICTA52665.2021.9647371","url":null,"abstract":"Building boundary extraction from LiDAR point cloud data is important for urban planning and 3D modelling. Due to the uneven point distribution, missing data, and occlusion in LiDAR point cloud data, extraction of boundary points is challenging. Existing approaches have shortcomings either in detecting boundary points on concave shapes or in separately identifying ‘hole’ boundary points inside the building roof. This paper presents a method for detecting both inner and outer boundary points of the extracted building point cloud. Based on the properties of Delaunay triangulation and each point's distance from the mean of its computed neighbourhood, we extract both inner and outer boundary points. Experimental results on synthetic shapes as well as real datasets show the competitive performance of the proposed method.","PeriodicalId":424950,"journal":{"name":"2021 Digital Image Computing: Techniques and Applications (DICTA)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122107816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
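The neighbourhood-mean criterion mentioned in the abstract above can be illustrated with a small sketch: a point lying far from the centroid of its k nearest neighbours tends to be a boundary point, since interior points sit near their neighbourhood mean. The values of k and the threshold below are illustrative assumptions, not the paper's tuned parameters, and the Delaunay triangulation step is omitted.

```python
# Toy 2D illustration of boundary detection via distance from the
# neighbourhood centroid (an assumed simplification of the criterion above).
import math

def neighbourhood_mean_boundary(points, k=4, threshold=0.3):
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])
    boundary = []
    for p in points:
        # k nearest neighbours of p (excluding p itself)
        neighbours = sorted((q for q in points if q != p),
                            key=lambda q: dist(p, q))[:k]
        cx = sum(q[0] for q in neighbours) / k
        cy = sum(q[1] for q in neighbours) / k
        if dist(p, (cx, cy)) > threshold:  # far from neighbourhood centroid
            boundary.append(p)
    return boundary

# On a 4x4 grid, edge and corner points have asymmetric neighbourhoods, so
# their neighbourhood centroid is pulled inward and they get flagged.
grid = [(x, y) for x in range(4) for y in range(4)]
print(sorted(neighbourhood_mean_boundary(grid)))  # the 12 perimeter points
```

The same asymmetry argument is what makes inner ‘hole’ boundaries detectable: points ringing a hole also have one-sided neighbourhoods.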
{"title":"A Novel Class-wise Forgetting Detector in Continual Learning","authors":"Xuan Cuong Pham, Alan Wee-Chung Liew, Can Wang","doi":"10.1109/DICTA52665.2021.9647137","DOIUrl":"https://doi.org/10.1109/DICTA52665.2021.9647137","url":null,"abstract":"Deep learning models suffer from catastrophic forgetting when learning continuously from streaming data. Existing continual learning strategies assume that forgetting always happens when learning a new task, and deal only with global forgetting of previous tasks. This study introduces a novel active forgetting detector based on a windowing technique that monitors the model's forgetting rate for each encountered class label. When the model experiences forgetting, we adapt to the forgotten classes using a proposed experience-replay method called online triplet rehearsal. We conduct comprehensive experiments on four vision datasets to demonstrate that the proposed approach performs significantly better than three state-of-the-art continual learning methods.","PeriodicalId":424950,"journal":{"name":"2021 Digital Image Computing: Techniques and Applications (DICTA)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128531600","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
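A class-wise, window-based forgetting detector in the spirit of the abstract above can be sketched as follows. The class name, window size, and accuracy threshold are assumptions for illustration, not the paper's formulation: per class, recent prediction outcomes are kept in a fixed-size window, and a class is flagged as being forgotten when its windowed accuracy drops below the threshold.

```python
# Hedged sketch of a sliding-window, per-class forgetting monitor.
from collections import defaultdict, deque

class ForgettingDetector:
    def __init__(self, window=5, threshold=0.5):
        self.window = window
        self.threshold = threshold
        # one fixed-size outcome window per class label
        self.history = defaultdict(lambda: deque(maxlen=window))

    def update(self, label, correct):
        """Record one prediction outcome (True/False) for a class label."""
        self.history[label].append(bool(correct))

    def forgetting_classes(self):
        """Classes whose windowed accuracy fell below the threshold."""
        return [label for label, h in self.history.items()
                if len(h) == self.window
                and sum(h) / len(h) < self.threshold]

det = ForgettingDetector(window=4, threshold=0.5)
for ok in [True, True, True, True]:
    det.update("cat", ok)           # 'cat' is still remembered (accuracy 1.0)
for ok in [True, False, False, False]:
    det.update("dog", ok)           # 'dog' accuracy drops to 0.25
print(det.forgetting_classes())     # ['dog']
```

In a full system, the flagged classes would then trigger targeted rehearsal (the abstract's online triplet rehearsal) rather than blanket replay of all old data.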
{"title":"Combining Data Augmentation and Domain Distance Minimisation to Reduce Domain Generalisation Error","authors":"Hoang Son Le, Rini Akmeliawati, G. Carneiro","doi":"10.1109/DICTA52665.2021.9647203","DOIUrl":"https://doi.org/10.1109/DICTA52665.2021.9647203","url":null,"abstract":"Domain generalisation represents the challenging problem of using multiple training domains to learn a model that can generalise to previously unseen target domains. Recent papers have proposed using data augmentation to produce realistic adversarial examples that simulate domain shift. Under current domain adaptation/generalisation theory, it is unclear whether training with data augmentation alone is sufficient to improve domain generalisation results. We propose an extension of the current domain generalisation theoretical framework and a new method that combines data augmentation and domain distance minimisation to reduce the upper bound on domain generalisation error. Empirically, our algorithm produces competitive results when compared with state-of-the-art methods on the domain generalisation benchmark PACS. We have also performed an ablation study of the technique on a real-world chest X-ray dataset, consisting of subsets of the CheXpert, Chest14, and PadChest datasets. The results show that the proposed method works best when the augmented domains are realistic, but it can perform robustly even when domain augmentation fails to produce realistic samples.","PeriodicalId":424950,"journal":{"name":"2021 Digital Image Computing: Techniques and Applications (DICTA)","volume":"163 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124560702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
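The general shape of an objective combining augmentation with domain distance minimisation, as described in the abstract above, can be sketched in a few lines. This is an assumed toy formulation (squared distance between mean feature vectors as the domain-distance proxy, weight `lam`), not the paper's exact loss or bound.

```python
# Toy sketch: task loss plus a weighted penalty on the distance between the
# feature distributions of the original and augmented domains.

def domain_distance(feat_a, feat_b):
    """Squared distance between mean feature vectors of two feature batches
    (a simple stand-in for a proper distribution distance)."""
    dim = len(feat_a[0])
    mean_a = [sum(f[i] for f in feat_a) / len(feat_a) for i in range(dim)]
    mean_b = [sum(f[i] for f in feat_b) / len(feat_b) for i in range(dim)]
    return sum((a - b) ** 2 for a, b in zip(mean_a, mean_b))

def combined_loss(task_loss, feat_orig, feat_aug, lam=0.1):
    """Augmented training alone minimises task_loss; adding the distance term
    also pulls the two domains' feature statistics together."""
    return task_loss + lam * domain_distance(feat_orig, feat_aug)

orig = [[1.0, 2.0], [3.0, 4.0]]   # features from original-domain inputs
aug  = [[1.0, 2.0], [3.0, 4.0]]   # identical augmented features -> no penalty
print(combined_loss(0.5, orig, aug))           # 0.5
print(combined_loss(0.5, orig, [[3.0, 4.0]]))  # 0.5 + 0.1 * 2.0
```

Minimising the second term is what distinguishes this from training on augmented data alone, matching the abstract's claim that augmentation by itself may not tighten the generalisation bound.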
{"title":"Towards Automated Performance Assessment for Laparoscopic Box Trainer using Cross-Stage Partial Network","authors":"Koloud N. Alkhamaiseh, J. Grantner, Saad A. Shebrain, I. Abdel-Qader","doi":"10.1109/DICTA52665.2021.9647393","DOIUrl":"https://doi.org/10.1109/DICTA52665.2021.9647393","url":null,"abstract":"Recent advances in laparoscopic surgery have increased the need to improve surgical resident training and feedback by incorporating simulator-based training into traditional training programs. However, current training methods still require the presence of an expert surgeon to assess the surgical dexterity of the trainee. This process is time-consuming and may lead to subjective assessment. This research aims to extend the application of object detection in laparoscopy training to tool motion tracking and surgical object detection and tracking. YOLOv5 and Scaled-YOLOv4 object detection neural networks, based on cross-stage partial networks (CSP), are trained and tested on the Fundamentals of Laparoscopic Surgery (FLS) pattern cutting exercise in a box trainer. Experiments show that Scaled-YOLOv4 achieves a mAP of 98.9, a precision of 79.5, and a recall of 98.9 for bounding boxes on a limited training dataset. This research clearly demonstrates the potential of using CSP networks in automated tool motion analysis for the assessment of a resident's performance during training.","PeriodicalId":424950,"journal":{"name":"2021 Digital Image Computing: Techniques and Applications (DICTA)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129076384","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SimilarityGAN: Using Similarity to Loosen Structural Constraints in Generative Adversarial Models","authors":"Edward Collier, S. Mukhopadhyay","doi":"10.1109/DICTA52665.2021.9647086","DOIUrl":"https://doi.org/10.1109/DICTA52665.2021.9647086","url":null,"abstract":"Recently, generative adversarial networks have performed extremely well in image translation. When translating images, current models adhere to a strict structural symmetry between the input and output images. This paper presents a technique for image translation involving a pair of image domains that allows the output image to go beyond the structural symmetry constraints imposed by the input. By using a siamese model as the discriminator, we condition the generator to produce images that are only similar, rather than identical, to the input. We show experimentally that, using this modified loss, a generator can produce realistic images for complex problems that only loosely adhere to the structure of the input.","PeriodicalId":424950,"journal":{"name":"2021 Digital Image Computing: Techniques and Applications (DICTA)","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116143652","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
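The core idea in the abstract above, rewarding similarity rather than identity, can be illustrated with a toy loss. This is an assumed formulation (cosine similarity on feature vectors, target similarity 0.8), not the paper's siamese discriminator: the generator is penalized for being either too different from or too identical to the input.

```python
# Toy sketch: penalize deviation from a *target* similarity instead of
# demanding exact structural identity with the input.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def similarity_loss(input_feats, output_feats, target=0.8):
    """Zero when the output is similar-but-not-identical to the input,
    loosening the structural symmetry constraint."""
    return (cosine_similarity(input_feats, output_feats) - target) ** 2

identical = similarity_loss([1.0, 0.0], [1.0, 0.0])  # similarity 1.0: penalized
loose     = similarity_loss([1.0, 0.0], [0.8, 0.6])  # similarity 0.8: near zero
print(identical > loose)  # True
```

In the paper's setting the similarity score would come from a learned siamese discriminator rather than a fixed cosine metric; the loss shape is the point of the sketch.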
{"title":"Resource-Constrained Human Presence Detection for Indirect Time-of-Flight Sensors","authors":"Caterina Nahler, H. Plank, C. Steger, N. Druml","doi":"10.1109/DICTA52665.2021.9647286","DOIUrl":"https://doi.org/10.1109/DICTA52665.2021.9647286","url":null,"abstract":"Face recognition with LiDAR and Time-of-Flight sensors is increasingly used in smartphones to automatically unlock devices when the user is present. The problem is that the presence of the user has to be detected before the energy-intensive face recognition can be started. This work presents a solution by introducing an energy-efficient measurement method for indirect Time-of-Flight sensors, which includes an on-chip processing method to detect the user's presence directly on a Time-of-Flight image sensor. The presented method is based on histogram analysis of direct reflectance images. The unique use of direct reflectance images is a significant enabler, as it reduces histogram variations. Therefore, our method outperforms conventional histogram detection methods and enables low-power on-chip face detection. It further enables the Time-of-Flight sensor to notify the smartphone application processor to perform the actual face recognition for device unlocking. Our evaluation shows that our method has a false-positive rate of only 17% while detecting all faces. This is very promising, since our measurement method reduces the processing cost, which could lead to substantial power savings in future smartphone generations.","PeriodicalId":424950,"journal":{"name":"2021 Digital Image Computing: Techniques and Applications (DICTA)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116512675","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
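Histogram analysis of reflectance images, the mechanism named in the abstract above, can be illustrated with a small sketch. This is not the authors' on-chip implementation; the bin count, the "bright bins" cutoff, and the minimum-fraction threshold are all illustrative assumptions. A close, reflective object such as a face shifts mass into the bright histogram bins, which is cheap to test on-sensor.

```python
# Toy sketch: flag user presence when enough pixels fall into the brightest
# reflectance-histogram bins (a close object reflects strongly).

def reflectance_histogram(frame, bins=8, max_val=256):
    """Histogram of per-pixel reflectance values (frame is a flat list)."""
    hist = [0] * bins
    width = max_val // bins
    for v in frame:
        hist[min(v // width, bins - 1)] += 1
    return hist

def user_present(frame, bright_bin=6, min_fraction=0.1):
    """True when at least min_fraction of pixels land in the bright bins."""
    hist = reflectance_histogram(frame)
    bright = sum(hist[bright_bin:])       # pixels in the brightest bins
    return bright / len(frame) >= min_fraction

background = [20] * 90 + [40] * 10        # dim scene, no close object
with_user  = [20] * 70 + [220] * 30       # 30% bright pixels from a near face

print(user_present(background), user_present(with_user))  # False True
```

Only when this cheap check fires would the sensor wake the application processor for full face recognition, which is where the power saving comes from.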
{"title":"Modeling Human Skeleton Joint Dynamics for Fall Detection","authors":"Sania Zahan, G. Hassan, A. Mian","doi":"10.1109/DICTA52665.2021.9647270","DOIUrl":"https://doi.org/10.1109/DICTA52665.2021.9647270","url":null,"abstract":"The increasing pace of population aging calls for better care and support systems. Falling is a frequent and critical problem for elderly people, causing serious long-term health issues. Fall detection from video streams is not an attractive option for real-life applications due to privacy issues. Existing methods try to resolve this issue by using very low-resolution cameras or video encryption; however, privacy cannot be ensured completely with such approaches. Key points on the body, such as skeleton joints, can convey significant information about motion dynamics and successive posture changes, which are crucial for fall detection. Skeleton joints have been explored for feature extraction, but with image recognition models that ignore joint dependencies across frames, which are important for action classification. Moreover, existing models are over-parameterized or evaluated on small datasets with very few activity classes. We propose an efficient graph convolution network model that exploits spatio-temporal joint dependencies and the dynamics of human skeleton joints for accurate fall detection. Our method leverages a dynamic representation with robust concurrent spatio-temporal characteristics of skeleton joints. We performed extensive experiments on three large-scale datasets. With a significantly smaller model size than most existing methods, our proposed method achieves state-of-the-art results on the large-scale NTU datasets.","PeriodicalId":424950,"journal":{"name":"2021 Digital Image Computing: Techniques and Applications (DICTA)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130481147","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}