{"title":"Anomalous Driving Detection for Traffic Surveillance Video Analysis","authors":"Hang Shi, Hadi Ghahremannezhad, Chengjun Liu","doi":"10.1109/ist50367.2021.9651372","DOIUrl":"https://doi.org/10.1109/ist50367.2021.9651372","url":null,"abstract":"Traffic safety is an important topic in intelligent transportation systems. One major factor that causes traffic accidents is anomalous driving. This paper presents a novel anomalous driving detection method for videos, which can detect unsafe anomalous driving behaviors. The contributions of this paper are three-fold. First, a new multiple object tracking (MOT) method is proposed to extract the velocities and trajectories of moving foreground objects in video. The new MOT method is a motion-based tracking method that integrates temporal and spatial features. Second, a novel Gaussian local velocity (GLV) modeling method is presented to model normal moving behavior in traffic videos. The GLV model is built for every location in the video frame and updated online. Third, a discrimination function is proposed to detect anomalous driving behaviors. Experimental results using real traffic data from the New Jersey Department of Transportation (NJDOT) show that the proposed method can perform anomalous driving detection quickly and accurately.","PeriodicalId":433402,"journal":{"name":"2021 IEEE International Conference on Imaging Systems and Techniques (IST)","volume":"110 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115733393","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep Convolutional Neural Networks for Road Crack Detection: Qualitative and Quantitative Comparisons","authors":"Jiahe Fan, M. J. Bocus, Li Wang, Rui Fan","doi":"10.1109/ist50367.2021.9651375","DOIUrl":"https://doi.org/10.1109/ist50367.2021.9651375","url":null,"abstract":"Road crack detection is a crucial civil infrastructure inspection task. It is generally performed by either certified inspectors or structural engineers; nevertheless, this process is time-consuming and subjective. Deep convolutional neural networks (DCNNs) have demonstrated compelling results for image classification, but there are currently no comprehensive comparisons among them with regard to road crack detection. Therefore, in this paper, we conduct extensive experiments to compare 30 state-of-the-art (SoTA) DCNNs for road crack detection: each DCNN is trained on a training set; the best-performing models are selected on the validation set; and their performance is further quantified on a test set with respect to six evaluation metrics: precision, recall, accuracy, F-score, area under the receiver operating characteristic curve (AUROC), and runtime. The experimental results suggest that road crack detection is a relatively easy image classification task, and all the SoTA DCNNs perform similarly. The DCNNs evaluated in this study also achieve very similar performance when only a small amount of training data is available. Furthermore, PNASNet achieves the best trade-off between speed and accuracy, and thus it is more practical for real-time and robust road crack detection. Moreover, the best DCNN models did not generalize well when tested on new, unseen data sets consisting of images not specifically related to road cracks.","PeriodicalId":433402,"journal":{"name":"2021 IEEE International Conference on Imaging Systems and Techniques (IST)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125910643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Effect of Data Noise on LSTM-FC Scattered-Field Processing for Microwave Imaging","authors":"A. Fedeli, M. Pastorino, A. Randazzo","doi":"10.1109/ist50367.2021.9651450","DOIUrl":"https://doi.org/10.1109/ist50367.2021.9651450","url":null,"abstract":"A strategy to mitigate model error in microwave imaging by introducing neural-network-based preprocessing of the scattered field is considered in this paper. In particular, the approach consists of a long short-term memory (LSTM) cell combined with fully-connected (FC) neural layers. Such a network, which works in the time domain, aims at extracting the scattered field contributions as if they were measured by a canonical two-dimensional setup with line-source antennas and ideal probing elements. The extracted data are then given as input to a quantitative tomographic technique formulated in the mathematical context of Lebesgue spaces with variable exponents. Here, the effect of input data noise on the whole imaging procedure is evaluated for the first time. Results obtained on simulated data involving circular dielectric cylinders are presented to assess the processing error and the imaging performance against the signal-to-noise ratio.","PeriodicalId":433402,"journal":{"name":"2021 IEEE International Conference on Imaging Systems and Techniques (IST)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122524484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Role of Machine Vision in Industry 4.0: an automotive manufacturing perspective","authors":"F. Konstantinidis, S. Mouroutsos, A. Gasteratos","doi":"10.1109/ist50367.2021.9651453","DOIUrl":"https://doi.org/10.1109/ist50367.2021.9651453","url":null,"abstract":"The challenging market of the automotive industry urges manufacturers worldwide to benefit from the remarkable technological advancements of the ongoing fourth industrial revolution (Industry 4.0). This revolutionary era fosters sensing, processing, and integration technologies across systems, with machine vision serving as the “eyes” of cyber-physical systems. This paper systematically provides a comprehensive analysis of the machine vision systems applied in the automotive industry over the last five years and anticipates technology opportunities and future trends. The conducted analysis reveals that machine vision technology is mainly employed for quality-related purposes. Although fruitful advancements have occurred in the other automotive manufacturing domains, horizontal and vertical integration is not a priority in their design. The authors conclude that computer vision systems empowered with self-adjustment capabilities should be integrated with existing execution systems to rectify defects in real time, thus promoting intelligent system design towards enabling zero-defect manufacturing at the human and system levels.","PeriodicalId":433402,"journal":{"name":"2021 IEEE International Conference on Imaging Systems and Techniques (IST)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129921331","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Monocularly Generated 3D High Level Semantic Model by Integrating Deep Learning Models and Traditional Vision Techniques","authors":"Steven Alsheimer, Zhigang Zhu","doi":"10.1109/ist50367.2021.9651471","DOIUrl":"https://doi.org/10.1109/ist50367.2021.9651471","url":null,"abstract":"Monodepth2 (monocular depth inference) enables scene reconstruction by providing depth maps from a single RGB camera, but its outputs are filled with noise and inconsistencies. Instance segmentation using a Mask R-CNN (Region-Based Convolutional Neural Networks) deep model can provide object segmentation results in 2D but lacks 3D information. In this paper, we propose to integrate the results of instance segmentation via Mask R-CNN, CAD-model car shape alignment, and depth from Monodepth2 with classical dynamic vision techniques to create a high-level semantic model with separability, robustness, consistency, and saliency. The model is useful for virtualized rendering, semantic augmented reality, and automatic driving. Experimental results are provided to validate the approach.","PeriodicalId":433402,"journal":{"name":"2021 IEEE International Conference on Imaging Systems and Techniques (IST)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128778878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dual-Branch CNN for the Identification of Recyclable Materials","authors":"Anton Vogiatzis, G. Chalkiadakis, K. Moirogiorgou, G. Livanos, M. Papadogiorgaki, M. Zervakis","doi":"10.1109/ist50367.2021.9651347","DOIUrl":"https://doi.org/10.1109/ist50367.2021.9651347","url":null,"abstract":"The classification of recyclable materials, and in particular the recovery of plastic, plays an important role in the economy as well as in environmental sustainability. This study presents a novel image classification model that can be used efficiently to distinguish recyclable materials. Building on recent work in deep learning and waste classification, we introduce the so-called “Dual-branch Multi-output CNN”, a custom convolutional neural network composed of two branches that aim to i) classify recyclables and ii) distinguish the type of plastic. The proposed architecture is composed of two classifiers trained on two different datasets, so as to encode complementary attributes of the recyclable materials. In our work, the Densenet121, ResNet50, and VGG16 architectures were used on the Trashnet dataset, along with data augmentation techniques, as well as on the WaDaBa dataset with physical variation techniques. In particular, our approach makes use of the joint utilization of the datasets, allowing the learning of disjoint label combinations. Our experiments confirm its effectiveness in the classification of waste material.","PeriodicalId":433402,"journal":{"name":"2021 IEEE International Conference on Imaging Systems and Techniques (IST)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130201776","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extraction of Laryngeal Cancer Informative Frames from Narrow Band Endoscopic Videos","authors":"Noha A. Sobhi, S. Youssef, Marwa ElShenawy","doi":"10.1109/ist50367.2021.9651343","DOIUrl":"https://doi.org/10.1109/ist50367.2021.9651343","url":null,"abstract":"Laryngeal cancer is one of the most common types of throat cancer. One of the most powerful diagnostic technologies is the narrow-band imaging (NBI) endoscope, which helps in diagnosing early-stage cancer and reducing biopsy risks. However, reviewing an endoscopic video is a labor-intensive process, as it contains a large number of uninformative frames due to illumination or reflection effects and the appearance of saliva. This paper aims at designing and implementing an enhanced automated model that selects the informative laryngoscope video frames, reducing the computational time of scanning all the frames. The selection of informative frames will also help the specialist in the diagnosis process. The proposed model uses a set of quality assessment features, including texture and color features. Texture features are used to detect the sharpness of the image, an important measure of image clarity, while color features help in identifying images with saliva or specular reflections. The extracted features are then fed to different classifiers, such as a Support Vector Machine (SVM) and different ensemble classifiers, which classify each video frame into one of four types: informative (I), blurred (B), saliva or specular reflections (S), and underexposed (U). The experimental results show that the Random Forest (RF) classifier produced a very promising classification result, with an average classification recall equal to 95.8%. The proposed model obtained better classification recall by 2.2% compared to the existing state-of-the-art method.","PeriodicalId":433402,"journal":{"name":"2021 IEEE International Conference on Imaging Systems and Techniques (IST)","volume":"144 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133873674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Single frontal projection 2D/3D switchable display based on integral imaging","authors":"Qiang Li, F. Zhong, Wei He, Huan Deng","doi":"10.1109/ist50367.2021.9651439","DOIUrl":"https://doi.org/10.1109/ist50367.2021.9651439","url":null,"abstract":"In this paper, we propose a single frontal projection 2D/3D switchable display using a polarization-dependent liquid crystal lens array (LCLA) and a polymer dispersed liquid crystal (PDLC). By simply controlling the working state of the PDLC, 2D/3D switching can be achieved. When the PDLC with an applied voltage is in the transparent state, the 3D images are reconstructed by modulating the light field twice using the polarization-dependent LCLA. When the PDLC without an applied voltage is in the scattering state, the system can present 2D images directly. We experimentally demonstrate a prototype of the proposed 2D/3D switchable display. The results show that the proposed system can present 2D and 3D images with high image quality, and the response time of switching between modes is 200 ms. Moreover, the proposed system is compact and simple, making it suitable for applications in cinemas and home theatres.","PeriodicalId":433402,"journal":{"name":"2021 IEEE International Conference on Imaging Systems and Techniques (IST)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124395730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GPU-accelerated Localization in Confined Spaces using Deep Geometric Features","authors":"R. Brogaard, Ole Ravn, Evangelos Boukas","doi":"10.1109/ist50367.2021.9651425","DOIUrl":"https://doi.org/10.1109/ist50367.2021.9651425","url":null,"abstract":"Navigating within dark and confined spaces requires robotic platforms to utilize accurate and reliable localization systems to operate safely and unattended. This paper presents an absolute localization system for known confined spaces using state-of-the-art 3D point cloud descriptors. Local geometric features are extracted from a known map and registered to matching features visible in the robot's field of view. The 3D registrations are motion-filtered and fused with a visual-inertial odometry estimate in an extended Kalman filter, which returns a fast and accurate absolute pose estimate. The proposed localization system is tested with different deep learning feature descriptors in a structured confined space, and our results indicate greater accuracy and lower processing time when compared to mainstream 3D registration approaches.","PeriodicalId":433402,"journal":{"name":"2021 IEEE International Conference on Imaging Systems and Techniques (IST)","volume":"56 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123494523","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detection and Sizing of Surface Cracks in Metals Using UHF Probe","authors":"Mohammed Saif ur Rahman, A. Mustapha, M. Abou-Khousa","doi":"10.1109/ist50367.2021.9651442","DOIUrl":"https://doi.org/10.1109/ist50367.2021.9651442","url":null,"abstract":"Crack formation on metal surfaces is a common industrial problem that can have catastrophic consequences if left undetected. While a variety of microwave and millimeter-wave probes have been employed for crack detection, this paper explores the utility of an ultra-high frequency (UHF) probe that operates at a much lower frequency than other probes while providing enhanced resolution. To benchmark the performance of the probe for crack sizing, its response is compared qualitatively and quantitatively to a standard Ka-band open-ended rectangular waveguide and a V-band reflectometer through line scans over a metal sample with cracks of widths ranging from 0.25 mm to 1.75 mm. It is shown that the probe provides comparable indications of the cracks and has a better signal-to-noise ratio (SNR) than the Ka-band waveguide. Furthermore, a magnitude image of the sample is included to highlight the effectiveness of the probe in providing a high-resolution image of the sample at a much lower frequency.","PeriodicalId":433402,"journal":{"name":"2021 IEEE International Conference on Imaging Systems and Techniques (IST)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128720428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}