{"title":"Real or Fake? A Practical Method for Detecting Tempered Images","authors":"Ching-yu Kao, Hongjia Wan, Karla Markert, Konstantin Böttinger","doi":"10.1109/IPAS55744.2022.10052973","DOIUrl":"https://doi.org/10.1109/IPAS55744.2022.10052973","url":null,"abstract":"Tempering images has become technology that almost everyone can complete, including fake news, fake evidence presented in court, or forged documents. The main reason is because these editing tools, such as Photoshop, is simple to use, which is an urgent issue we need to solve. Hence, automatic tools helping to find manipulated images apart is critical for fighting misinformation campaigns. Here we propose and evaluate a neural network-based method. It can detect whether images have been artificially modified (classification), and further indicate the forged parts (segmentation). Our proposed method has better performance than most baseline methods. Last but not least, our method is not only effective on JPEG format, but can also be used on other formats.","PeriodicalId":322228,"journal":{"name":"2022 IEEE 5th International Conference on Image Processing Applications and Systems (IPAS)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132337365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel license plate detection based Time-To-Collision calculation for forward collision warning using Azure Kinect","authors":"Zhouyan Qiu, J. Martínez-Sánchez, P. Arias","doi":"10.1109/IPAS55744.2022.10053071","DOIUrl":"https://doi.org/10.1109/IPAS55744.2022.10053071","url":null,"abstract":"Forward Collision Warning (FCW) system constantly measures the relative position of the vehicle ahead and then predicts collisions. This paper proposes a new cost-effective and computationally efficient FCW method that uses a time-of-flight (ToF) camera to measure relevant distances to the front vehicle based on license plate detection. First, a Yolo V7 model is used to detect license plates to identify vehicles in front of the ego vehicle. Second, the distance between the front vehicle and the ego vehicle is determined by analyzing the captured depth map by the time-of-flight camera. In addition, the relative speed of the vehicle can be calculated by the direct distance change between the license plate and the camera between two consecutive frames. With a processing speed of 25–30 frames per second, the proposed FCW system is capable of determining relative distances and speeds within 26 meters in the real-time.","PeriodicalId":322228,"journal":{"name":"2022 IEEE 5th International Conference on Image Processing Applications and Systems (IPAS)","volume":"409 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115935091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Accurate Medicinal Plant Identification in Natural Environments by Embedding Mutual Information in a Convolution Neural Network Model","authors":"Lida Shahmiri, P. Wong, L. Dooley","doi":"10.1109/IPAS55744.2022.10053008","DOIUrl":"https://doi.org/10.1109/IPAS55744.2022.10053008","url":null,"abstract":"Medicinal plants are a primary source of disease treatment in many countries. As most are edible however, consumption of the wrong herbal plants can have serious consequences and even lead to death. Automatic accurate recognition of plant species to help users who do not have specialist knowledge of herbal plants is thus a desirable aim. Several automatic medicinal plant identification systems have been proposed, though most are significantly constrained either in the small number of species or in requiring manual image segmentation of plant leaves. This means they are captured on a plain background rather than being readily identified in their natural surroundings, which often involve complex and noisy backgrounds. While deep learning (DL) based methods have made considerable strides in recent times, their potential has not always been maximised because they are trained with samples which are not always fully representative of the intra-class and inter-class differences between the plant species concerned. This paper addresses this challenge by incorporating mutual information into a Convolutional Neural Network (CNN) model to select samples for the training, validation, and testing sets based on a similarity measure. A critical comparative evaluation of this new CNN medicinal plant classification model incorporating a mutual information guided training (MIGT) algorithm for sample selection, corroborates the superior classification performance achieved for the VNPlant-200 dataset, with an average accuracy of more than 97%, while the precision and recall values are also consistently above 97%. This is significantly better than existing CNN classification methods for this dataset as it crucially means false positive rates are substantially lower thus affording improved identification reliability.","PeriodicalId":322228,"journal":{"name":"2022 IEEE 5th International Conference on Image Processing Applications and Systems (IPAS)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124281898","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A CNN Architecture for Detection and Segmentation of Colorectal Polyps from CCE Images","authors":"A. Tashk, Kasim E. Şahin, J. Herp, E. Nadimi","doi":"10.1109/IPAS55744.2022.10052795","DOIUrl":"https://doi.org/10.1109/IPAS55744.2022.10052795","url":null,"abstract":"Colon capsule endoscopy (CCE) as a novel 2D biomedical image modality based on visible light provides a higher perspective of the potential gastrointestinal lesions like polyps within the small and large intestines than the conventional colonoscopy. As the quality of images acquired via CCE imagery is low, so the artificial intelligence methods are proposed to help detect and localize polyps within an acceptable level of efficiency and performance. In this paper, a new deep neural network architecture known as AID-U-Net is proposed. AID-U-Net consists of two distinct types of paths: a) Two main contracting/expansive paths, and b) Two sub-contracting/expansive paths. The playing role of the main paths is to localize polyps as the target objectives in high resolution and multi-scale manner, while the two sub paths are responsible for preserving and conveying the information of low resolution and low-scale target objects. Furthermore, the proposed network architecture provides simplicity so that the model can be deployed for real time processing. AID-U-Net with an implementation of a VGG19 backbone shows better performance to detect polyps in CCE images in comparison with the other state-of-the-art U-Net models like conventional U-Net, U-Net++, and U-Net3+ with different pre-trained backbones like ImageNet, VGG19, ResNeXt50, Resnet50, InceptionV3 and InceptionResNetV2.","PeriodicalId":322228,"journal":{"name":"2022 IEEE 5th International Conference on Image Processing Applications and Systems (IPAS)","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124844166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Experimental Validation of Photogrammetry based 3D Reconstruction Software","authors":"Razeen Hussain, Marianna Pizzo, Giorgio Ballestin, Manuela Chessa, F. Solari","doi":"10.1109/IPAS55744.2022.10053055","DOIUrl":"https://doi.org/10.1109/IPAS55744.2022.10053055","url":null,"abstract":"3D reconstruction is of interest to several fields. However, obtaining the 3D model is usually a time-consuming task that involves manual measurements and reproduction of the object using CAD software, which is not always feasible (e.g. for organic shapes). The necessity of quickly obtaining a dimensionally accurate 3D model of an object has led to the development of several reconstruction techniques, either vision based (with photogrammetry), using laser scanners, or a combination of the two. The contribution of this study is in the analysis of the performances of currently available 3D reconstruction frameworks with the aim of providing a guideline to novice users who may be unfamiliar with 3D reconstruction technologies. We evaluate various software packages on a synthetic dataset representing objects of various shapes and sizes. For comparison, we consider various metrics such as mean errors in the reconstructed cloud point and meshes and reconstruction time. Our results indicate that Colmap produces the best reconstruction.","PeriodicalId":322228,"journal":{"name":"2022 IEEE 5th International Conference on Image Processing Applications and Systems (IPAS)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122899133","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Vision Transformer for Automatic Student Engagement Estimation","authors":"Sandeep Mandia, Kuldeep Singh, R. Mitharwal","doi":"10.1109/IPAS55744.2022.10052945","DOIUrl":"https://doi.org/10.1109/IPAS55744.2022.10052945","url":null,"abstract":"Availability of the internet and quality of content attracted more learners to online platforms that are stimulated by COVID-19. Students of different cognitive capabilities join the learning process. However, it is challenging for the instructor to identify the level of comprehension of the individual learner, specifically when they waver in responding to feedback. The learner's facial expressions relate to content comprehension and engagement. This paper presents use of the vision transformer (ViT) to model automatic estimation of student engagement by learning the end-to-end features from facial images. The ViT architecture is used to enlarge the receptive field of the architecture by exploiting the multi-head attention operations. The model is trained using various loss functions to handle class imbalance. The ViT is evaluated on Dataset for Affective States in E-Environments (DAiSEE); it outperformed frame level baseline result by approximately 8% and the other two video level benchmarks by 8.78% and 2.78% achieving an overall accuracy of 55.18%. In addition, ViT with focal loss was also able to produce well distribution among classes except for one minority class.","PeriodicalId":322228,"journal":{"name":"2022 IEEE 5th International Conference on Image Processing Applications and Systems (IPAS)","volume":"138 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115756216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"6D Pose Estimation for Precision Assembly","authors":"Ola Skeik, M. S. Erden, X. Kong","doi":"10.1109/IPAS55744.2022.10052989","DOIUrl":"https://doi.org/10.1109/IPAS55744.2022.10052989","url":null,"abstract":"The assembly of 3D products with complex geometry and material, such as a concentrator photovoltaics solar panel unit, is typically conducted manually. This results in low efficiency, precision and throughput. This study is motivated by an actual industrial need and targeted towards automation of the currently manual assembly process. By replacing the manual assembly with robotic assembly systems, the efficiency and throughput could be improved. Prior to assembly, it is essential to estimate the pose of the objects to be assembled with high precision. The choice of the machine vision is important and plays a critical role in the overall accuracy of such a complex task. Therefore, this work focuses on the 6D pose estimation for precision assembly utilizing a 3D vision sensor. The sensor we use is a 3D structured light scanner which can generate high quality point cloud data in addition to 2D images. A 6D pose estimation method is developed for an actual industrial solar-cell object, which is one of the four objects of an assembly unit of concentrator photovoltaics solar panel. The proposed approach is a hybrid approach where a mask R-CNN network is trained on our custom dataset and the trained model is utilized such that the predicted 2D bounding boxes are used for point cloud segmentation. Then, the iterative closest point algorithm is used to estimate the object's pose by matching the CAD model to the segmented object in point cloud.","PeriodicalId":322228,"journal":{"name":"2022 IEEE 5th International Conference on Image Processing Applications and Systems (IPAS)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122589274","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AI based Automatic Vehicle Detection from Unmanned Aerial Vehicles (UAV) using YOLOv5 Model","authors":"A. Panthakkan, N. Valappil, S. Al-Mansoori, Hussain Al-Ahmad","doi":"10.1109/IPAS55744.2022.10053056","DOIUrl":"https://doi.org/10.1109/IPAS55744.2022.10053056","url":null,"abstract":"Unmanned aerial vehicle (UAV) detection of moving vehicles is becoming into a significant study area in traffic control, surveillance, and military applications. The challenge arises in keeping minimal computational complexity allowing the system to be real-time as well. Applications of vehicle detection from UAVs include traffic parameter estimation, violation detection, number plate reading, and parking lot monitoring. The one stage detection model, YOLOv5 is used in this research work to develop a deep neural model-based vehicle detection system on highways from UAVs. In our system, several improvised strategies are put forth that are appropriate for small vehicle recognition under an aerial view angle which can accomplish real-time detection and high accuracy by incorporating an optimal pooling approach and dense topology method. Tilting the orientation of aerial photographs can improve the system's effectiveness. Metrics like hit rate, accuracy, and precision values are used to assess the performance of the proposed hybrid model, and performance is compared to that of other state-of-the-art algorithms.","PeriodicalId":322228,"journal":{"name":"2022 IEEE 5th International Conference on Image Processing Applications and Systems (IPAS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128774642","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"XAI enhancing cyber defence against adversarial attacks in industrial applications","authors":"Georgios Makridis, Spyros Theodoropoulos, Dimitrios Dardanis, I. Makridis, Maria Margarita Separdani, G. Fatouros, D. Kyriazis, Panagiotis Koulouris","doi":"10.1109/IPAS55744.2022.10052858","DOIUrl":"https://doi.org/10.1109/IPAS55744.2022.10052858","url":null,"abstract":"In recent years there is a surge of interest in the interpretability and explainability of AI systems, which is largely motivated by the need for ensuring the transparency and accountability of Artificial Intelligence (AI) operations, as well as by the need to minimize the cost and consequences of poor decisions. Another challenge that needs to be mentioned is the Cyber security attacks against AI infrastructures in manufacturing environments. This study examines eXplainable AI (XAI)-enhanced approaches against adversarial attacks for optimizing Cyber defense methods in manufacturing image classification tasks. The examined XAI methods were applied to an image classification task providing some insightful results regarding the utility of Local Interpretable Model-agnostic Explanations (LIME), Saliency maps, and the Gradient-weighted Class Activation Mapping (Grad-Cam) as methods to fortify a dataset against gradient evasion attacks. To this end, we “attacked” the XAI-enhanced Images and used them as input to the classifier to measure their robustness of it. Given the analyzed dataset, our research indicates that LIME-masked images are more robust to adversarial attacks. We additionally propose an Encoder-Decoder schema that timely predicts (decodes) the masked images, setting the proposed approach sufficient for a real-life problem.","PeriodicalId":322228,"journal":{"name":"2022 IEEE 5th International Conference on Image Processing Applications and Systems (IPAS)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126457596","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Image Processing and Control of Tracking Intelligent Vehicle Based on Grayscale Camera","authors":"Jian Zhang, Yufan Liu, Ao Li, Jinshan Zeng, Hongtu Xie","doi":"10.1109/IPAS55744.2022.10053002","DOIUrl":"https://doi.org/10.1109/IPAS55744.2022.10053002","url":null,"abstract":"In order to realize the rapid and stable recognition and automatic tracking of various complex roads by the intelligent vehicles, this paper proposes image processing and cascade Proportion Integration Differentiation (PID) steering and speed control algorithms based on CMOS grayscale cameras in the context of the national college student intelligent vehicle competition. First, the grayscale image of the track is acquired by the grayscale camera. Then, the Otsu method is used to binarize the image, and the information of black boundary guide line is extracted. In order to improve the speed of the race, various track elements in the image are identified and classified, and the deviation between the actual centerline position and the ideal centerline position of the intelligent vehicle is calculated. Third, the discrete incremental cascade PID control algorithm is used to calculate the pulse width modulation (PWM) signal corresponding to the deviation. And the PWM signal is acted on the steering motor through the driving circuit, driving the intelligent vehicle to always drive along the middle road, so as to achieve the purpose of automatic tracking guidance. Experiments prove that the intelligent vehicle of this design can identify complex roads quickly and in a stable way, accurately complete automatic tracking, and obtain higher speed performance.","PeriodicalId":322228,"journal":{"name":"2022 IEEE 5th International Conference on Image Processing Applications and Systems (IPAS)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127826738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}