White Flies and Black Aphids Detection in Field Vegetable Crops using Deep Learning
Nikolaos Giakoumoglou, E. Pechlivani, N. Katsoulas, D. Tzovaras
DOI: 10.1109/IPAS55744.2022.10052855
Abstract: Digital image processing for the early detection of insect pests in vegetable crops is essential for yield and quality. In recent years, deep learning has made strides in digital image processing, opening up new possibilities for pest monitoring. In this paper, state-of-the-art deep learning models are presented to detect two common insect pests in vegetable cultivation: whiteflies and black aphids. Because no existing data sources address these pests, adhesive traps for catching the target insects were used to create an annotated image dataset. In total, 225 images were collected and 5904 insect instances were labelled by expert agronomists. The dataset poses several challenges, such as the tiny size of the objects, occlusions, and the visual resemblance between species. Object detection models including YOLOv3, YOLOv5, Faster R-CNN, Mask R-CNN, and RetinaNet were used as baselines for benchmark experiments, and data augmentation was applied to improve accuracy. The study addresses these challenges by applying deep learning models able to cope with the tiny-object detection problem arising from the very small insect size. The experiments achieve a mean Average Precision (mAP) of 75%. The dataset is available for download at https://zenodo.org/record/7139220.
Bacterial Blight and Cotton Leaf Curl Virus Detection Using Inception V4 Based CNN Model for Cotton Crops
Sohail Anwar, Abdul Rahim Kolachi, Shadi Khan Baloch, Shoaib R. Soomro
DOI: 10.1109/IPAS55744.2022.10052835
Abstract: The agriculture sector is an important pillar of the global economy, and the cotton crop is one of its prominent resources. It is widely cultivated in India, China, Pakistan, the USA, Brazil, and other countries. Worldwide cotton production is severely affected by numerous diseases such as cotton leaf curl virus (CLCV/CLCuV), bacterial blight, and boll rot. Image processing techniques together with machine learning algorithms have been employed successfully in numerous fields and have also been used for crop disease detection. In this study, we present a deep learning-based method for classifying diseases of the cotton crop, including bacterial blight and cotton leaf curl virus (CLCV). The dataset of cotton leaves showing disease symptoms was collected from various locations in Sindh, Pakistan. We employ the Inception-v4 convolutional neural network architecture to identify diseased plant leaves, in particular those affected by bacterial blight and CLCV. The accuracy of the designed model is 98.26%, a marked improvement over existing models and systems.
Computer Vision-Based Bengali Sign Language To Text Generation
Tonjih Tazalli, Zarin Anan Aunshu, Sumaya Sadbeen Liya, Magfirah Hossain, Zareen Mehjabeen, M. Ahmed, Muhammad Iqbal Hossain
DOI: 10.1109/IPAS55744.2022.10052928
Abstract: Worldwide, around 7% of people have hearing and speech impairments and use sign language as their means of communication. In our country as well, many people are born with such impairments, so our primary focus is to serve them by converting Bangla sign language into text. Various projects on Bangla sign language already exist, but they concentrate mostly on individual letters and digits. We instead concentrate on Bangla word signs, since communication relies on words and phrases rather than single letters. As there is no proper database of Bangla word signs, we build one for our work using BDSL. Sign language recognition (SLR) usually follows one of two scenarios: isolated SLR, which recognises one word at a time, and continuous SLR, which translates a whole sentence at once. We work on isolated SLR. We introduce a method that uses PyTorch and YOLOv5 in a video classification model to convert Bangla sign language into text, where each video contains a single signed word. We achieve an accuracy of 76.29% on the training dataset and 51.44% on the testing dataset. We are working towards a system that will make it easier for hearing- and speech-disabled people to interact with the general public.
DWT Collusion Resistant Video Watermarking Using Tardos Family Codes
Abdul Rehman, Gaëtan Le Guelvouit, J. Dion, F. Guilloud, M. Arzel
DOI: 10.1109/IPAS55744.2022.10053023
Abstract: Fingerprinting is an efficient means of protecting multimedia content and preventing illegal distribution: the goal is to identify the individuals engaged in the production and illicit distribution of a multimedia product. We investigate a discrete wavelet transform (DWT) based blind video watermarking strategy combined with probabilistic fingerprinting codes to resist collusion on high-resolution videos. We used FFmpeg to run a variety of collusion attacks (e.g., averaging, darkening, and lightening) on high-resolution video and compared the code generators and decoders most often suggested in the literature for finding at least one colluder within the required code length. The Laarhoven code generator with a nearest-neighbour search (NNS) decoder outperforms all other generators and decoders in the literature in terms of computation time, colluder detection, and resource usage.
{"title":"Society Infrormation","authors":"","doi":"10.1109/ipas55744.2022.10052899","DOIUrl":"https://doi.org/10.1109/ipas55744.2022.10052899","url":null,"abstract":"","PeriodicalId":322228,"journal":{"name":"2022 IEEE 5th International Conference on Image Processing Applications and Systems (IPAS)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116726229","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Evaluating Attention in Convolutional Neural Networks for Blended Images
Andrea Portscher, Sebastian Stabinger, A. Rodríguez-Sánchez
DOI: 10.1109/IPAS55744.2022.10052853
Abstract: In neuroscientific experiments, blended images are used to examine how attention mechanisms in the human brain work. They are particularly suited to this research area, because a subject needs to focus on particular features in an image to classify superimposed objects. As Convolutional Neural Networks (CNNs) take some inspiration from the mammalian visual system (for example, the hierarchical structure in which different levels of abstraction are processed on different network layers), we examine how CNNs perform on this task. More specifically, we evaluate the performance of four popular CNN architectures (ResNet18, ResNet50, CORnet-Z, and Inception V3) on the classification of objects in blended images. Since humans can solve this task rather easily by applying object-based attention, we also augment all architectures with a multi-headed self-attention mechanism to examine its effect on performance. Lastly, we analyse whether there is a correlation between the similarity of a network architecture's structure to the human visual system and its ability to correctly classify objects in blended images. Our findings show that adding a self-attention mechanism reliably increases the similarity to the V4 area of the human ventral stream, an area where attention has a large influence on the processing of visual stimuli.
A Tool for Thermal Image Annotation and Automatic Temperature Extraction around Orthopedic Pin Sites
S. Annadatha, M. Fridberg, S. Kold, O. Rahbek, M. Shen
DOI: 10.1109/IPAS55744.2022.10053084
Abstract: Existing annotation tools are mainly designed for visible images to support supervised machine learning. A few tools exist for extracting temperature information from thermal images, but they are time- and labour-intensive, require several stages of data management, and are not automated. This paper addresses the limitations of existing tools in handling large thermal datasets for annotation and in extracting temperature distributions in the region of interest (ROI) around orthopedic surgical wounds, and it gives researchers the flexibility to integrate thermal image analysis into wound-care machine learning models. We present an easy-to-use research tool for one-click annotation of orthopedic pin sites and extraction of thermal information, a preliminary research step towards estimating the reliability of thermography for home-based surveillance of post-operative infection. The tool maps annotations from the registered visible image onto the thermal and radiometric images, which avoids manual bias in annotating thermal images. The novelty of the proposed work lies in combining single-click manual annotation with the automatic extraction of ROI temperature distributions from those annotations, which is also crucial for deep learning-based investigation of surgical wound infections.
RailSet: A Unique Dataset for Railway Anomaly Detection
Arij Zouaoui, Ankur Mahtani, Mohamed Amine Hadded, S. Ambellouis, J. Boonaert, H. Wannous
DOI: 10.1109/IPAS55744.2022.10052883
Abstract: Understanding the driving environment is one of the key factors in achieving an autonomous vehicle. In particular, detecting anomalies in the traffic lane is a high-priority scenario, as it directly affects vehicle safety. Recent state-of-the-art image processing techniques for anomaly detection are all based on deep neural networks, which require a considerable amount of annotated data for training and testing. While many datasets exist for autonomous road vehicles, such datasets are extremely rare in the railway domain. In this work, we present a new dataset for railway anomaly detection called RailSet. It consists of 6600 high-quality, manually annotated images of normal situations and 1100 images of railway defects such as holes and rail discontinuities. Because anomaly samples are scarce in public imagery and difficult to create in a real railway environment, we artificially generate images of abnormal scenes using a deep learning algorithm named StyleMapGAN. The dataset is created as a contribution to the development of autonomous trains able to perceive track damage ahead of the train. The dataset is available at this link.
{"title":"Union Embedding and Backbone-Attention boost Zero-Shot Learning Model (UBZSL)","authors":"Ziyu Li","doi":"10.1109/IPAS55744.2022.10052972","DOIUrl":"https://doi.org/10.1109/IPAS55744.2022.10052972","url":null,"abstract":"Zero-Shot Learning (ZSL) aims to identify categories that are never seen during training. There are many ZSL methods available, and the number is steadily increasing. Even then, there are still some issues to be resolved, such as class embedding and image functions. Human-annotated attributes have been involved in recent work on class embedding. However, this type of attribute does not adequately represent the semantic and visual aspects of each class, and these annotating attributes are time-consuming. Furthermore, ZSL methods for extracting image features rely on the development of pre-trained image representations or fine-tuned models, focusing on learning appropriate functions between image representations and attributes. To reduce the dependency on manual annotation and improve the classification effectiveness, we believe that ZSL would benefit from using Contrastive Language-Image Pre-Training (CLIP) or combined with manual annotation. For this purpose, we propose an improved ZSL model named UBZSL. It uses CLIP combined with manual annotation as a class embedding method and uses an attention map for feature extraction. Experiments show that the performance of our ZSL model on the CUB dataset is greatly improved compared to the current model.","PeriodicalId":322228,"journal":{"name":"2022 IEEE 5th International Conference on Image Processing Applications and Systems (IPAS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134603708","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Paper Review Samples","authors":"","doi":"10.1109/IPAS55744.2022.10052807","DOIUrl":"https://doi.org/10.1109/IPAS55744.2022.10052807","url":null,"abstract":" 1. Is the paper relevant to the conference topics? o very relevant 2. Is there any originality of the presented work? (5: high originality, ... 1: no originality) o (5) 3. How can you rate the structure of the paper? (5: well, ..., 1: poor) o (4) 4. How do you rate the appropriateness of the research/study method? ( 5: excellent,..., 1:poor) o (4) 5. How do you rate the relevance and clarity of drawings, figures and tables? ( 5: excellent, 1: poor) o (4) 6. How do you rate the appropriateness of the abstract as a description of the paper? ( 5: excellent, ..., 1:poor) o (4) 7. Are references adequate? recent? and correctly cited? (5: excellent,..., 1:poor) o (4) 8. Are discussions and conclusions appropriate? (5: excellent, ..., 1: poor) o (4) 9. Please, add some comments on the paper if you have any. o The paper is well written. Authors address the problem of audio signal augmentation based on Trans-GAN.","PeriodicalId":322228,"journal":{"name":"2022 IEEE 5th International Conference on Image Processing Applications and Systems (IPAS)","volume":"163 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115702267","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}