{"title":"Improved Framework using Rider Optimization Algorithm for Precise Image Caption Generation","authors":"Chaitrali Prasanna Chaudhari, S. Devane","doi":"10.1142/s0219467822500218","DOIUrl":"https://doi.org/10.1142/s0219467822500218","url":null,"abstract":"Image captioning is the process of generating a textual description of an image. It draws on both computer vision and natural language processing. However, most image captioning systems give unclear descriptions of objects such as “man”, “woman”, “group of people”, and “building”. Hence, this paper develops an intelligent image captioning model. The adopted model comprises steps such as word generation, sentence formation, and caption generation. Initially, the input image is passed to a deep learning classifier, a Convolutional Neural Network (CNN). Since the classifier is already trained on the words relevant to the images, it can classify the words associated with the given image. Next, a set of sentences is formed from the generated words using a Long Short-Term Memory (LSTM) model. The likelihood of the formed sentences is computed using the Maximum Likelihood (ML) function, and the sentences with higher probability are selected and used to generate the image caption describing the scene. As a major novelty, this paper aims to enhance the performance of the CNN by optimally tuning its weights and activation function. For this optimal selection, the paper introduces a new optimization algorithm, Rider with Randomized Bypass and Over-taker update (RR-BOU), an enhanced version of the Rider Optimization Algorithm (ROA). Finally, the performance of the proposed captioning model is compared with that of conventional models through statistical analysis.","PeriodicalId":177479,"journal":{"name":"Int. J. Image Graph.","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130611852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multimodal Video Indexing (MVI): A New Method Based on Machine Learning and Semi-Automatic Annotation on Large Video Collections","authors":"Mohamed Hamroun, K. Tamine, B. Crespin","doi":"10.1142/s021946782250022x","DOIUrl":"https://doi.org/10.1142/s021946782250022x","url":null,"abstract":"Indexing video by concept is one of the most appropriate solutions for such problems. It is based on an association between a concept and its corresponding visual, sound, or textual features. Establishing this association is not trivial: it requires knowledge about the concept and its context. In this paper, we investigate a new concept detection approach to improve the performance of content-based multimedia document retrieval systems. To achieve this goal, we tackle the problem from different angles and make four contributions at various stages of the indexing process. We propose a new method for multimodal indexing based on (i) a new weakly supervised semi-automatic method based on a genetic algorithm, (ii) the detection of concepts from the text in the videos, and (iii) the enrichment of the basic concepts using our DCM method. The resulting semantic and enriched concepts allow better multimodal indexing and the construction of an ontology. Finally, the contributions are tested and evaluated on a large dataset (TRECVID 2015).","PeriodicalId":177479,"journal":{"name":"Int. J. Image Graph.","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134480533","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Colon Cancer Detection Using Hybrid Features and Genetically Optimized Neural Network Classifier","authors":"Samrat P. Khadilkar","doi":"10.1142/s0219467822500243","DOIUrl":"https://doi.org/10.1142/s0219467822500243","url":null,"abstract":"Computer-assisted colon cancer detection on histopathological images is a difficult task because of shape characteristics and other biological properties. Images acquired through the histopathological microscope may vary in magnification for better visibility. This may change the morphological properties, and hence an automated, magnification-independent colon cancer detection system is essential. Manual diagnosis of colon biopsy images is subjective, slow, and laborious, and visual evaluation at various microscopic magnifications leads to disagreement between histopathologists. Automatic detection of colon cancer across image magnifications is challenging because of aspects such as tailored segmentation and varying features. This demands techniques that take advantage of the textural, color, and geometric properties of colon tissue. This work presents a segmentation approach based on the morphological features derived from the segmented region. Gabor Wavelet, Harris Corner, and DWT-LBP coefficients are extracted because the features should not depend on the spatial domain as the magnification changes. These features are fed to a Genetically Optimized Neural Network classifier to classify the images as normal or malignant. Here, a genetic algorithm is used to learn the best hyperparameters for the neural network.","PeriodicalId":177479,"journal":{"name":"Int. J. Image Graph.","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116943822","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Content-Based Video Retrieval Using Integration of Curvelet Transform and Simple Linear Iterative Clustering","authors":"Reddy Mounika Bommisetty, A. Khare, M. Khare, P. Palanisamy","doi":"10.1142/s0219467822500188","DOIUrl":"https://doi.org/10.1142/s0219467822500188","url":null,"abstract":"Video is a rich information source containing audio and visual information along with embedded motion information. Applications such as e-learning, live TV, video on demand, and traffic monitoring need an efficient video retrieval strategy. Content-based video retrieval and superpixel segmentation are two diverse application areas of computer vision. In this work, we present an algorithm for content-based video retrieval using the Integration of Curvelet Transform and Simple Linear Iterative Clustering (ICTSLIC) algorithm. The proposed algorithm consists of two steps: offline processing and online processing. In offline processing, keyframes of the database videos are extracted using two features, the Pearson Correlation Coefficient (PCC) and color moments (CM), and the superpixel generation algorithm ICTSLIC is applied to the extracted keyframes. The superpixels generated by applying ICTSLIC to the keyframes are used to represent the database videos. In online processing, ICTSLIC superpixel segmentation is applied to the query frame, and the resulting superpixels are used to represent the query frame. Videos similar to the query frame are then retrieved by computing the Euclidean distance between the superpixels of the query frame and those of the database keyframes. Because ICTSLIC superpixels are the base feature for matching and retrieval, the results of the proposed method are independent of query frame properties such as camera motion and the object’s pose, orientation, and motion. The proposed method is tested on a publicly available dataset comprising different categories of video clips, such as animations, serials, personal interviews, news, movies, and songs. For evaluation, the proposed method randomly picks frames from the database videos as query frames, instead of selecting keyframes. Experiments were conducted on the developed dataset, and performance is assessed using Precision, Recall, Jaccard Index, Accuracy, and Specificity. The experimental results show that the proposed method performs better than other state-of-the-art methods.","PeriodicalId":177479,"journal":{"name":"Int. J. Image Graph.","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128670520","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modified Flower Pollination-based Segmentation of Medical Images","authors":"Kumaran Jayaraman, Koganti Srilakshmi, Sasikala Jayaraman","doi":"10.1142/s0219467822500267","DOIUrl":"https://doi.org/10.1142/s0219467822500267","url":null,"abstract":"This paper presents a modified flower pollination-based method for multilevel segmentation of medical images. Flower pollination-based optimization (FPO) models the pollination process of flowers. Bees play a major role in pollination, and they memorize and recognize the best flowers, those producing large amounts of nectar. This memorizing ability of bees is incorporated into the FPO to improve the exploration ability of the algorithm. In addition, the mechanism by which pollinators avoid predators is included in the modified FPO (MFPO) to escape sub-optimal traps. The medical image segmentation problem is transformed into an optimization problem and solved using the MFPO. The method searches for optimal thresholds in the problem space of the given medical image. Segmented images are presented to show the superior performance of the proposed method.","PeriodicalId":177479,"journal":{"name":"Int. J. Image Graph.","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134414214","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Efficient Image Privacy-Preserving Scheme Based On Mixed Chaotic Map and Compression","authors":"Muhammad Tanveer, T. Shah, Asif Ali, Dawood Shah","doi":"10.1142/S0219467822500206","DOIUrl":"https://doi.org/10.1142/S0219467822500206","url":null,"abstract":"In the modern digital era, multimedia security has become a major concern owing to the rapid growth of network technologies and digital communications. Accordingly, over the last few decades, the application of nonlinear dynamics and chaotic phenomena to multimedia data security has earned significant attention. In this paper, an efficient image-encryption technique based on a two-dimensional (2D) chaotic system combined with a finite field of a specific order is introduced. The proposed scheme consists of four modules: separation of bits, compression, a 2D chaotic map, and small S-boxes. Initially, the scheme separates the pixels of the image into least significant bits (LSB) and most significant bits (MSB). Subsequently, a compression algorithm is applied to these separated bits, transforming the MSB of the image into LSB. The key objective of the first module is to reduce the range of the pixel values by up to a factor of eight relative to the original image, which in turn reduces the time complexity of the scheme. Next, a 2D chaotic map is used to reshuffle the bytes to break the internal correlation among the pixels of the image. Finally, the small S-boxes are used to substitute the permuted image. The small S-boxes play a vital role in maintaining an optimum security level, limiting computational effort, and reducing time complexity. The resulting encryption system is well suited to instantaneous communication.","PeriodicalId":177479,"journal":{"name":"Int. J. Image Graph.","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122860971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis of Brain Tumor Using MR Images: A Brief Survey","authors":"B. Shivaprasad, M. Ravikumar, D. Guru","doi":"10.1142/S0219467822500231","DOIUrl":"https://doi.org/10.1142/S0219467822500231","url":null,"abstract":"In this paper, we discuss in detail the detection and extraction of brain tumors from MRI, and highlight the importance of using MRI. Various feature extraction methods and classifiers used in brain tumor segmentation are explained. The paper mainly focuses on the challenges involved in brain tumor analysis, which should be helpful for researchers and others interested in carrying out research on this topic.","PeriodicalId":177479,"journal":{"name":"Int. J. Image Graph.","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121886881","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hardware Efficient Modified CNN Architecture for Traffic Sign Detection and Recognition","authors":"Bhaumik Vaidya, C. Paunwala","doi":"10.1142/S0219467822500176","DOIUrl":"https://doi.org/10.1142/S0219467822500176","url":null,"abstract":"Traffic sign recognition is a vital part of any driver assistance system, which can help in making complex driving decisions based on the detected traffic signs. Traffic sign detection (TSD) is essential in adverse weather conditions or when the vehicle is being driven on hilly roads. Traffic sign recognition is a complex computer vision problem because the signs generally occupy a very small portion of the entire image. Much research is under way to solve this problem accurately, but it has not yet been solved to a satisfactory level of performance. The goal of this paper is to propose a deep learning architecture that can be deployed on embedded platforms for driver assistance systems with limited memory and computing resources without sacrificing detection accuracy. The architecture applies several modifications to the well-known Convolutional Neural Network (CNN) architecture for object detection. It uses a trainable Color Transformer Network (CTN) with the existing CNN architecture to make the system invariant to illumination and light changes. The architecture uses a feature fusion module to detect small traffic signs accurately. In the proposed work, receptive field calculation is used to choose the number of convolutional layers for prediction and the right scales for default bounding boxes. The architecture is deployed on a Jetson Nano GPU embedded development board for performance evaluation at the edge, and it has been tested on the well-known German Traffic Sign Detection Benchmark (GTSDB) and the Tsinghua-Tencent 100k dataset. The architecture requires only 11 MB of storage, almost ten times less than previous architectures. The architecture has one sixth of the parameters of the best performing architecture and requires 50 times fewer floating point operations (FLOPs). The architecture achieves a running time of 220 ms on a desktop GPU and 578 ms on the Jetson Nano, which is also better than other similar implementations. It also achieves comparable accuracy in terms of mean average precision (mAP) on both datasets.","PeriodicalId":177479,"journal":{"name":"Int. J. Image Graph.","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131005642","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Morphology Preserving Segmentation Method for Occluded Cell Nuclei from Medical Microscopy Image","authors":"Rafflesia Khan, R. Debnath","doi":"10.1142/S021946782250019X","DOIUrl":"https://doi.org/10.1142/S021946782250019X","url":null,"abstract":"Nowadays, image segmentation techniques are used in many medical applications, such as tissue culture monitoring, cell counting, and automatic measurement of organs, to assist doctors. However, high-quality segmentation results cannot be obtained without manual annotation or prior knowledge because of the high variability, noise, and other imaging artifacts in medical images. Furthermore, the unstable and continuously changing characteristics of human cells, tissues, and organs hamper training-based segmentation methods. Detecting the appropriate contour of a region of interest and separating single cells in overlapping conditions are extremely challenging. In this paper, we aim for a model that can detect biological structures (e.g. cell nuclei and lung contours) with their proper morphology, even in overlapping or occluded conditions, without manual annotation or prior knowledge. We introduce a new optimal approach for automatic medical image region segmentation. The method first brings the boundaries of all object regions in a microscopy image into clear focus. It then detects the regions by following their contours. Our model is capable of detecting and segmenting object regions from medical images with little computational effort. Our experimental results show that our model provides better detection on several datasets of different types of medical data and achieves a segmentation rate of more than 98% for densely connected regions.","PeriodicalId":177479,"journal":{"name":"Int. J. Image Graph.","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124641126","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Metaheuristics Framework for Weighted Multi-band Image Fusion","authors":"Shaheera Rashwan, W. Sheta","doi":"10.1142/S0219467822500164","DOIUrl":"https://doi.org/10.1142/S0219467822500164","url":null,"abstract":"The main objective of hyper/multispectral image fusion is to produce a composite color image that allows appropriate visualization of the relevant spatial and spectral information. In this paper, we propose a general framework for spectral weighting-based image fusion. The proposed methodology relies on weight updates performed by nature-inspired algorithms and on a goodness-of-fit criterion defined as the average root mean square error. Simulations on four public data sets and on a recent Landsat 8 image of Brullus Lake, Egypt, used as a study area, demonstrate the efficiency of the proposed framework. The purpose of the study is to present a multi-band image fusion framework that produces a high-quality fused image for further computer processing. The results show that the image produced by the framework has the highest quality among the state-of-the-art algorithms compared. To demonstrate the increase in image quality, we used general quality metrics such as the Universal Image Quality Index, Mutual Information, the Variance, and Information Measure.","PeriodicalId":177479,"journal":{"name":"Int. J. Image Graph.","volume":"96 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114025689","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}