{"title":"Comparison Between Different Machine Learning Algorithms","authors":"Souvik Sarkar, Sidhant Singh, Rohit Kumar, Debraj Chatterjee","doi":"10.1109/ICIIP53038.2021.9702541","DOIUrl":"https://doi.org/10.1109/ICIIP53038.2021.9702541","url":null,"abstract":"Machine learning is a major application of AI it is a phenomena by which system automatically learn and improve from real world experience in form of data on which it is trained or by observing the surroundings. Machine learning is changing the planet by transforming all segments including healthcare services, education, transport, food, entertainment, and different production line and lots of more. Machine learning plays a great role in changing our lives as well as industries in sector like housing and applications, cars, shopping, food ordering, etc. One of the application of ML is character recognition. Character recognition is used to convert handwritten or printed documents or image characters of document to computer codes. The basic process involves examining the text of a document and translating the characters into code that can be used for data processing. Hardware, like an optical scanner or specialized circuit card is employed to repeat or read text while software typically handles the advanced processing. Software can also take advantage of machine learning algorithms to implement more advanced methods of intelligent character recognition, like identifying languages or styles of handwriting. In this era there are various machine Learning algorithms like a linear regression, Convolutional Neural Network(CNN), kNN, K-Means, Random Forest etc. In this paper we are going to compare the results between between linear regression and Convolution Neural Network(CNN) over a same data-set of 0-9 digits.","PeriodicalId":431272,"journal":{"name":"2021 Sixth International Conference on Image Information Processing (ICIIP)","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132769981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recent Trends in Text Region Identification and Localization in Native Surrounding Images","authors":"R. Devi, B. Kumar","doi":"10.1109/ICIIP53038.2021.9702660","DOIUrl":"https://doi.org/10.1109/ICIIP53038.2021.9702660","url":null,"abstract":"Vision-based content analysis and understanding is a crucial aspect in the field of machine vision and combining it with machine learning-based techniques improves the system’s efficiency and accuracy significantly. Identification and Localization of the Textual information in scene images is one of the demanding issues in this field. And it is utilized in a variety of recent growing vision-based applications that includes robot navigation, language translation, and industrial automation. Because textual information is dispersed across an image and has no prior knowledge of its location, it a is tiresome process to identify and localize text instances in images acquired in an uncontrolled environment. In this paper, we have analyzed and summarized the recent breakthroughs in this domain that aim to resolve the complications inherent with it using classic and modern methodologies. Also, discuss the available benchmarked datasets and evaluation procedures for measuring the effectiveness of the reported approaches. Finally, we addressed the upcoming research direction towards the mentioned domain.","PeriodicalId":431272,"journal":{"name":"2021 Sixth International Conference on Image Information Processing (ICIIP)","volume":"161 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133405363","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Crop Yield Prediction of Indian Districts Using Deep Learning","authors":"Parjanya Prashant, Kaustubh Ponkshe, Chirag Garg, Ishan Pendse, Prathamesh Muley","doi":"10.1109/ICIIP53038.2021.9702573","DOIUrl":"https://doi.org/10.1109/ICIIP53038.2021.9702573","url":null,"abstract":"The uncertain yield of crops is one of the major problems the agricultural sector faces today, especially in India. The objective of this paper is to provide an accurate and reliable prediction of crop yield. This will help farmers make decisions that can make their farming more efficient and profitable. We propose a novel deep learning model - an ensemble neural network model using Long Short-Term Memory (LSTMs) and one-dimensional Convolutional Neural Networks (CNNs). We used crop data for over 30 crops from 1997-2015 of all Indian districts. Our model substantially outperforms all other models (Linear Regression, Random Forest, extreme Gradient Boosting (XGB) Regressor, Feed-forward Neural Network (FFNN)) that were tested on accuracy in predicting crop yields. We achieve a correlation coefficient value of over 0.90 and 0.92 for our model for train and test datasets.Our model has several advantages compared to other models. Firstly, it is able to capture the time dependency on temperature and rainfall. Secondly, it is able to work on a large and diverse dataset, unlike most models which only perform well in small regions. Lastly, it is able to use several diverse features - geographical, social, and economic to make a prediction.","PeriodicalId":431272,"journal":{"name":"2021 Sixth International Conference on Image Information Processing (ICIIP)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129962498","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Facial Expression Recognition in Videos by learning Spatio-Temporal Features with Deep Neural Networks","authors":"Priyanka A. Gavade, Vandana S. Bhat, J. Pujari","doi":"10.1109/ICIIP53038.2021.9702545","DOIUrl":"https://doi.org/10.1109/ICIIP53038.2021.9702545","url":null,"abstract":"Face expression recognition in videos is one of the most challenging research topics in the field of Computer vision. With the advancements in Deep Learning and promising results of Deep Neural Networks, a significant improvement in the performance of the emotion recognition system is observed. This paper first presents a fusion feature extraction approach that involves extracting and combining high-level temporal and spatial features from the video sequences. Second, the learned visual features are input to a Hybrid classifier, i.e., combination of Convolution Neural Network (CNN) and Long short-term memory (LSTM) recurrent neural network, to identify human expressions automatically. Later, hybrid Alex Net-LSTM, VGG-LSTM, Resnet-LSTM, and inception V2-LSTM classifiers are trained on RAVDESS, SAVEE and AFEW databases. The classification result of the proposed method has been compared with other models in which the same datasets for video emotion recognition were used. The proposed method obtains the recognition accuracy of 97.6%, 97.1%, and 95.0% for datasets, such as SAVEE, RAVDESS, and AFEW, respectively.","PeriodicalId":431272,"journal":{"name":"2021 Sixth International Conference on Image Information Processing (ICIIP)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129988287","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Intelligent Robot Assisting Medical Practitioners To Aid Potential Covid-19 Patients","authors":"Sowrabh S, Ashith Rameshkumar, Athira P V, J. B, Anish M N","doi":"10.1109/ICIIP53038.2021.9702538","DOIUrl":"https://doi.org/10.1109/ICIIP53038.2021.9702538","url":null,"abstract":"The purpose of this paper is to develop a robot-nurse capable of assisting human health professionals in a COVID-19 isolation unit such that health care professionals may discontinue wearing the personal protective kit that they previously wore when dealing with patients at isolation units. The robot can perform most of the tasks carried out by human nurses in a hospital. It can take a patient's temperature without touching them, administer medicines at the appropriate time, assist in sanitizing the individual and their hands, and offer a section for sterilizing medical equipment. [14]","PeriodicalId":431272,"journal":{"name":"2021 Sixth International Conference on Image Information Processing (ICIIP)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124622023","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Palm-print recognition based on image quality and texture features with neural network","authors":"Poonam Poonia, P. Ajmera","doi":"10.1109/ICIIP53038.2021.9702670","DOIUrl":"https://doi.org/10.1109/ICIIP53038.2021.9702670","url":null,"abstract":"Biometric is the science of validating the integrity of human’s based on their physiological or behavioural attributes. Different biometric traits like retina, fingerprint, ear, face, palm-prints are broadly utilized for person authentication and user access. Palm-print as a biometric have attracted much research attention in various security applications. This paper presents the use of the convolutional neural network (CNN) combined with Gabor filter that extract highly discriminative features. An image quality module is applied to get the good quality images. Gabor filter having various scales and orientations is employed to extract the texture information of ROIs. The use of texture descriptor with CNN strengthens the learning of texture information. Experiments are conducted on the CASIA and IIT-Delhi touchless palm-print databases. The method yields an accuracy of 98.69% and Equal Error Rate (EER) of 0.62% on CASIA database. The experimental result demonstrates the superiority of the proposed method over the current methods.","PeriodicalId":431272,"journal":{"name":"2021 Sixth International Conference on Image Information Processing (ICIIP)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130603971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Conductivity Based Agglomerative Spectral Clustering for Community Detection","authors":"M. T, G. Sajeev","doi":"10.1109/ICIIP53038.2021.9702554","DOIUrl":"https://doi.org/10.1109/ICIIP53038.2021.9702554","url":null,"abstract":"Community detection has become a popular topic in the field of network analysis and one of the most popular methods is spectral clustering. There have been a number of proposals and research with varying degrees of success. However, SpectralClustering is considered to be most robust and accurate approach. Clustering implementations have historically been done on artificial neural network. Nevertheless, accuracy is understandably not as good. In this paper, we present a community detection method using agglomerate spectral clustering. Our method uses conductance and edge weights to achieve a higher level of similarity, based on eigenvector space. The conductance method is employed for finding the clusters of nodes that are not connected. The proposed method is validated with large network graph of Live Journal. We compare our method with agglomerative hierarchical kernel spectral clustering (AH-KSC). It is observed that, conductance based agglomerative method gives better results in terms of accuracy, precision and recall.","PeriodicalId":431272,"journal":{"name":"2021 Sixth International Conference on Image Information Processing (ICIIP)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122214367","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SmartIdOCR: Automatic Detection and Recognition of Identity card number using Deep Networks","authors":"M. Gupta, Ronak Shah, Jitesh Rathod, Ajai Kumar","doi":"10.1109/ICIIP53038.2021.9702703","DOIUrl":"https://doi.org/10.1109/ICIIP53038.2021.9702703","url":null,"abstract":"Identity authentication is much needed and required in this digital age where the information can be utilized in many areas like banking, finance, insurance, education, etc. The long time in the manual authentication process is tiresome for both sides due to the exchange of data. The challenge lies in verification and information extraction from the ID card during the authentication process. There is an AI-based solution needed to reduce the authentication time. This paper aims to solve this problem by doing real-time authentication of identity cards like PAN and UIDAI using AI techniques with good accuracy. Real-time authentication is done by text detection and text recognition. The text detection is done using a differentiable binarization algorithm. We do not have a real annotated dataset for an ID number. We generated approximately 90000 identity number images synthetically with noise and blur using two fonts. This dataset is divided into training, validation, and testing sets. We present a neural encoder-decoder model with attention for converting ID number line images into editable text. Our method is evaluated based on the text output of the line image. An attention-based approach can tackle this problem in a better way in comparison to other neural techniques using CTC-based models. This paper describes the usage of OpenNMT architecture for recognition due to the flexibility of hyperparameter tuning. We evaluated the text recognition performance on scanned as well as the camera-captured identity card number images. We also compared the current recognition performance of OpenNMT with Tesseract (LSTM) on the same testbed containing 36000 images containing ID numbers only. The proposed approach outperformed the Tesseract in ID number recognition.","PeriodicalId":431272,"journal":{"name":"2021 Sixth International Conference on Image Information Processing (ICIIP)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116604526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Explainable Custom CNN Architecture for Land Use Classification using Satellite Images","authors":"Muskan Verma, Nayan Gupta, Bhavishya Tolani, Rishabh Kaushal","doi":"10.1109/ICIIP53038.2021.9702698","DOIUrl":"https://doi.org/10.1109/ICIIP53038.2021.9702698","url":null,"abstract":"Satellite images can be very challenging to work with which impedes their usage for numerous purposes. They are crucial for tracking the ever-changing human footprint around the world, including fast-growing cities, urban spread, and informal settlements. In this paper, we propose a custom CNN model that classifies the EuroSAT Dataset (captured by Sentinel-2 satellite) consisting of 10 classes having 27,000 geo-referenced labeled images. Our custom model achieves an average accuracy of 88.21%. Furthermore, we leverage a well-known explainability approach (LIME) to help us understand the reasons for model predictions. We categorize the ten classes into four land type categories, namely, agricultural, build-up, under-developed, and water bodies. We perform a detailed comparative explainability study using Local Interpretable Model-Agnostic Explanations (LIME). For agricultural areas, the model selects the correct regions for predictions. For build-up areas, man-made structures like buildings are important factors for classification. For underdeveloped regions (forest cover), the model considers only a portion of green areas as important. For water bodies, model does not consider all parts of water as important for prediction.","PeriodicalId":431272,"journal":{"name":"2021 Sixth International Conference on Image Information Processing (ICIIP)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132428527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Image Sentence Generation Based on Deep Neural Network Using RCNN-LSTM Model","authors":"S. S. Satyanarayana Reddy, Ashwani Kumar, M. Jyaram","doi":"10.1109/ICIIP53038.2021.9702685","DOIUrl":"https://doi.org/10.1109/ICIIP53038.2021.9702685","url":null,"abstract":"Image captioning refers to generating a sentence description by analyzing the image. The objective of image captioning is to automatically generate these captions for an image to gain a deeper knowledge by using deep learning algorithms. In this paper, an image sentence generation based on deep neural network using RCNN-LSTM model is proposed. In the proposed model an image is taken as input and generate sentence as an output by making use of natural language processing for describing the contents of the image. We have developed this model by consistently analyzing a deep neural networks and image sentence generation methodologies. The scheme uses image datasets and their sentence descriptions to train and test the model and have balance between language and visual data. This research paper uses Recurrent Convolutional Neural Networks (RCNN) a combination of recurrent Neural Networks (RNN) and Convolutional Neural Networks (CNN). RNN is used to process language part and CNN is used to process image part for obtaining feature vectors. Additionally, Long Short Term Memory (LSTM) is used for textual sentence generation. In this model, RCNN works as an encoder to retrieve features for the images by making use of Keras VGG16 and LSTM works as a decoder to obtain textual sentences which describes the images. In our approach we have used Flickr-8k, Flickr30K and MSCOCO dataset to train the model. The model image sentence generation achieved a very good accuracy for generating captions.","PeriodicalId":431272,"journal":{"name":"2021 Sixth International Conference on Image Information Processing (ICIIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134537129","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}